I spend quite a bit of time thinking about what the end of history for programming might look like. By the “end of history” I mean the point beyond which programming paradigms would not evolve significantly.
I care about programming’s “destiny” because I prefer to work on open source projects that bring us closer to that final programming paradigm. In my experience, this sort of work has greater longevity, higher impact, and helps move the whole field of programming forward.
So what would the end of history look like for programming? Is it:
… already here?
Some people treat programming as a solved problem and view all new languages and paradigms as reskins of old languages or paradigms. From their point of view all that remains is to refine our craft by slowly optimizing things, weeding out bugs or addressing nontechnical issues like people management or funding.
I personally do not subscribe to this philosophy because I believe, at a very minimum, that functional programming paradigms will slowly displace object-oriented and imperative programming paradigms (although functional programming might not necessarily be the final programming paradigm).
… artificial intelligence?
Maybe machines will translate our natural language instructions into code for us, relieving us of the burden of precisely communicating our intent. Or maybe some sufficiently smart AI-powered IDE will autocomplete most of our program for us.
Truthfully, I’m not certain what the correct answer is, but I’ll present my own guess for what the end of history for programming looks like.
Mathematical DSLs
My view is that the next logical step for programming is to split into two non-overlapping programming domains:
runtime building for …
… mathematical programming languages
Specifically, I expect programming languages to evolve to become more mathematical in nature where the programs users author to communicate their intent resemble pure mathematical expressions.
For example, consider the following mathematical specifications of the boolean logical “and” operator and the function composition operator:
True  && x     = x
x     && True  = x
False && False = False

(f . g)(x) = f(g(x))
These mathematical specifications are also executable Haskell code (albeit with extra parentheses to resemble mainstream function syntax). Haskell is one example of a programming language where the code aspires to resemble pure mathematical expressions and definitions.
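As a quick check (my own snippet, not from the original post), these specifications really do compile and run as Haskell once we hide the Prelude’s versions of the same names:

```haskell
import Prelude hiding ((&&), (.))

-- The specification of logical "and", taken nearly verbatim as Haskell:
(&&) :: Bool -> Bool -> Bool
True && x = x
x && True = x
False && False = False

-- Function composition, with the extra parentheses spelled out:
(.) :: (b -> c) -> (a -> b) -> a -> c
(f . g) x = f (g x)

main :: IO ()
main = print [True && False, False && True, True && True, (not . not) True]
-- prints [False,False,True,True]
```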
Any language presenting such an idealized mathematical interface requires introducing a significant amount of complexity “under the hood” since the real world is messy. This is where runtime building comes into play to gloss over such ugly details.
In other words, I’m predicting that the end of history for programming is to become an interface bridging pure mathematical expressions to the real world.
The past and present
Let me give a few examples of where this trend towards mathematical userland code is already playing out:
Memory management
Manual memory management used to be a “userland” concern for most programming languages, but the general trend for new languages is automatic memory management (with the exception of Rust). Memory management used to be an explicit side effect that programmers had to care about, and pushing memory management down into the runtime (via garbage collection or other methods) has made programming languages more pure, allowing them to get one step closer to idealized mathematical expressions.
Indeed, Rust is the exception that proves the rule, as Rust is widely viewed as better suited for building runtimes rather than being used for high-level specification of intent.
However, functional programming is not a free win. Supporting higher-order functions and closures efficiently isn’t easy (especially for compiled languages), which is why less sophisticated language implementations tend to be more imperative and less functional.
Evaluation order (especially laziness)
I’ve always felt that “lazy evaluation” doesn’t do a good job of selling the benefits of Haskell’s evaluation model. I prefer to think of Haskell as having “automatic evaluation management”^{1}. In other words, the programmer specifies the expressions as a graph of dependent computations and the runtime figures out the most efficient order in which to reduce the graph.
This is yet another example of where we push something that used to be a userland concern (order of evaluation) into the runtime. Glossing over evaluation order frees us to specify things in a more mathematical style, because in mathematics the evaluation order is similarly irrelevant.
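As a small illustration (my example, not from the original): in Haskell we can describe a conceptually infinite graph of computations, and the runtime reduces only the parts that are demanded:

```haskell
-- An infinite list of squares; nothing here is computed yet.
squares :: [Integer]
squares = map (^ 2) [1 ..]

-- Demanding only the first three elements forces only three reductions;
-- the evaluation order of everything else is never decided at all.
main :: IO ()
main = print (take 3 squares)  -- prints [1,4,9]
```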
The present and future
A common pattern emerges when we study the above trends:
Push a userland concern into a runtime concern, which:
… makes programs more closely resemble pure mathematical expressions, and:
… significantly increases the complexity of the runtime.
You might wonder: what are some other userland concerns that might eventually get pushed into runtime concerns in the near future? Some examples I can think of are:
Package management
This one is so near in the future that it’s already happening (see: Nix and Dhall). Both languages provide built-in support for fetching code, rather than handling packages out-of-band using a standalone package management tool. This language-level support allows programs to embed external code as if it were a pure subexpression, more closely approximating the mathematical ideal.
Error handling
This one requires more explanation: I view type systems as the logical conclusion of pushing error handling into the “runtime” (actually, into the typechecker, not the runtime, to be pedantic). Dhall is an example of a language that takes this idea to the extreme: Dhall has no userland support for raising or catching errors because all errors are type errors^{2}.
Advances in dependently-typed and total functional programming languages get us closer to this ideal of pushing error handling into a runtime concern.
Logging
I’m actually surprised that language support for pervasively logging everything hasn’t been done already (or maybe it has happened and I missed it). It seems like a pretty mundane thing languages could implement, especially for application domains that are not performance-sensitive.
Many languages already support profiling, and it seems like it wouldn’t be a big leap to turn profiling support into logging support if the user is willing to spend their performance budget on logging.
Logging is one of those classic side effects that is an ugly detail that “ruins” code that would have otherwise been pure and mathematical.
Services
Service-oriented architectures are another thing that tends to get in the way of writing pure side-effect-free code.
I’m not exactly sure what a service-oriented language runtime would look like, but I don’t think current “serverless” solutions are what I have in mind. Something like AWS Lambda is still too low-level to promote code that is mathematical in nature. Like, if any part of the programming process involves using a separate tool to deploy or manage the serverless code then that is a significant departure from authoring pure mathematical expressions. There needs to be something like a “Nix or Dhall for serverless code”.
Conclusion
You might criticize my prediction by claiming that it’s not falsifiable. It seems like I could explain away any new development in programming as falling into the runtime building or mathematical expression category.
That’s why I’d like to stress a key pillar of the prediction: runtimes and mathematical expressions will become more sharply divided over time. This is the actual substance of the prediction, and we can infer a few corollaries from it.
Currently, many mainstream programming paradigms and engineering organizations conflate the two responsibilities, so you end up with people authoring software projects that mix operational logic (runtime concerns) and “business logic” (mathematical intent).
What I predict will happen is that the field of engineering will begin to generate a sharp demand for people with experience in programming language theory or programming language engineering. These people will be responsible for building special-purpose languages and runtimes that abstract away as many operational concerns as possible to support pure mathematical domain-specific languages for their respective enterprise. These languages will in turn be used by a separate group of people whose aim is to translate human intent into mathematical expressions.
One consequence of this prediction is that you’ll begin to see a Cambrian explosion of programming languages in the near future. Naturally, as language engineers push more operational concerns into the runtime, they will need to more closely tailor the runtime to specific purposes or organizational needs rather than trying to author a general-purpose language. In other words, there will be a marked fragmentation of language runtimes (and typecheckers) as each new language adapts to its respective niche.
Despite runtime fragmentation, you will see the opposite trend in userland code: programs authored in these disparate languages will begin to more closely resemble one another as they become more mathematical in nature. In a sense, mathematical expressions will become the portable “lingua franca” of userland code, especially as nonmathematical concerns get pushed into each respective language’s runtime.
That is a prediction that is much easier to falsify.
Also, this interpretation is not that far from the truth, as the Haskell standard only specifies a non-strict evaluation strategy. GHC is lazy, but Haskell the language standard does not require implementations to be lazy. For more details, see: Lazy vs. non-strict↩︎
Okay, this is a bit of an oversimplification because Dhall has support for Optional values and you can model errors using unions in Dhall, but they are not commonly used in this way and idiomatically most Dhall code in the wild uses the typechecker to catch errors. Dhall is a total functional programming language and the language goes to significant lengths to discourage runtime errors, like forbidding comparing Text values for equality. Also, the language is technically dependently typed and supports testing arbitrary code at typechecking time to catch errors statically.↩︎
In this article, we’ll develop a Haskell library for continued fractions. Continued fractions are a different representation for real numbers, besides the fractions and decimals we all learned about in grade school. In the process, we’ll build correct and performant software using ideas that are central to the Haskell programming language community: equational reasoning, property testing, and term rewriting.
I posted an early version of this article to Reddit, and I’m grateful to those who responded with insight and questions that helped me complete this journey, especially iggybibi and lpsmith.
Introduction to continued fractions
Let’s start with a challenge. Assume I know how to write down an integer. Now, how can I write down a real number? In school, we’ve all learned a few answers to this question. The big two are fractions and decimals.
Fractions have simple forms for rational numbers. Just write two integers, for the numerator and denominator. That’s great! But alas, they are no good at all for irrational numbers like π, or the square root of 2. I could, of course, write a fraction that’s close to the irrational number, such as 22/7 as an approximation of π. But this is only good up to a certain limited precision. To get a closer approximation (say, 355/113), I have to throw it away and start over. Sure, I could write an infinite sequence of fractions which converge to the irrational number, but every finite prefix of such a sequence is redundant and wasted.
Irrationals are just the limiting case of rational numbers, and very precise rational numbers suffer from this, too. It’s not easy to glance at a fraction like 103993/33102 and get a sense of its value. (Answer: this is another approximation of π.) Or, can you quickly tell whether 5/29 is larger or smaller than 23/128?
Instead, we can turn to decimals. If you’re like me, you don’t think about the mechanics of decimal numbers very often, and conventional notation like 3.14159… really hides what’s going on. I’d like to offer a different perspective by writing it like this:

x = d₀ + (d₁ + (d₂ + (d₃ + ⋯) / 10) / 10) / 10
Here, d₀ is the integer part, and d₁, d₂, d₃, d₄, etc. are called the decimal digits. Notice the symmetry here: after the integer part, a decimal has inside of it another decimal representing its tenths.
This recursive structure leads to the main property that we care about with decimals: we can truncate them. If I have a hundred-digit decimal number, and don’t need that much accuracy, I can simply chop off what I don’t need to get a simpler approximation. This is the same property that lets me write a decimal representation of an irrational number. As long as we accept that infinite sequences exist — and as Haskell programmers, infinite data structures don’t scare us… much — there is a unique infinite sequence of digits that represents any irrational number. That the sequence must be infinite is to be expected, since there are uncountably many irrationals, but any finite representation can take only countably many values. (Unfortunately, this uniqueness isn’t quite true of rational numbers, since 0.4999… and 0.5 are the same number, for example!)
As nice as that is, though, there are also some significant disadvantages to decimals as a representation:
Although a few rational numbers have simple decimal representations, almost all rational numbers actually need infinite repeating decimals. Specifically, the decimal expansion of a rational number whose denominator has any prime factor other than 2 or 5 repeats indefinitely.
Why 10? This is a question for historians, anthropologists, perhaps biologists, but has no answer within mathematics. It is a choice that matters, too! In a different base, a different subset of the rational numbers will have finite decimals. It’s the choice of base 10 that makes it harder to compute with 1/3. Having to make an arbitrary choice about something that matters so much is unfortunate.
We can make up some ground on these disadvantages with a third representation: continued fractions. A continued fraction is an expression of this form:

x = n₀ + 1 / (n₁ + 1 / (n₂ + 1 / (n₃ + ⋯)))
Just like with decimals, this gives a representation of a real number as a (possibly finite, or possibly infinite) sequence [n₀, n₁, n₂, n₃, n₄, …]. Now we call the integer parts terms, instead of digits. The first term, n₀, is the integer part. The difference is that when we have a remainder to write, we take its reciprocal and continue the sequence with that.
Note:
Continued fractions represent all rational numbers as finite sequences of terms, while still accounting for all irrationals using infinite sequences.
Continued fractions do not depend on an arbitrary choice of base. The reciprocal is the same regardless of how we choose to write numbers.
Continued fractions, like decimals, can be truncated to produce a rational approximation. Unlike decimals, though, the approximations are optimal in the sense that they are as close as possible to the original number without needing a larger denominator.
With this motivation in mind, let’s see what we can do about using continued fractions to represent real numbers.
Part 1: Types and Values
Time to write some code. As a declarative language that promotes equational reasoning, Haskell is a joy to use for problems like this one.
To start, we want a new type for continued fractions. A continued fraction is a (finite or infinite) sequence of terms, so we could simply represent it with a Haskell list. However, I find it more readable to define a new type so that we can choose constructors with suggestive names. (It helps with this decision that standard list combinators like map or ++ don’t have obvious interpretations for continued fractions.) I’ve defined the type like this.
data CFrac where
  (:+/) :: Integer -> CFrac -> CFrac
  Inf :: CFrac

deriving instance Eq CFrac
deriving instance Show CFrac
infixr 5 :+/
Here, x :+/ y means x + 1/y, where x is an integer, and y is another continued fraction (the reciprocal of the remainder). This is the basic building block of continued fractions, as we saw above.
The other constructor may be surprising, though. I’ve defined a special continued fraction representing infinity! Why? We’ll use it for terminating fractions. When there is no remainder, we have x = x + 0 = x + 1/∞, so Inf is the reciprocal of the remainder for a continued fraction that represents an integer. That we can represent ∞ itself as a top-level continued fraction wasn’t the goal, but it doesn’t seem worth dodging. Noticing that an empty list of terms acts like infinity is actually a great intuition for things to come.
There is a simple one-to-one correspondence between continued fractions and their lists of terms, as follows:
The :+/ operator corresponds to the list cons operator, :.
The Inf value corresponds to the empty list, [].
We can define conversions to witness this correspondence between CFrac and term lists.
terms :: CFrac -> [Integer]
terms Inf = []
terms (n :+/ x) = n : terms x
The real fun of continued fractions, though, is that we can represent irrational numbers precisely, as well!
Rational numbers are precisely the solutions to linear equations with integer coefficients. The simplest irrational numbers to write as continued fractions are the quadratic irrationals: numbers that are not rational, but are solutions to quadratic equations with integer coefficients. These are precisely the continued fractions whose terms eventually repeat. We can write a function to build these:
cycleTerms :: [Integer] -> CFrac
cycleTerms ns = fix (go ns)
  where
    go [] x = x
    go (t : ts) x = t :+/ go ts x
And then we can write a quick catalog of easy continued fractions, including some small square roots and the golden ratio.
sqrt2 :: CFrac
sqrt2 = 1 :+/ cycleTerms [2]

sqrt3 :: CFrac
sqrt3 = 1 :+/ cycleTerms [1, 2]

sqrt5 :: CFrac
sqrt5 = 2 :+/ cycleTerms [4]

phi :: CFrac
phi = cycleTerms [1]
It’s really worth pausing to think how remarkable it is that these fundamental quantities, whose decimal representations have no obvious patterns at all, are so simple as continued fractions!
And it doesn’t end there. Euler’s constant e, even though it’s a transcendental number, also has a simple pattern as a continued fraction:

e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, …]
This really strengthens the claim I made earlier, that being based on the reciprocal instead of multiplying by an arbitrary base makes continued fractions somehow less arbitrary than decimals.
I wish I could tell you that π has a similar nice pattern, but alas, it does not. π has a continued fraction just as random as its representation in other notations, and just as hard to compute to arbitrary precision.
It will be interesting later, though, to look at its first few terms. For that, it’s good enough to use the Double approximation to π from Haskell’s base library. It’s not really π; in fact, it’s a rational number! But it’s close enough for our purposes. Computing an exact value of π needs some more machinery, which we will develop in the next section.
We’ll now return to a more theoretical concern. We want each real number to have a unique continued fraction representation. In fact, I snuck something by you earlier when I derived an Eq instance for CFrac, because that instance is only valid when the type has unique representations. However, this isn’t quite true in general. Here are some examples of continued fractions that are written differently, but have the same value:

(-1) :+/ 2 :+/ Inf  =  0 :+/ (-2) :+/ Inf     (both are -1/2)   (1)
2 :+/ 0 :+/ 3 :+/ Inf  =  5 :+/ Inf           (both are 5)      (2)
2 :+/ 3 :+/ 1 :+/ Inf  =  2 :+/ 4 :+/ Inf     (both are 9/4)    (3)
Case (1) deals with negative numbers. These will actually cause quite a few problems — not just now, but in convergence properties of algorithms later, too. I have been quietly assuming up to this point that all the numbers we’re dealing with are nonnegative. Let’s make that assumption explicit, and require that all continued fraction terms are positive. That takes care of case (1).
This might seem like a terrible restriction. In the end, though, we can recover negative numbers in the same way humans have done so for centuries: by keeping track of a sign, separate from the absolute value. A signed continued fraction type, then, would wrap CFrac and include a Bool field for whether it’s negative. This is entirely straightforward, and I leave it as an exercise for the interested reader.
Case (2) involves a term with a value of zero. It’s normal for the integer part of a continued fraction to be zero. After that, though, the remaining terms should never be zero, because they are the reciprocals of numbers less than one. So we will impose a second rule that only the first term of a continued fraction may be zero.
Case (3) involves a rational continued fraction with 1 as its final term. It turns out that even after solving cases (1) and (2), every rational number has two distinct continued fractions: one that has a 1 that is not the integer part as its last term, and one that doesn’t. This is analogous to the fact that terminating decimals have two decimal representations, one ending in an infinite sequence of 0s, and the other in an infinite sequence of 9s. In this case, the trailing 1 is longer than it needs to be, so we’ll disallow it, making a third rule that a term sequence can never end in 1, unless that 1 is the integer part.
Subject to these three rules, we get the canonical forms we wanted. It’s unfortunate that non-canonical forms are representable in the CFrac type (i.e., we have failed to make illegal data unrepresentable), but we can do the next best thing: check that our computations all produce canonical values. To do so, let’s write some code to check that a CFrac obeys these rules.
One very powerful idea that originated in the Haskell community is property testing. We state a property that we want to verify about our code, and allow a testing framework like QuickCheck to make up examples and test them. Now is a great time to try this out. Here I’ve defined a few properties around canonical forms. The first is a trivial property that just asserts that we make the right decision about the specific examples above. The second, less trivial, guarantees that our cfFromRational function, the only real semantic function we’ve written, produces canonical forms.
(We could try to check that the irrational values are canonical, as well, but alas, we run into nontermination since they are infinite. This will be a theme: we’ll have to be satisfied with checking rational test cases, and relying on the fact that any bugs with irrational values will manifest in some nearby rational value.)
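Neither the canonicality checker nor cfFromRational appears in this excerpt, so here is a sketch of both, consistent with how the properties use them (the CFrac type is repeated so the block stands alone):

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

import Data.Ratio ((%), denominator, numerator)

-- CFrac as defined earlier, repeated so this sketch runs standalone.
data CFrac where
  (:+/) :: Integer -> CFrac -> CFrac
  Inf :: CFrac

deriving instance Eq CFrac
deriving instance Show CFrac

infixr 5 :+/

-- Canonical form: a non-negative integer part (rule 1), strictly
-- positive later terms (rules 1 and 2), and no trailing 1 except as
-- the integer part (rule 3).  Only terminates on finite inputs.
isCanonical :: CFrac -> Bool
isCanonical Inf = True
isCanonical (n0 :+/ rest) = n0 >= 0 && go rest
  where
    go Inf = True
    go (1 :+/ Inf) = False          -- trailing 1 past the integer part
    go (n :+/ x) = n >= 1 && go x

-- Repeated Euclidean division: each quotient is the next term, and we
-- recurse on (denominator, remainder) to take the reciprocal.  Because
-- remainders strictly decrease, later quotients are at least 1, and
-- the final recursive quotient is at least 2, so the result is
-- canonical.
cfFromRational :: Rational -> CFrac
cfFromRational r = go (numerator r) (denominator r)
  where
    go _ 0 = Inf
    go n d = q :+/ go d r'
      where
        (q, r') = n `divMod` d
```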
We can now run our code for this section. Try it yourself at https://code.world/haskell#PVxpksYe_YAcGw3CNdvEFzw. In addition to running the tests, we’ll print out our example continued fractions so we can see them.
In the previous section, we converted from rational numbers into continued fractions. We now turn to the inverse conversion: from a continued fraction to a rational number.
Not all continued fractions correspond to rational numbers. However, one of the key benefits of the continued fraction representation is that its truncated term sequences represent particularly efficient rational approximations. These rational approximations are called the convergents. It’s not hard to write a simple recursive function to compute the convergents.
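The naive implementation itself is elided in this excerpt; here is a reconstruction matching the description that follows, where each convergent of n :+/ x is either n itself or n + 1/c for a convergent c of the tail (CFrac is repeated so the sketch stands alone):

```haskell
{-# LANGUAGE GADTs #-}

-- CFrac repeated from earlier so this block runs standalone.
data CFrac where
  (:+/) :: Integer -> CFrac -> CFrac
  Inf :: CFrac

infixr 5 :+/

-- The first convergent of (n :+/ x) is n itself; the rest arise by
-- applying \c -> fromInteger n + 1 / c to each convergent of the tail.
-- Mapping that lambda over every recursive result is what makes this
-- quadratic in the number of terms.
naiveConvergents :: CFrac -> [Rational]
naiveConvergents Inf = []
naiveConvergents (n :+/ x) =
  fromInteger n : map (\c -> fromInteger n + 1 / c) (naiveConvergents x)
```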
We can also use QuickCheck to write a property test, verifying that a rational number is its own final convergent. This gives a sanity check on our implementation.
prop_naiveConvergents_end_at_Rational :: NonNegative Rational -> Property
prop_naiveConvergents_end_at_Rational (NonNegative r) =
    last (naiveConvergents (cfFromRational r)) === r
You may guess from my choice of name that I am not happy with this implementation. The problem with naiveConvergents is that it recursively maps a lambda over the entire tail of the list. As it recurses deeper into the list, we build up a bunch of nested mapped lambdas, and end up evaluating O(n²) of these lambdas to produce only n convergents.
Mobius transformations
Let’s fix that quadratic slowdown. A more efficient implementation can be obtained by defunctionalizing. The lambdas have the form \x -> fromInteger n + 1 / x. To avoid quadratic work, we need a representation that lets us compose functions of this form into something that can be applied in constant time. What works is mobius transformations.
A mobius transformation is a function of the form:

f(x) = (a·x + b) / (c·x + d)
For our purposes, the coefficients a, b, c, and d are always integers. Instead of an opaque lambda, we represent a mobius transformation by its four coefficients.
data Mobius where
  Mobius :: Integer -> Integer -> Integer -> Integer -> Mobius

deriving instance Eq Mobius
deriving instance Show Mobius
We seek to rewrite our computation of convergents using mobius transformations. First of all, we will restructure it as a left fold, which exposes the mobius transformation itself as an accumulator. We’ll want these building blocks:
An identity mobius transformation, to initialize the accumulator.
A composition to combine two mobius transformations into a single one.
In other words, mobius transformations should form a monoid!
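The instances themselves are elided in this excerpt. Under the usual reading of Mobius a b c d as the matrix [[a, b], [c, d]], composition of transformations is matrix multiplication, and the identity transformation is the identity matrix; a sketch of the instances:

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

-- Mobius repeated from earlier so this block runs standalone.
data Mobius where
  Mobius :: Integer -> Integer -> Integer -> Integer -> Mobius

deriving instance Eq Mobius
deriving instance Show Mobius

-- Composing (a1 x + b1)/(c1 x + d1) after (a2 x + b2)/(c2 x + d2)
-- multiplies the corresponding 2x2 matrices.
instance Semigroup Mobius where
  Mobius a1 b1 c1 d1 <> Mobius a2 b2 c2 d2 =
    Mobius
      (a1 * a2 + b1 * c2)
      (a1 * b2 + b1 * d2)
      (c1 * a2 + d1 * c2)
      (c1 * b2 + d1 * d2)

-- The identity matrix represents f(x) = x.
instance Monoid Mobius where
  mempty = Mobius 1 0 0 1
```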
There are axioms for monoids: mempty should act like an identity, and <> should be associative. We can test these with property tests.
This is as good a time as any to set up a QuickCheck generator for mobius transformations. To keep things sane, we always want to choose mobius transformations that have nonzero denominators, since otherwise the function is undefined on all inputs. We will also want nonnegative coefficients in the denominator, since this ensures the transformation is defined (and monotonic, which will matter later on) for all nonnegative input values.
instance Arbitrary Mobius where
  arbitrary =
    suchThat
      ( Mobius
          <$> arbitrary
          <*> arbitrary
          <*> (getNonNegative <$> arbitrary)
          <*> (getNonNegative <$> arbitrary)
      )
      (\(Mobius _ _ c d) -> max c d > 0)

  shrink (Mobius a b c d) =
    [ Mobius a' b' c' d'
      | (a', b', NonNegative c', NonNegative d') <-
          shrink (a, b, NonNegative c, NonNegative d),
        max c' d' > 0
    ]
With the generator in place, the tests are trivial to write:
prop_Mobius_id :: Mobius -> Property
prop_Mobius_id m = mempty <> m === m .&&. m <> mempty === m
Now we return to computing convergents. Here is the improved implementation.
convergents :: CFrac -> [Rational]
convergents = go mempty
  where
    go m Inf = []
    go m (n :+/ x) = mobiusLimit m' : go m' x
      where
        m' = m <> Mobius n 1 1 0

mobiusLimit :: Mobius -> Rational
mobiusLimit (Mobius a _ c _) = a % c
In the expression go m x, m is a mobius transformation that turns a convergent of x into a convergent for the entire continued fraction. When x is the entire continued fraction, m is the identity function. As we recurse into the continued fraction, we compose transformations onto m so that this remains true. The lambda \x -> fromInteger n + 1 / x from the naive implementation is now defunctionalized into Mobius n 1 1 0.
The remaining task is to compute a truncated value of the continued fraction at each step. Recall that a terminating continued fraction is one that ends with an infinite term. To determine this truncated value, then, we want to consider the limit of the accumulated mobius transformation as its input tends to infinity. This is:

lim (x → ∞) (a·x + b) / (c·x + d) = a / c
and that completes the implementation. A quick test helps us to be confident in the result.
Let’s also consider the conversion from continued fractions to decimals. This time, both representations can be approximated by just truncating, so instead of producing a sequence of approximations as we did with rational numbers, we will produce just a single decimal that lazily extends its representation with more precision.
Mobius transformations are still a good tool for this job, but we need to refine our earlier observations. Consider the general form of a mobius transformation again:

f(x) = (a·x + b) / (c·x + d)
It’s worth taking the time now to play around with different choices for a, b, c, and d, and what they do to the function. You can do so with the Desmos link below:
We previously noted one of these bounds, and now add the other:

f(0) = b / d, and f(x) → a / c as x → ∞
As long as c and d are positive (remember when we agreed to keep them so?), f is monotonic on the entire interval [0, ∞), so f is bounded by these two fractions. In particular, if a/c and b/d have the same integer part, then we know this is the integer part of the result of f, regardless of the (always nonnegative) value of x. We can express this insight as a function:
mobiusIntPart :: Mobius -> Maybe Integer
mobiusIntPart (Mobius a b c d)
  | c /= 0 && d /= 0 && n1 == n2 = Just n1
  | otherwise = Nothing
  where
    n1 = a `div` c
    n2 = b `div` d
We can now proceed in a manner similar to the computation of convergents. We maintain a mobius transformation which maps the remaining continued fraction to the remaining decimal. Initially, that is the identity. But this time, instead of blindly emitting an approximation at each step, we make a choice:
If the mobius transformation is bounded to guarantee the integer part of its result, then we can produce that decimal digit. We then update the transformation m to yield the remaining decimal places after that. This will tend to widen its bounds.
Otherwise, we will pop off the integer part of the continued fraction, and update the transformation to expect the remaining continued fraction terms after that. This will tend to narrow its bounds, so we’ll be closer to producing a new decimal digit.
If we ever end up with the zero transformation (that is, a and b are both zero), then all remaining decimal places will be zero, so we can stop. If we encounter an infinite term of input, though, we still need to continue, but we can update b and d to match a and c, narrowing those bounds to a single point to indicate that we now know the exact value of the input.
Here is the implementation:
cfToBase :: Integer -> CFrac -> [Integer]
cfToBase base = go mempty
  where
    go (Mobius 0 0 _ _) _ = []
    go m x
      | Just n <- mobiusIntPart m,
        let m' = Mobius base (-base * n) 0 1 <> m =
          n : go m' x
    go m (n :+/ x) = go (m <> Mobius n 1 1 0) x
    go (Mobius a _ c _) Inf = go (Mobius a a c c) Inf
cfToDecimal :: CFrac -> String
cfToDecimal Inf = "Inf"
cfToDecimal x = case cfToBase 10 x of
  [] -> "0.0"
  [z] -> show z ++ ".0"
  (z : digits) -> show z ++ "." ++ concatMap show digits
To test this, we will compare the result to standard Haskell output. However, Haskell’s typical Show instances for fractional types are too complex, producing output like 1.2e6 that we don’t want to match. The Numeric module has the simpler functionality we want. There are some discrepancies due to rounding error, as well, so the test will ignore differences in the last digit of output from the builtin types.
prop_cfToDecimal_matchesRational :: NonNegative Rational -> Property
prop_cfToDecimal_matchesRational (NonNegative r) =
    take n decFromRat === take n decFromCF
  where
    decFromRat = showFFloat Nothing (realToFrac r :: Double) ""
    decFromCF = cfToDecimal (cfFromRational r)
    n = max 10 (length decFromRat - 1)
Generalized Continued Fractions
Earlier, we settled for only an approximation of π, and I mentioned that π doesn’t have a nice pattern as a continued fraction. That’s true, but it’s not the whole story. If we relax the definition of a continued fraction just a bit, there are several known expressions for π as a generalized continued fraction. The key is to allow numerators other than 1.
Here’s one that’s fairly nice:

π = 4 / (1 + 1² / (3 + 2² / (5 + 3² / (7 + ⋯))))
Generalized continued fractions don’t have many of the same nice properties that standard continued fractions do. They are not unique, and sometimes converge very slowly (or not at all!), yielding poor rational approximations when truncated. But we can compute with them using mobius transformations, using the same algorithms from above. In particular, we can use the same tools as above to convert from a generalized continued fraction to a standard one.
First, we’ll define a type for generalized continued fractions.
data GCFrac where
  (:+#/) :: (Integer, Integer) -> GCFrac -> GCFrac
  GInf :: GCFrac
deriving instance Show GCFrac
This time I haven’t defined an Eq instance, because representations are not unique. Converting from a standard to a generalized continued fraction is trivial.
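The conversion itself isn’t shown above, but since a standard continued fraction is just a generalized one whose numerators are all 1, it presumably amounts to tagging each term with a 1. Here’s a standalone sketch, with minimal local re-declarations of the CFrac and GCFrac types so it can run in isolation:

```haskell
-- Minimal standalone re-declarations, so the conversion can be shown alone.
data CFrac = Integer :+/ CFrac | Inf
infixr 5 :+/

data GCFrac = (Integer, Integer) :+#/ GCFrac | GInf
infixr 5 :+#/

-- A standard continued fraction is a generalized one with every numerator 1.
cfToGCFrac :: CFrac -> GCFrac
cfToGCFrac Inf = GInf
cfToGCFrac (n :+/ x) = (n, 1) :+#/ cfToGCFrac x
```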
The conversion in the other direction, though, requires a mobius-like algorithm.
gcfToCFrac :: GCFrac -> CFrac
gcfToCFrac = go mempty
  where
    go (Mobius a _ c _) GInf = cfFromFrac a c
    go m gcf@((int, numer) :+#/ denom)
      | Just n <- mobiusIntPart m =
          n :+/ go (Mobius 0 1 1 (-n) <> m) gcf
      | otherwise = go (m <> Mobius int numer 1 0) denom
We can write a property test to verify that, at least, this gives a correct round trip from continued fractions to generalized, and back to standard again.
prop_GCFrac_roundtrip :: NonNegative Rational -> Property
prop_GCFrac_roundtrip (NonNegative r) =
  gcfToCFrac (cfToGCFrac x) === x
  where
    x = cfFromRational r
Here’s the definition of π above, expressed in code, and converted to a continued fraction. I’ve written a test to verify that it mostly agrees with the approximate value we defined earlier, just as a sanity check on the implementation.
gcfPi = (0, 4) :+#/ go 1
  where
    go i = (2 * i - 1, i ^ 2) :+#/ go (i + 1)
exactPi = gcfToCFrac gcfPi
prop_correct_pi :: Property
prop_correct_pi =
  take 17 (cfToDecimal approxPi) === take 17 (cfToDecimal exactPi)
Results
We’ve now built enough to get some interesting results. For example, we have exact representations of several irrational numbers, and we can use those to obtain both rational approximations and long decimal representations of each.
In addition to running our tests, we will print the continued fraction terms, convergents, and decimal representation of each of our test values. Check out the full code at https://code.world/haskell#PemataxO6pmYz99dOpaeT9g.
Most of these are exact irrational values, so we can only look at a finite prefix of the convergents, which go on forever getting more and more accurate (at the expense of larger denominators). I’ve chopped them off at 50 characters, so that they comfortably fit on the screen, but you can change that to see as much of the value as you like.
With this data in front of us, we can learn more about continued fractions and how they behave.
Did you notice that the convergents of π jump from 355/113 all the way to 103993/33102? That’s because 355/113 is a remarkably good approximation of π. You have to jump to a much larger denominator to get a better approximation. You can observe this same fact directly from the continued fraction terms. The fifth term is 292, which is unusually large. 292 is the next term after the truncation that yields the convergent 355/113. Large terms like that in a continued fraction indicate unusually good rational approximations.
Conversely, what if there are no large continued fraction terms? More concretely, what can we say about the number that has all 1s as its continued fraction? That’s the golden ratio! What this means is that there are no particularly good rational approximations to the golden ratio. And you can see that, and hear it!
Flowers tend to grow new petals or seeds at an angle determined by the golden ratio, specifically because there are no small intervals (i.e., small denominators) after which they will overlap.
Musical notes sound harmonic when their frequencies are close to simple rational numbers. The least harmonic frequencies are those related by the golden ratio, and you can hear it!
(By the way, did you notice the Fibonacci numbers in the convergents of the golden ratio? That’s a whole other topic I won’t get into here.)
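That observation is easy to check directly: each truncation of [1; 1, 1, …] is 1 + 1/previous, and the resulting convergents are exactly ratios of consecutive Fibonacci numbers. A quick standalone sketch (the names are mine):

```haskell
import Data.Ratio ((%))

-- Convergents of the all-ones continued fraction, computed directly:
-- each truncation is 1 + 1/previous, starting from 1.
goldenConvergents :: [Rational]
goldenConvergents = iterate (\c -> 1 + recip c) 1

-- Consecutive Fibonacci numbers, for comparison.
fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
```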
Part 3: Arithmetic
Now that we’ve got a nice type and some conversions, let’s turn our attention to computing with continued fractions.
There are a few computations that are easy, so let’s warm up with those.
Computing the reciprocal of a continued fraction turns out to be almost trivial, because continued fractions are built out of reciprocals! All we need to do for this one is shift all the terms by one… in either direction, really. Since our normal form only allows zero as the first term, we’ll make the decision based on whether the whole part is already zero. We also need a special case for 1 to prevent building a non-normal representation.
cfRecip :: CFrac -> CFrac
cfRecip (0 :+/ x) = x
cfRecip (1 :+/ Inf) = 1 :+/ Inf
cfRecip x = 0 :+/ x
We will test a few expected properties:
The reciprocal is self-inverse.
The result matches taking a reciprocal in the rationals.
The reciprocal always gives an answer in normal form.
Comparing two continued fractions is also not too difficult. It’s similar to comparing decimals, in that you look down the sequence for the first position where they differ, and that determines the result. However, since we take reciprocals at each term of a continued fraction, we’ll need to flip the result at each term.
cfCompare :: CFrac -> CFrac -> Ordering
cfCompare Inf Inf = EQ
cfCompare _ Inf = LT
cfCompare Inf _ = GT
cfCompare (a :+/ a') (b :+/ b') = case compare a b of
  EQ -> cfCompare b' a'
  r -> r
Let’s write a QuickCheck property to verify that the result matches the Ord instance for Rational, and then define an Ord instance of our own.
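The instance definitions themselves don’t appear above; presumably they just delegate to cfCompare. Here’s a standalone sketch, with a minimal CFrac re-declared locally so it runs on its own:

```haskell
-- Minimal standalone CFrac, with comparison as described in the post.
data CFrac = Integer :+/ CFrac | Inf
infixr 5 :+/

cfCompare :: CFrac -> CFrac -> Ordering
cfCompare Inf Inf = EQ
cfCompare _ Inf = LT
cfCompare Inf _ = GT
cfCompare (a :+/ a') (b :+/ b') = case compare a b of
  EQ -> cfCompare b' a'   -- note the swap: each level is a reciprocal
  r -> r

-- Both instances fall out of cfCompare. (They only terminate when the
-- two fractions differ at some finite depth, or are both finite.)
instance Eq CFrac where
  x == y = cfCompare x y == EQ

instance Ord CFrac where
  compare = cfCompare
```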
To make any progress on the rest of basic arithmetic, we must return to our trusty old hammer: the mobius transformation. Recall that we’ve used the mobius transformation in two ways so far:
To produce convergents, we produced mobius transformations from terms, composed them together, and also produced a new approximation at each step.
To produce decimals, we looked at the bounds on the result of a mobius transformation, and proceeded in one of two ways: produce a bit of output, and then add a new mobius transformation to the output side; or consume a term of input, and then add a new mobius transformation to the input side.
We will now implement mobius transformations that act on continued fractions and produce new continued fractions. The strategy is the same as it was when producing decimals, but modified to produce continued fraction terms as output instead of decimal digits.
That leads to this implementation.
cfMobius :: Mobius -> CFrac -> CFrac
cfMobius (Mobius a _ c _) Inf = cfFromFrac a c
cfMobius (Mobius _ _ 0 0) _ = Inf
cfMobius m x
  | Just n <- mobiusIntPart m =
      n :+/ cfMobius (Mobius 0 1 1 (-n) <> m) x
cfMobius m (n :+/ x) = cfMobius (m <> Mobius n 1 1 0) x
There are two properties to test here. The first is that computations on continued fractions match the same computations on rational numbers. To implement this test, we’ll need an implementation of mobius transformations on rational numbers. Then we’ll test that cfMobius gives results in their canonical form. For both tests, we don’t care about transformations whose rational results are negative, as we will just agree not to evaluate them.
mobius :: (Eq a, Fractional a) => Mobius -> a -> Maybe a
mobius (Mobius a b c d) x
  | q == 0 = Nothing
  | otherwise = Just (p / q)
  where
    p = fromInteger a * x + fromInteger b
    q = fromInteger c * x + fromInteger d
prop_cfMobius_matches_Rational :: Mobius -> NonNegative Rational -> Property
prop_cfMobius_matches_Rational m (NonNegative r) =
  case mobius m r of
    Just x | x >= 0 -> cfMobius m (cfFromRational r) === cfFromRational x
    _ -> discard
prop_cfMobius_isCanonical :: Mobius -> NonNegative Rational -> Property
prop_cfMobius_isCanonical m (NonNegative r) =
  case mobius m r of
    Just rat | rat >= 0 ->
      let x = cfMobius m (cfFromRational r)
       in counterexample (show x) (isCanonical x)
    _ -> discard
The punchline here is that, using an appropriate mobius transformation, we can perform any of the four arithmetic operations (addition, subtraction, multiplication, or division) involving one continued fraction and one rational number.
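Concretely, for a rational p/q, each of the four operations is a single set of mobius coefficients. Here’s a standalone check of that claim against the Rational model, with Mobius re-declared locally; the helper names are mine:

```haskell
import Data.Ratio ((%))

-- Standalone re-declaration of Mobius, applied to Rational rather than
-- CFrac, so the coefficient patterns are easy to verify.
data Mobius = Mobius Integer Integer Integer Integer

apply :: Mobius -> Rational -> Rational
apply (Mobius a b c d) x =
  (fromInteger a * x + fromInteger b) / (fromInteger c * x + fromInteger d)

-- x + p/q = (q*x + p) / q        x - p/q = (q*x - p) / q
-- x * p/q = (p*x) / q            x / (p/q) = (q*x) / p
addRat, subRat, mulRat, divRat :: Integer -> Integer -> Mobius
addRat p q = Mobius q p 0 q
subRat p q = Mobius q (negate p) 0 q
mulRat p q = Mobius p 0 0 q
divRat p q = Mobius q 0 0 p
```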
Binary operations on continued fractions
This was all sort of a warm-up, though, for the big guns: binary operations on two continued fractions, yielding a new continued fraction. Here, for the first time, our trusted mobius transformations fail us. Indeed, this remained an unsolved problem until 1972, when Bill Gosper devised a solution. That solution uses functions of the form z(x, y) = (a·x·y + b·x + c·y + d) / (e·x·y + f·x + g·y + h).
Again, for our purposes, all coefficients will be integers, and the coefficients in the denominator will always be positive. Since these are the generalization of mobius transformations to binary operations, we can call them the bimobius transformations.
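The declaration of the type itself doesn’t appear above; judging from the BM constructor used throughout the rest of the code, it is presumably just eight integer coefficients:

```haskell
-- Presumed shape of the type (an assumption reconstructed from usage):
-- z = (a*x*y + b*x + c*y + d) / (e*x*y + f*x + g*y + h)
data Bimobius
  = BM Integer Integer Integer Integer
       Integer Integer Integer Integer
```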
deriving instance Eq Bimobius

deriving instance Show Bimobius
The rest of the implementation nearly writes itself! We simply want to follow the same old strategy. We keep a bimobius transformation that computes the rest of the output, given the rest of the inputs. At each step, we emit more output if possible, and otherwise consume a term from one of the two inputs. As soon as either input is exhausted (if, indeed, that happens), the remaining calculation reduces to a simple mobius transformation of the remaining argument.
First, we need an analogue of mobiusIntPart, which was how we determined whether an output was available.
bimobiusIntPart :: Bimobius -> Maybe Integer
bimobiusIntPart (BM a b c d e f g h)
  | e /= 0 && f /= 0 && g /= 0 && h /= 0
      && n2 == n1 && n3 == n1 && n4 == n1 =
      Just n1
  | otherwise = Nothing
  where
    n1 = a `div` e
    n2 = b `div` f
    n3 = c `div` g
    n4 = d `div` h
The other building block we used to implement mobius transformations was composition with the Monoid instance. Unfortunately, that’s not so simple, since bimobius transformations have two inputs. There are now three composition forms. First, we can compose the mobius transformation on the left, consuming the output of the bimobius:
(<>||) :: Mobius -> Bimobius -> Bimobius
-- (mob <>|| bimob) x y = mob (bimob x y)
Mobius a1 b1 c1 d1 <>|| BM a2 b2 c2 d2 e2 f2 g2 h2 =
  BM a b c d e f g h
  where
    a = a1 * a2 + b1 * e2
    b = a1 * b2 + b1 * f2
    c = a1 * c2 + b1 * g2
    d = a1 * d2 + b1 * h2
    e = c1 * a2 + d1 * e2
    f = c1 * b2 + d1 * f2
    g = c1 * c2 + d1 * g2
    h = c1 * d2 + d1 * h2
We can also compose it on the right side, transforming either the first or second argument of the bimobius.
(||<) :: Bimobius -> Mobius -> Bimobius
-- (bimob ||< mob) x y = bimob (mob x) y
BM a1 b1 c1 d1 e1 f1 g1 h1 ||< Mobius a2 b2 c2 d2 =
  BM a b c d e f g h
  where
    a = a1 * a2 + c1 * c2
    b = b1 * a2 + d1 * c2
    c = a1 * b2 + c1 * d2
    d = b1 * b2 + d1 * d2
    e = e1 * a2 + g1 * c2
    f = f1 * a2 + h1 * c2
    g = e1 * b2 + g1 * d2
    h = f1 * b2 + h1 * d2
(||>) :: Bimobius -> Mobius -> Bimobius
-- (bimob ||> mob) x y = bimob x (mob y)
BM a1 b1 c1 d1 e1 f1 g1 h1 ||> Mobius a2 b2 c2 d2 =
  BM a b c d e f g h
  where
    a = a1 * a2 + b1 * c2
    b = a1 * b2 + b1 * d2
    c = c1 * a2 + d1 * c2
    d = c1 * b2 + d1 * d2
    e = e1 * a2 + f1 * c2
    f = e1 * b2 + f1 * d2
    g = g1 * a2 + h1 * c2
    h = g1 * b2 + h1 * d2
That’s a truly dizzying bunch of coefficients! I’m definitely not satisfied without property tests to ensure they match the intended meanings. To do that, we need two pieces of test code.
First, we want to generate random bimobius transformations with an Arbitrary instance. As with mobius transformations, the instance will guarantee that the denominator coefficients are nonnegative and not all zero.
shrink (BM a b c d e f g h) =
  [ BM a' b' c' d' e' f' g' h'
    | ( a', b', c', d',
        NonNegative e', NonNegative f', NonNegative g', NonNegative h'
        ) <-
        shrink
          ( a, b, c, d,
            NonNegative e, NonNegative f, NonNegative g, NonNegative h
          ),
      maximum [e', f', g', h'] > 0
  ]
And second, we want a simple and obviously correct base implementation of the bimobius transformation to compare with.
bimobius :: (Eq a, Fractional a) => Bimobius -> a -> a -> Maybe a
bimobius (BM a b c d e f g h) x y
  | q == 0 = Nothing
  | otherwise = Just (p / q)
  where
    p = fromInteger a * x * y + fromInteger b * x + fromInteger c * y + fromInteger d
    q = fromInteger e * x * y + fromInteger f * x + fromInteger g * y + fromInteger h
With this in mind, we’ll check that each of the three composition operators performs as expected on rational numbers, at least. There’s one slight wrinkle: when composing the two functions manually leads to an undefined result, the composition operator sometimes cancels out the singularity and produces an answer. I’m willing to live with that, so I discard cases where the original result is undefined.
prop_mob_o_bimob :: Mobius -> Bimobius -> Rational -> Rational -> Property
prop_mob_o_bimob mob bimob r1 r2 =
  case mobius mob =<< bimobius bimob r1 r2 of
    Just ans -> bimobius (mob <>|| bimob) r1 r2 === Just ans
    Nothing -> discard
prop_bimob_o_leftMob :: Bimobius -> Mobius -> Rational -> Rational -> Property
prop_bimob_o_leftMob bimob mob r1 r2 =
  case (\x -> bimobius bimob x r2) =<< mobius mob r1 of
    Just ans -> bimobius (bimob ||< mob) r1 r2 === Just ans
    Nothing -> discard
prop_bimob_o_rightMob :: Bimobius -> Mobius -> Rational -> Rational -> Property
prop_bimob_o_rightMob bimob mob r1 r2 =
  case (\y -> bimobius bimob r1 y) =<< mobius mob r2 of
    Just ans -> bimobius (bimob ||> mob) r1 r2 === Just ans
    Nothing -> discard
Now to the task at hand: implementing bimobius transformations on continued fractions.
cfBimobius :: Bimobius -> CFrac -> CFrac -> CFrac
cfBimobius (BM a b _ _ e f _ _) Inf y = cfMobius (Mobius a b e f) y
cfBimobius (BM a _ c _ e _ g _) x Inf = cfMobius (Mobius a c e g) x
cfBimobius (BM _ _ _ _ 0 0 0 0) _ _ = Inf
cfBimobius bm x y
  | Just n <- bimobiusIntPart bm =
      let bm' = Mobius 0 1 1 (-n) <>|| bm
       in n :+/ cfBimobius bm' x y
cfBimobius bm@(BM a b c d e f g h) x@(x0 :+/ x') y@(y0 :+/ y')
  | g == 0 && h == 0 = consumeX
  | f == 0 || h == 0 = consumeY
  | abs (g * (h * b - f * d)) > abs (f * (h * c - g * d)) = consumeX
  | otherwise = consumeY
  where
    consumeX = cfBimobius (bm ||< Mobius x0 1 1 0) x' y
    consumeY = cfBimobius (bm ||> Mobius y0 1 1 0) x y'
There are a few easy cases at the start. If either argument is infinite, then the infinite terms dominate, and the bimobius reduces to a unary mobius transformation of the remaining argument. On the other hand, if the denominator is zero, the result is infinite.
Next is the output case. This follows the same logic as the mobius transformations above:
But since m and m’ are now bimobius transformations, we just use one of the special composition forms defined earlier.
The remaining case is where we don’t have a new term to output, and must expand a term of the input, using the now-familiar equation, but composing with the ||< or ||> operators instead.
There’s a new wrinkle here, though! Which of the two inputs should we expand? It seems best to make the choice that narrows the bounds the most, but I’ll be honest: I don’t actually understand the full logic behind this choice. I’ve simply copied it from Rosetta Code, where it’s not explained in any detail. We will rely on testing for confidence that it works.
Speaking of testing, we definitely need a few tests to ensure this works as intended. We’ll test the output against the Rational version of the function, and that the result is always in canonical form.
prop_cfBimobius_matches_Rational ::
  NonNegative Rational -> NonNegative Rational -> Bimobius -> Property
prop_cfBimobius_matches_Rational (NonNegative r1) (NonNegative r2) bm =
  case bimobius bm r1 r2 of
    Just x
      | x >= 0 ->
          cfFromRational x
            === cfBimobius bm (cfFromRational r1) (cfFromRational r2)
    _ -> discard
prop_cfBimobius_isCanonical ::
  NonNegative Rational -> NonNegative Rational -> Bimobius -> Bool
prop_cfBimobius_isCanonical (NonNegative r1) (NonNegative r2) bm =
  case bimobius bm r1 r2 of
    Just x
      | x >= 0 ->
          isCanonical (cfBimobius bm (cfFromRational r1) (cfFromRational r2))
    _ -> discard
If you like, it’s interesting to compare the implementations of continued fraction arithmetic here with some other sources, such as:
These sources are great, but I think the code above is nicer. The reason is that with lazy evaluation instead of mutation, you can read the code more mathematically as a set of equations, and see what’s going on more clearly. This is quite different from the imperative implementations, which modify state in hidden places and require a lot of mental bookkeeping to keep track of it all. As I said, Haskell is a joy for this kind of programming.
Tying it all together
We now have all the tools we need to write Num and Fractional instances for CFrac. That’s a great way to wrap this all together, so that clients of this code don’t need to worry about mobius and bimobius transformations at all.
checkNonNegative :: CFrac -> CFrac
checkNonNegative Inf = Inf
checkNonNegative x@(x0 :+/ x')
  | x0 < 0 = error "checkNonNegative: CFrac is negative"
  | x0 > 0 = x
  | otherwise = x0 :+/ checkNonNegative x'
instance Num CFrac where
  fromInteger n
    | n >= 0 = n :+/ Inf
    | otherwise = error "fromInteger: CFrac cannot be negative"
  (+) = cfBimobius (BM 0 1 1 0 0 0 0 1)

  x - y = checkNonNegative (cfBimobius (BM 0 1 (-1) 0 0 0 0 1) x y)
instance Fractional CFrac where
  fromRational x
    | x < 0 = error "fromRational: CFrac cannot be negative"
    | otherwise = cfFromRational x
  recip = cfRecip

  (/) = cfBimobius (BM 0 1 0 0 0 0 1 0)
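One case the instance code above doesn’t show is multiplication. By the same pattern it would presumably be (*) = cfBimobius (BM 1 0 0 0 0 0 0 1), since x·y = (1·x·y)/1. Here’s a standalone check of those coefficients against the Rational model, with BM and bimobius re-declared locally:

```haskell
import Data.Ratio ((%))

-- Local re-declarations, mirroring the definitions in the post.
data Bimobius
  = BM Integer Integer Integer Integer
       Integer Integer Integer Integer

bimobius :: Bimobius -> Rational -> Rational -> Maybe Rational
bimobius (BM a b c d e f g h) x y
  | q == 0 = Nothing
  | otherwise = Just (p / q)
  where
    p = fromInteger a * x * y + fromInteger b * x + fromInteger c * y + fromInteger d
    q = fromInteger e * x * y + fromInteger f * x + fromInteger g * y + fromInteger h

-- Presumed coefficients for multiplication: z = x*y / 1.
mulBM :: Bimobius
mulBM = BM 1 0 0 0 0 0 0 1
```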
You can find the complete code and tests for this section at https://code.world/haskell#P_T4CkfWGu3bgSwFV0R04Sg. In addition to running the tests, I’ve included a few computations, demonstrating for example that:
Part 4: Going faster
Everything we’ve done so far is correct, but there’s one way that it’s not terribly satisfying. It can do a lot of unnecessary work. In this part, we’ll try to fix this performance bug.
Consider, for example, a computation like x + 0.7. We know (and, indeed, Haskell also knows!) that 0.7 is a rational number. We should, then, be able to work this out by applying cfMobius to a simple mobius transformation.
But that’s not what GHC does. Instead, GHC reasons like this:
We assume that x is a CFrac. But + requires that both arguments have the same type, so 0.7 must also be a CFrac.
Since 0.7 is a literal, GHC will implicitly add a fromRational, applied to the rational form 7/10, which (using the Fractional instance defined in the last part) uses cfFromRational to convert 0.7 into the continued fraction 0 :+/ 1 :+/ 2 :+/ 3 :+/ Inf.
Now, since we have two CFrac values, we must use the heavyweight cfBimobius to compute the answer, choosing between the two streams at every step. In the process, cfBimobius must effectively undo the conversion of 7/10 into continued fraction form, doubling the unnecessary work.
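That expansion of 7/10 into [0; 1, 2, 3] is just the Euclidean algorithm. As a standalone model of what cfFromRational computes (the function name is mine):

```haskell
-- Continued-fraction terms of p/q by repeated division:
-- p/q = n + r/q, then recurse on q/r until the remainder is zero.
cfTerms :: Integer -> Integer -> [Integer]
cfTerms _ 0 = []
cfTerms p q = let (n, r) = p `divMod` q in n : cfTerms q r
```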
Rewriting expressions
We can fix this with GHC’s rewrite rules. They can avoid this unnecessary conversion, and reduce constant factors at the same time, by rewriting expressions to use cfMobius instead of cfBimobius when one argument need not be a continued fraction. Similarly, when cfMobius is applied to a rational argument, it can be replaced by a simpler call to cfFromFrac after just evaluating the mobius transformation on the Rational directly. Integers are another special case: less expensive because their continued fraction representations are trivial, but we might as well rewrite them, too.
{-# RULES
"cfrac/cfMobiusInt" forall a b c d n.
    cfMobius (Mobius a b c d) (n :+/ Inf) =
      cfFromFrac (a * n + b) (c * n + d)
"cfrac/cfMobiusRat" forall a b c d p q.
    cfMobius (Mobius a b c d) (cfFromFrac p q) =
      cfFromFrac (a * p + b * q) (c * p + d * q)
"cfrac/cfBimobiusInt1" forall a b c d e f g h n y.
    cfBimobius (BM a b c d e f g h) (n :+/ Inf) y =
      cfMobius (Mobius (a * n + c) (b * n + d) (e * n + g) (f * n + h)) y
"cfrac/cfBimobiusRat1" forall a b c d e f g h p q y.
    cfBimobius (BM a b c d e f g h) (cfFromFrac p q) y =
      cfMobius
        (Mobius (a * p + c * q) (b * p + d * q) (e * p + g * q) (f * p + h * q))
        y
"cfrac/cfBimobiusInt2" forall a b c d e f g h n x.
    cfBimobius (BM a b c d e f g h) x (n :+/ Inf) =
      cfMobius (Mobius (a * n + b) (c * n + d) (e * n + f) (g * n + h)) x
"cfrac/cfBimobiusRat2" forall a b c d e f g h p q x.
    cfBimobius (BM a b c d e f g h) x (cfFromFrac p q) =
      cfMobius
        (Mobius (a * p + b * q) (c * p + d * q) (e * p + f * q) (g * p + h * q))
        x
  #-}
There’s another opportunity for rewriting, and we’ve actually already used it! When mobius or bimobius transformations are composed together, we can do the composition in advance to produce a single transformation. Especially since these transformations usually have simple integer arguments that GHC knows how to optimize, this can unlock a lot of other arithmetic optimizations.
Fortunately, we’ve already defined and tested all of the composition operators we need here.
{-# RULES
"cfrac/mobius_o_mobius" forall m1 m2 x.
    cfMobius m1 (cfMobius m2 x) = cfMobius (m1 <> m2) x
"cfrac/mobius_o_bimobius" forall m bm x y.
    cfMobius m (cfBimobius bm x y) = cfBimobius (m <>|| bm) x y
"cfrac/bimobius_o_mobiusLeft" forall bm m x y.
    cfBimobius bm (cfMobius m x) y = cfBimobius (bm ||< m) x y
"cfrac/bimobius_o_mobiusRight" forall bm m x y.
    cfBimobius bm x (cfMobius m y) = cfBimobius (bm ||> m) x y
  #-}
Inlining, and not
This all looks great… but it doesn’t work. The problem is that GHC isn’t willing to inline the right things at the right times to get to exactly the form we need for these rewrite rules to fire. To fix this, I had to go back through all of the previous code, being careful to annotate many of the key functions with {-# INLINE ... #-} pragmas.
There are two forms of this pragma. When a trivial function might just get in the way of a rewrite rule, it’s annotated with a simple INLINE pragma so that GHC can inline and hopefully eliminate the function.
But when a function actually appears on the left-hand side of a rewrite rule, we need to step more carefully. Inlining that function can prevent the rewrite rule from firing. We could use NOINLINE for this, but I’ve instead picked the form of INLINE that includes a phase number. The phase number prevents inlining from occurring prior to that phase. It looks something like:
{-# INLINE [2] cfRecip #-}
Choosing a phase number seems to be something of a black art. I picked 2, and it worked, so… maybe try that one?
All these edits mean we have completely new versions of all of the code, and it’s in these files:
Rewrite rules are a bit of a black art, and it’s nice to at least get some confirmation that they are working. GHC can tell us what’s going on if we pass some flags. Here are some useful ones:
-ddump-rule-rewrites will print out a description of every rewrite rule that fires, including the before-and-after code.
-ddump-simpl will print out the final Core, which is GHC’s Haskell-like intermediate language, after the simplifier (which applies these rules) is done. Adding -dsuppress-all makes the result more readable by hiding metadata we don’t care about.
In CodeWorld, we can pass options to GHC by adding an OPTIONS_GHC pragma to the source code. If you looked at the Part 4 link above, you saw that I’ve done so. The output is long! But there are a few things we can notice.
First, we named our rules, so we can search for the rules by name to see if they have fired. They do! We see this:
I highlighted the timesInteger and plusInteger rules, as well. This is GHC taking the result of our rules, which involve a bunch of expressions like a1 * a2 + b1 * c2, and working out the math at compile time because it already knows the arguments. This is great! We want that to happen.
But what about the result? For this, we turn to the ddumpsimpl output. It’s a bit harder to read, but after enough digging, we can find this (after some cleanup):
This is essentially the expression take 50 (cfToDecimal (cfMobius (Mobius 1 1 0 2) sqrt5)). Recall that what we originally wrote was (1 + sqrt5) / 2. We have succeeded in turning an arithmetic expression of CFrac values into a simple mobius transformation at compile time!
Conclusion
We finally reach the end of our journey. There are still some loose ends, as there always are. For example:
There’s still the thing with negative numbers, which I left as an exercise.
Arithmetic hangs when computing with irrational numbers that produce rational results. For example, try computing sqrt2 * sqrt2, and you’ll be staring at a blank screen while your CPU fan gets some exercise. This might seem like a simple bug, but it’s actually a fundamental limitation of our approach. An incorrect term arbitrarily far into the expression for sqrt2 could change the integer part of the result, so any correct algorithm must read the entire continued fraction before producing an answer. But the entire fraction is infinite! You could try to fix this by special-casing the cyclic continued fractions, but you can never catch all the cases.
There’s probably plenty more low-hanging fruit for performance. I haven’t verified the rewrite rules in earnest, really. Something like Joachim Breitner’s inspection testing idea could bring more rigor to this. (Sadly, libraries for inspection testing require Template Haskell, so you cannot try it within CodeWorld.)
Other representations for exact real arithmetic offer more advantages. David Lester, for example, has written a very concise exact real arithmetic library using Cauchy sequences with exponential convergence.
In the end, though, I hope you’ve found this part of the journey rewarding in its own right. We developed a nontrivial library in Haskell, derived the code using equational reasoning (even if it was a bit fuzzy), tested it with property testing, and optimized it with rewrite rules. All of these play very well with Haskell’s declarative programming style.
I’m investigating a way to speed up persistent as well as make it more powerful, and one of the potential solutions involves persisting some global state across module boundaries.
I decided to investigate whether the “Global IORef Trick” would work for this.
Unfortunately, it doesn’t.
On reflection, it seems obvious: the interpreter for Template Haskell is a GHCi-like process that is loaded for each module.
Loading an interpreter for each module is part of why Template Haskell imposes a compile-time penalty: in my measurements, it’s something like ~100ms.
Not huge, but noticeable on large projects.
(I still generally find that DeriveGeneric and the related Generic code to be slower, but it’s a complex issue).
Anyway, let’s review the trick and observe the behavior.
Global IORef Trick
This trick allows you to have an IORef (or MVar) that serves as a global reference.
You almost certainly do not need to do this, but it can be a convenient way to hide state and make your program deeply mysterious.
You must write the {-# NOINLINE globalRef #-} pragma.
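For reference, the trick itself is tiny; this is the same globalRef that appears in the full module below, pulled out on its own:

```haskell
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- A single shared reference, created exactly once. NOINLINE is essential:
-- without it, GHC may duplicate the newIORef call at each use site.
globalRef :: IORef [String]
globalRef = unsafePerformIO (newIORef [])
{-# NOINLINE globalRef #-}
```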
Let’s say we give globalRef a more general type:
globalRef :: IORef [a]
This means that we would be allowed to write and read whatever we want from this reference.
That’s bad!
We could do something like writeIORef globalRef [1,2,3], and then readIORef globalRef :: IO [String].
Boom, your program explodes.
Unless you want a dynamically typed reference for some reason (and even then, you’d better use Dynamic).
If you omit the NOINLINE pragma, then you’ll just get a fresh reference each time you use it.
GHC will see that any reference to globalRef can be inlined to unsafePerformIO (newIORef []), and it’ll happily perform that optimization.
But that means you won’t be sharing state through the reference.
This is a bad idea, don’t use it.
I hesitate to even explain it.
Testing the Trick
But, well, sometimes you try things out to see if they work.
In this case, they don’t, so it’s useful to document that.
We’re going to write a function trackStrings that remembers the strings passed in previous calls, and defines a value that returns those.
{-# language QuasiQuotes #-}
{-# language TemplateHaskell #-}

module Lib where

import Data.IORef
import System.IO.Unsafe
import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Language.Haskell.TH.Syntax

globalRef :: IORef [String]
globalRef = unsafePerformIO $ newIORef []
{-# NOINLINE globalRef #-}

trackStrings :: String -> Q [Dec]
trackStrings input = do
  strs <- runIO $ readIORef globalRef
  _ <- runIO $ atomicModifyIORef globalRef (\i -> (input : i, ()))
  ty <- [t| [String] |]
  pure
    [ SigD (mkName input) ty
    , ValD (VarP (mkName input)) (NormalB (ListE $ map (LitE . stringL) strs)) []
    ]
This works in a single module just fine.
{-# language TemplateHaskell #-}

module Test where

import Lib

trackStrings "what"
trackStrings "good"
trackStrings "nothx"

test :: IO ()
test = do
  print what
  print good
  print nothx
If we evaluate test, we get the following output:
[]
["what"]
["good","what"]
This is exactly what we want.
Unfortunately, this is only module-local state.
Given this Main module, we get some disappointing output:
{-# language TemplateHaskell #-}

module Main where

import Lib
import Test

trackStrings "hello"
trackStrings "world"
trackStrings "goodbye"

main :: IO ()
main = do
  test
  print hello
  print world
  print goodbye
Fortunately, we don’t even need to do anything awful like this.
The Q monad offers two methods, getQ and putQ, that allow module-local state.
-- | Get state from the Q monad. Note that the state is
-- local to the Haskell module in which the Template
-- Haskell expression is executed.
getQ :: Typeable a => Q (Maybe a)

-- | Replace the state in the Q monad. Note that the
-- state is local to the Haskell module in which the
-- Template Haskell expression is executed.
putQ :: Typeable a => a -> Q ()
These use a Typeable dictionary, so you can store many kinds of state: one for each type!
This is a neat way to avoid the “polymorphic reference” problem I described above.
How to actually solve the problem?
If y’all dare me enough I might write a followup where I investigate using a compact region to persist state across modules, but I’m terrified of the potential complexity at play there.
I imagine it’d work fine for a single-threaded compile, but there’d probably be contention on the file with parallel builds.
Hey, maybe I just need to spin up a redis server to manage the file locks…
Perhaps I can install nix at compile-time and call out to a nix-shell that installs Redis and runs the server.
sequence a gives me something like I don't know what: distributing the list monad, not the IO monad. Implement:
sequence' :: [(a, IO [b])] -> IO [(a, b)]
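One possible answer (my own take, not necessarily the intended one): run each inner action, then pair its tag with every element of the list it produces.

```haskell
-- Distribute IO out of the pairs, duplicating each tag across
-- the list its action produces.
sequence' :: [(a, IO [b])] -> IO [(a, b)]
sequence' xs = concat <$> traverse (\(a, io) -> map ((,) a) <$> io) xs
```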
2021-04-16, Friday:
You have a monad, or applicative, and you wish to execute the action of the latter but return the result of the former. The simplest representation for me is:
pass :: IO a -> b -> IO b
so:
return 5 >>= pass (putStrLn "Hi, there!")
would return IO 5
GO!
Oisín Kidney @oisdk: flip (<$)
@Iceland_jack: ($>)
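Both answers are the same function spelled two ways; written out:

```haskell
import Data.Functor (($>))

-- flip (<$) :: IO a -> b -> IO b  runs the action and keeps b.
pass :: IO a -> b -> IO b
pass = flip (<$)

-- The ($>) spelling is equivalent.
pass' :: IO a -> b -> IO b
pass' io b = io $> b
```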
2021-04-12, Monday:
A function that takes the result of another function then uses that result and the original pair of arguments to compute the result:
f :: a -> a -> b
g :: b -> a -> a -> c
so:
(\x y -> g (f x y) x y)
curry away the x and y arguments.
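Here’s one point-free spelling using the function (reader) Monad and Applicative instances; the name combine is mine, and other spellings certainly exist.

```haskell
import Control.Monad (join)

-- combine f g x y = g (f x y) x y, with x and y curried away.
-- join duplicates the x argument; (<*> id) duplicates the y argument.
combine :: (a -> a -> b) -> (b -> a -> a -> c) -> a -> a -> c
combine f g = (<*> id) . join (flip . (g .) . f)
```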
2021-04-07, Wednesday: you have (Maybe a, Maybe b); you want Maybe (a, b).
If either (Maybe a) or (Maybe b) is Nothing then the answer is Nothing.
If both (Maybe a) and (Maybe b) are (Just ...) then the answer is Just (a, b)
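That’s exactly the Applicative pattern for Maybe; a one-liner (the name pairMaybe is mine):

```haskell
-- Nothing in either slot collapses the whole pair to Nothing;
-- two Justs combine into Just (a, b).
pairMaybe :: (Maybe a, Maybe b) -> Maybe (a, b)
pairMaybe (ma, mb) = (,) <$> ma <*> mb
```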
Recently I've been complaining about unforced translation
errors.
([1]
[2])
Here's one I saw today:
The translation was given as:
“honk honk, your Uber has arrived”
“Oleg, what the fuck”
Now, the Russian text clearly says “beep-beep” (“бип-бип”), not “honk honk”. I could understand translating this as “honk honk” if “beep beep” were not a standard car sound in English. But English-speaking cars do say “beep beep”, so why change the original?
(Also, a much smaller point: I have no objection to translating “Что
за херня” as “what the fuck”. But why translate “Что за херня, Олег?”
as “Oleg, what the fuck” instead of “What the fuck, Oleg”?)
[ Addendum 20210420: Katara suggested that perhaps the original translator
was simply unaware that Anglophone cars also “beep beep”. ]
Functional programming and blockchains are a match made in heaven! The immutable and reproducible nature of distributed ledgers is mirrored in the semantic foundation of functional programming. Moreover, the concurrent and distributed operation calls for a programming model that carefully controls shared mutable state and side effects. Finally, the high financial stakes often associated with blockchains suggest the need for high assurance software and formal methods.
Nevertheless, most existing blockchains favour an object-oriented, imperative approach both in their implementation and in the contract programming layer that provides user-defined custom functionality on top of the basic ledger. On the one hand, this might appear surprising, given that it is widely understood that this style of programming is particularly risky in concurrent and distributed systems. On the other hand, blockchains are still in their infancy and little research has been conducted into associated programming language technology.
In this talk, I explain the connection between blockchains and functional programming as well as highlight several areas where functional programming, type systems, and formal methods have the potential to advance the state of the art. Overall, I argue that blockchains are not just a well-suited application area for functional programming techniques, but that they also provide fertile ground for future research. I illustrate this with evidence from the research-driven development of the Cardano blockchain and its contract programming platform, Plutus. Cardano and Plutus are implemented in Haskell and Rust, and the development process includes semi-formal specifications together with the use of Agda, Coq, and Isabelle to formalise key components.
20210409: We get all officious, ... sry: 'OFFICIAL' ... with today's #haskell problem. Today's #haskell solution shows us that you haven't met real evil until your TPS-report cover letter is ... Lorem Ipsum.
20210408, Thursday: In today's #haskell problem we optionally parse a string into some value of some type. Some how. Today's #haskell solution shows us the sum of the numbers in the string was 42 (... no, it wasn't).
20210405, Monday: Today's #haskell exercise is to make a vector that is Foldable and Traversable. Today's #haskell solution gives us Vectors as Foldable and Traversable. YUS!
20210401, Holy Thursday: In today's #haskell exercise we learn the WHOPR beat the Big Mac. We also learn how to make a safe call to an HTTPS endpoint, but that's not as important as the WHOPR beating the Big Mac. "Would you like to play a game?" Today's #haskell solution says: "Yes, (safely, please)."
Our previous posts on computational pipelines, such as those introducing Funflow and Porcupine, show that Arrows are very useful for data science workflows.
They allow the construction of effectful and composable pipelines whose structure is known at compile time, which is not possible when using Monads.
However, Arrows may seem awkward to work with at first. For instance,
it’s not obvious how to use lenses to access record fields in Arrows.
My goal in this post is to show how lenses and other optics can be used in Arrow-based workflows.
Doing so is greatly simplified thanks to Profunctor
optics and some utilities that I helped add to the latest version of the lens library.
Optics on functions
We’re used to think of lenses in terms of getters and setters, but I’m
more interested today in the functions over and traverseOf.
-- We will use this prefix for the remainder of the post.
-- VL stands for Van Laarhoven lenses.
import qualified Control.Lens as VL

-- Transform a pure function.
over :: VL.Lens s t a b -> (a -> b) -> (s -> t)

-- Transform an effectful function.
traverseOf :: VL.Traversal s t a b -> (a -> m b) -> (s -> m t)
We would like to use similar functions on Arrowbased workflows,
something like
overArrow :: VL.Lens s t a b -> Task a b -> Task s t
However, the type of lenses:
type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t
doesn’t make it very obvious how to define overArrow.
On the other hand, Arrows come equipped with the functions first and second:

first :: Arrow a => a b c -> a (b, d) (c, d)
second :: Arrow a => a b c -> a (d, b) (d, c)

which very much feel like specialised versions of overArrow for the lenses

_1 :: VL.Lens (b, d) (c, d) b c
_2 :: VL.Lens (d, b) (d, c) b c
so maybe there is a common framework that can take both of these into account?
The answer is yes, and the solution is lenses — but lenses of a different type.
Profunctor optics
There is an alternative and equivalent formulation of optics, called Profunctor optics, that works very well with Arrows.
Optics in the Profunctor framework have the following shape:
type Optic p s t a b = p a b -> p s t
with more precise optics such as Lens being obtained by imposing constraints on p coming from the different Profunctor classes.
In other words, an Optic is precisely a higher-order function acting on some profunctor.
Because every Arrow is also a Profunctor¹, the shape
of an Optic is precisely what is needed to act on Arrows! Moreover, like the optics of the lens library, profunctor optics can be composed like regular functions, with (.).
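For intuition, here is a minimal sketch (the names onFirst, onSecond, and deepen are mine) specialising p to plain functions, where an Optic is literally a function transformer and optics compose with ordinary (.):

```haskell
-- _1 specialised to plain functions: apply f to the first component.
onFirst :: (a -> b) -> ((a, c) -> (b, c))
onFirst f (a, c) = (f a, c)

-- _2 specialised to plain functions: apply f to the second component.
onSecond :: (a -> b) -> ((c, a) -> (c, b))
onSecond f (c, a) = (c, f a)

-- Optics compose with plain function composition:
-- focus the second component of the first component.
deepen :: (a -> b) -> (((c, a), d) -> ((c, b), d))
deepen = onFirst . onSecond
```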
The lens library now includes a module containing functions that convert between standard and profunctor optics, which makes using them very convenient.
In the following sections, we will go through the use and the intuition of the most common optics: Lens, Prism and Traversal.
But first, let’s import the compatibility module for profunctor optics:
-- PL for Profunctor Lenses
import Control.Lens.Profunctor as PL
Lenses
Standard lenses are all about products — view, for example, is used to deconstruct records:
view _1 :: (a, b) -> a
Therefore, it makes sense for Profunctor lenses to also talk about products.
Indeed, that is exactly what happens, through the Strong type class:

class Profunctor p => Strong p where
  first' :: p a b -> p (a, c) (b, c)
  second' :: p a b -> p (c, a) (c, b)

That is, a Strong profunctor is equipped with lenses to reach inside products.
One can always convert a record into nested pairs and act on them using Strong — the Lens just makes this much more convenient.
But how do we build a Lens?
Besides writing them manually, we can also use all Lenses from lens:
PL.fromLens :: VL.Lens s t a b -> Lens s t a b
which means we can still use all the lenses we know and love.
For example, one can apply a task to a tuple of arbitrary size:
PL.fromLens _1 :: Task a b -> Task (a, x, y) (b, x, y)
Summarizing, a Strong profunctor is one we can apply lenses to.
Since every Arrow is also a Strong profunctor, one can use Lenses with them.
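As a concrete sketch using only Control.Arrow from base (task and runFirst are made-up names), first plays the role of the lens _1 on a Kleisli arrow, running an effectful task on one component of a pair:

```haskell
import Control.Arrow (Kleisli (..), first)

-- A made-up effectful task: increment, with IO available.
task :: Kleisli IO Int Int
task = Kleisli (\n -> pure (n + 1))

-- first task :: Kleisli IO (Int, c) (Int, c)
-- i.e. the task acts on the first component, leaving the rest alone.
runFirst :: (Int, String) -> IO (Int, String)
runFirst = runKleisli (first task)
```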
Prisms
Standard prisms are all about sums — preview, for example, is used to deconstruct sum types:

preview _Left :: Either a b -> Maybe a
Therefore, it makes sense for Profunctor prisms to also talk about sums.
Indeed, that is exactly what happens, through the Choice type class:

class Profunctor p => Choice p where
  left' :: p a b -> p (Either a c) (Either b c)
  right' :: p a b -> p (Either c a) (Either c b)

That is, a Choice profunctor is equipped with prisms to discriminate sums.
One can always convert a sum into nested Eithers and act on them using Choice — the Prism just makes this much more convenient.
But how do we build a Prism?
We can also use any prisms from lens with a simple conversion:
PL.fromPrism :: VL.Prism s t a b -> Prism s t a b
For example, one can execute a task conditionally, depending on the existence of the input:

PL.fromPrism _Just :: Task a b -> Task (Maybe a) (Maybe b)
Summarizing, a Choice profunctor is one we can apply prisms to.
Since every ArrowChoice can be a Choice profunctor, one can use prisms with them.
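A similar sketch for prisms (again, task and runLeft are made-up names): left from the ArrowChoice class plays the role of the prism _Left, running the task only on Left inputs and passing Right values through untouched:

```haskell
import Control.Arrow (Kleisli (..), left)

-- A made-up effectful task: double, with IO available.
task :: Kleisli IO Int Int
task = Kleisli (\n -> pure (n * 2))

-- left task :: Kleisli IO (Either Int c) (Either Int c)
-- The task runs only when the input matches the Left "prism".
runLeft :: Either Int String -> IO (Either Int String)
runLeft = runKleisli (left task)
```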
Traversals
Standard traversals are all about Traversable structures — mapMOf, for example, is used to execute effectful functions:
mapMOf traverse readFile :: [FilePath] -> IO [String]
Therefore, it makes sense for Profunctor traversals to also talk about these traversable structures.
Indeed, that is exactly what happens, through the Traversing type class:

class (Choice p, Strong p) => Traversing p where
  traverse' :: Traversable f => p a b -> p (f a) (f b)
With profunctor optics, a Traversal is defined as follows:
type Traversal s t a b = forall p. Traversing p => p a b -> p s t
There is no associated Arrow class that corresponds to this class, but many Arrows, such as Kleisli, satisfy it.
We can rewrite the type of this function as:

traverse' :: Traversable f => Traversal (f a) (f b) a b
That is, a Traversing profunctor can be lifted through Traversable
functors.
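For Kleisli arrows this lifting is just traverse under the hood; a hand-rolled sketch (traverseK is my name, not part of any library):

```haskell
import Control.Arrow (Kleisli (..))

-- Lift a Kleisli arrow through any Traversable functor,
-- mirroring traverse' from the Traversing class.
traverseK :: (Traversable f, Applicative m)
          => Kleisli m a b -> Kleisli m (f a) (f b)
traverseK (Kleisli f) = Kleisli (traverse f)
```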
But how do we build a Traversal?
We can also use any Traversal from lens with a simple conversion:
PL.fromTraversal :: VL.Traversal s t a b -> Traversal s t a b
For example, one can have a task and apply it to a list of inputs:
PL.fromTraversal traverse :: Action a b -> Action [a] [b]
Conclusion
Using Arrows does not stop us from taking advantage of the Haskell ecosystem.
In particular, optics interact very naturally with Arrows, both in their classical and profunctor formulations.
For the moment, the ecosystem is still lacking a standard library for Profunctor optics, but this is not a showstopper — the lens library itself has most of the tools we need.
So the next time you are trying out Funflow or Porcupine, don’t shy away from using lens!
¹ The fact that these hierarchies are separated is due to historical reasons.
A couple of days ago
I discussed the epithet “soup-guzzling pie-muncher”,
which in the original Medieval Italian was brodaiuolo manicator di
torte. I had complained that where most translations rendered the
delightful word brodaiuolo as something like “soup-guzzler” or
“broth-swiller”, Richard Aldington used the much less vivid “glutton”.
A form of the word brodaiuolo appears in one other place in the Decameron, in
the sixth story on the first day,
also told by Emilia, who as you remember has nothing good to say about
the clergy:
… lo 'nquisitore sentendo trafiggere la lor brodaiuola ipocrisia
tutto si turbò…
J. M. Rigg (1903), who had elsewhere translated brodaiuolo as
“broth-guzzling”, this time went with “gluttony”:
…the inquisitor, feeling that their gluttony and hypocrisy had
received a home-thrust…
G. H. McWilliam (1972) does at least imply the broth:
…the inquisitor himself, on hearing their guzzling hypocrisy exposed…
John Payne (1886):
the latter, feeling the hit at the broth-swilling hypocrisy of
himself and his brethren…
Cormac Ó Cuilleanáin's revision of Payne (2004):
…the inquisitor himself, feeling that the broth-swilling hypocrisy
of himself and his brethren had been punctured…
And what about Aldington (1930), who dropped the ball the other time and
rendered brodaiuolo merely as “glutton”? Here he says:
… he felt it was a stab at their thick-soup hypocrisy…
This is the fifth edition of our GHC activities report, which is intended to provide regular updates on the work on GHC and related projects that we are doing at Well-Typed. This edition covers roughly the months of February and March 2021.
A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and Facebook, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.
If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.
Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!
Release management
Ben Gamari finalized releases of GHC 8.10.4 and 9.0.1, and started work on 9.0.2. Additionally, he worked to finish and merge the outstanding patches pending for GHC 9.2, culminating in an alpha release candidate (9.2.1-alpha1).
Ben also prepared a blog post updating the community on the state of GHC on Apple M1 hardware.
Frontend
Doug Wilson proposed adding interprocess semaphore support to allow multiple concurrent compiler processes to make better use of available parallelism (!5176).
Matthew Pickering performed some simulations and experiments on the proposed changes to increase the parallelism in make mode (#14095) and on the effect of Doug’s jsem flag (!5176). The results show both changes have potential to improve compile times when many cores are available.
Matt opened a GHC proposal to modify the import syntax to distinguish between modules which are used at compile time and at runtime. This will mean that enabling TemplateHaskell in your project will cause less recompilation when using ghci and haskell-language-server.
Ben fixed a long-standing bug rendering the GHCi linker unable to locate dynamic libraries when not explicitly added to LD_LIBRARY_PATH (#19350).
Profiling and Debugging
Matt landed the last of the ghc-debug patches into GHC, so it will be ready for use when GHC 9.2 is released.
Matt has finished off a patch which Ben and David Eichmann started which fixes races in the eventlog implementation. This makes it possible to reliably restart the eventlog and opens up the possibility of remote monitoring tooling which connects to the eventlog over a socket. Matt has made a start on implementing libraries which allow monitoring in this way.
Matt enhanced the restart support for the eventlog by ensuring certain initialisation events are reposted every time the eventlog is restarted (!5186).
Matt added new events to the eventlog which track the current number of allocated blocks and some statistics about memory fragmentation (!5126).
To aid in recent compiler-performance investigations, Ben has been working on merging corediff, a utility for performing structural comparisons on Core, into his ghc-dump tool.
RTS
Matt investigated some discrepancies between OS reported memory usage and live bytes used by a program. This resulted in a new RTS flag (-Fd) in !5036 and a fix which reduces fragmentation due to nursery blocks (!5175). This work is described in more detail in a recent blog post.
Compiler Performance
Adam Gundry has been investigating the extremely poor performance of type families that do significant computation (#8095), building on previous work by Ben and Simon Peyton Jones on “coercion zapping”. He has an experimental patch that can significantly improve performance in some cases (!5286), although more work is needed.
Andreas Klebinger refactored the way the simplifier creates uniques (!4804). This reduces compiler allocations by between 0.1% and 0.5% when compiling with -O.
Andreas also stomped out some sources of thunks in the simplifier (!4808). The impact varies by program but allocations for GHC were reduced by ~1% in common cases and by up to 4% for particular tests.
Andreas also finished a similar patch for the code generator (!5236). This mostly benefits users frequently compiling with -O0, where we save around 0.2% of allocations.
Andreas and Ben also applied the one-shot trick (#18202) to a few more monads inside of GHC. In aggregate this should reduce GHC’s allocations by another half percent in the common case.
Matt started working on compile-time performance and fixed some long-standing leaks in the demand analyser. He used ghc-debug and -hi profiling in order to quickly find where the leaks were coming from.
Matt and Ben started work on a public Grafana dashboard to display the longterm trends in compiler benchmarks in an easy to understand way, taking advantage of the performance measurement infrastructure developed over the last 12 months.
Ben is investigating a significant regression in compiler performance of the aeson library (#19478) in GHC 9.0.
Ben reworked the derivation logic for Enum instances, significantly reducing the quantity of code required for such instances.
Runtime performance
Andreas has fixed a few more obstacles which will allow GHC to turn on -fstrict-dicts by default (!2575) when compiling with -O2.
Andreas investigated a runtime regression with GHC 9.0.1 (#19474) which ultimately was caused by vector regressing under certain circumstances. This was promptly fixed by Ben upstream.
After a review by Simon Peyton Jones, Andreas is evaluating a possible simplification in the implementation of his tag inference analysis.
Ben investigated and fixed a runtime regression in bytestring caused by GHC 9.0’s lessaggressive simplification of unsafeCoerce (#19539).
Compiler correctness
Andreas investigated and fixed (!4926) an issue where the behaviour of unsafeDupablePerformIO had changed in GHC 9.0 as a result of more aggressive optimisation of runRW#. In particular when a user wrote code like:
unsafePerformIO $ do
  let x = f x
  writeIORef ref x
  return x
Before the fix it was possible for x to be evaluated before the write to the IORef occurred.
Ben fixed a bug resulting in the linker failing to load libraries with long paths on Windows (#19541).
Ben fixed a rather serious garbagecollector bug affecting code unloading in GHC 8.10.4 and 9.0.1 (#19417).
Compiler functionality and language extensions
Adam assisted with landing the NoFieldSelectors extension (!4743) and the record dot syntax extensions (!4532). Both of these features will be in GHC 9.2, although record dot syntax will be properly supported only for selection, not for record update. Adam is working on a GHC proposal to address outstanding design questions regarding record update.
Ben finished and merged the BoxedRep implementation started by Andrew Martin, allowing true levity polymorphism in GHC 9.2 (#17526).
Ben has been working with a contributor to finish stack snapshotting functionality, which will allow backtraces on all platforms supported by GHC.
A few weeks ago I noticed that I could no longer use
haskell-hoogle-lookup-from-website in Emacs. After a bit of experimentation I
found that the reason was that I couldn't use xdg-open in a Nix shell.
Yesterday I finally got around to looking into it further.
It's caused by direnv overwriting XDG_DATA_DIRS rather than appending to it.
Of course someone had already reported a bug.
A few months ago I was pondering what it might be like to be Donald
Trump. Pretty fucking terrible, I imagine. What's it like, I
wondered, to wake up every morning and know that every person in your
life is only interested in what they can get from you, that your kids
are eagerly waiting for you to die and get out of their way, and that
there is nobody in the world who loves you? How do you get out of bed
and face that bitter world? I don't know if I could do it. It
doesn't get him off the hook for his terrible behavior, of course, but
I do feel real pity for the man.
It got me to thinking about another pitiable rich guy, Ebenezer
Scrooge. Scrooge in the end is redeemed when he is brought face to
face with the fact that his situation is similar to Trump's. Who
cares that Scrooge has died? Certainly not his former business
associates, who discuss whether they will attend his funeral:
“It's likely to be a very cheap funeral,” said the same speaker;
“for, upon my life, I don't know of anybody to go to it. Suppose we
make up a party, and volunteer.”
“I don't mind going if a lunch is
provided,” observed the gentleman with the excrescence on his nose.
Later, the Spirit shows Scrooge the people who are selling the
curtains stolen from his bed and the shirt stolen from his corpse, and
Scrooge begs:
“If there is any person in the town who feels emotion caused by this
man's death,” said Scrooge, quite agonized, “show that person to me,
Spirit, I beseech you!”
The Spirit complies, by finding a couple who had owed Scrooge money,
and who will now, because he has died, have time to pay.
I can easily replace Scrooge with Trump in any of these scenes, right
up to the end of chapter 4. But Scrooge in the end is redeemed. He
did once love a woman, although she left him. Scrooge did have
friends, long ago. He did have a sister who loved him, and though she
is gone her son Fred still wants to welcome him back into the family.
Did Donald Trump ever have any of those things?
This blog post is a summary of the meeting minutes for the Haskell Foundation board meeting that took place on April 8, 2021. This is the first time I'm writing these up, and potentially the only time I'm putting them on this blog. So this post is going to be a bit weird; we'll start with some questions.
Questions
Why are you writing these meeting minutes? As you'll see below, one of the decisions at the meeting was selection of additional officers for the board. I was selected as Secretary, which seems to put meeting minutes into my camp.
OK, but why are you publishing them on your personal blog? As you'll also see below, the new Haskell Foundation website is nearing completion, and in the future I hope these posts go there. But I wanted to kick things off with what I hope is close to the methodology going forward.
Isn't a blog post for each meeting excessive? Yes. As I mentioned in my transparency blog post, transparency includes a balance between too much and too little information. I intend to leverage announcement blog posts for things that deserve an announcement.
Where are the actual meeting minutes? They're on Google Drive. I have a bit more information on the Google Drive setup below.
With those out of the way, let's get into the noteworthy information from this meeting.
Opening up Slack
A lot of the work in the board so far has been in the "ways of working" direction. Basically: how does the foundation operate, what are the responsibilities of the board, what do officers do, how elections work, etc. Included in that, and in my opinion of significant interest to the community, is how we communicate. All of this information, and the decision making process around it, can be followed in the hf/meta repo on gitlab.haskell.org. I'm not going to try and summarize everything in those documents. Instead, the announcement here is around merge request !12. Specifically:
HF is standardizing on Slack as its avenue of text chatting.
We're going to start opening this up to everyone in the community interested in participating in discussions.
On a personal note, I'm very excited about this. I think a central place for ecosystem discussions is vital.
There's some hesitation/concern about moderation, off-topic discussions, and other points, which we'll need to work out over time.
I'd encourage anyone interested in joining in the conversation and staying up to date with topics to request an invite. We're already starting separate topicspecific channels to iterate on various technical topics.
Note that Slack is not a replacement for Discourse or other existing platforms. We'll still use Discourse for more official discussions and announcements. (This blog post is an example, the official discussion for it lives on Discourse.) Like many things, we'll likely be figuring out the details over time.
Board officers
There are a total of six board officers, who have all now been selected:
Chair: Richard Eisenberg
Vice Chair: Tom Ellis
Treasurer: Ryan Trinkle
Vice Treasurer: José Pedro Magalhães
Secretary: Michael Snoyman
Vice Secretary: Théophile Hécate Choutri
New Haskell Foundation website
There is a new Haskell Foundation website in the works, which should be ready to go live in the next week. It is fully open source and viewable now:
I'm hoping that, in the future, I'll be putting posts like this one on that site instead!
What's an announcement? Where do minutes go? Where's transparency?!?
I'm going to try and reserve these kinds of announcement posts for topics that I think will have widespread interest. I may make mistakes in that judgement; I apologize in advance. My goal is that anyone who wants to stay up to speed on large decisions coming from the board will be able to without using up a lot of their bandwidth.
That said, every board meeting (and, for that matter, most or all working group meetings) take and keep meeting notes. We've been sharing these on Discourse, but in the future may simply publish them in the hf/minutes repo.
Until now, we've been creating a new Discourse thread and posting the meeting minutes for each meeting. I proposed reducing how often we do that, and instead leave the minutes in Google Drive for those interested. My hope is that a combination of "information is all available in Drive" with "important things get an announcement" will cover most people. But if people would like to see a new Discourse thread for each meeting, I'd like to hear about it. Please comment on this (and other topics) in the Discourse thread linked.
Get involved!
We've been in planning and discussion mode for the past few months on the board. The various working groups are beginning to get some clarity around how they want to function, and are ready for more people to get involved. There's technical work to do, documentation, outreach, and much more. If you're excited to be a part of the efforts of the Haskell Foundation to improve the overall state of Haskell, now's a great time to get in and influence the direction. I strongly encourage everyone to check out the Slack, ask questions, interact on Discourse, and overall dive in!
In the last blog post we covered
extensible records and how we infer their types in Giml.
In this blog post we'll take a closer look at another interesting feature
and the dual of extensible records: polymorphic variants.
There are several approaches to typing polymorphic variants, each with different trade-offs.
Giml's system is very similar to OCaml's polymorphic variants system
and also shares its limitations.
If you are looking for different approaches you might want to look at
this/this or
this.
I'm also not sure polymorphic variants are the best we can do for extensible sum types,
as recursive polymorphic variants are quite problematic. But there are still plenty
of use cases for them, so they are worth including in a language.
[Word of warning: this is a bit more complicated than previous posts, and it might be best to
look at this section as a cookbook. If you see something weird or not explained well,
it might be a testament to the fact that I'm a little fuzzy on the details myself.]
Polymorphic Variants
Polymorphic variants can be seen as the dual of extensible records.
Records are product types, they let us hold multiple values together in one structure at the same time,
and by giving a label for each value we can refer to each part.
Variants are sum types, they let us define multiple values as alternatives to one another,
and we label each alternative as well.
There are two operations we can do with variants, one is creating a tagged value,
for example #Ok 1, the other is pattern matching on variants. For example:
withDefault def x =
case x of
 #Ok v > v
 #Err _ > def
end
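For comparison, a closed Haskell analogue of this Giml function can be written with an ordinary Either, where Right plays the role of #Ok and Left plays the role of #Err:

```haskell
-- Closed-variant analogue: Either e a stands in for [ Err : e, Ok : a ].
withDefault :: a -> Either e a -> a
withDefault def x = case x of
  Right v -> v
  Left _  -> def
```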
The types for polymorphic variants are a bit more complicated.
There are actually three different kinds of polymorphic variants:
closed, lowerbounded and upperbounded.
In a closed variant we list the different labels and the type of their payload,
for example: [ Err : String, Ok : Int ] is a type that has two possible alternatives,
#Err <some-string> and #Ok <some-integer>.
This type is something that can be generated by applying a polymorphic variant
to a function that takes polymorphic variants as input.
An upper-bounded polymorphic variant is open: it describes the maximum set of variants that could be
represented by this type, and is represented syntactically with a little < right after
the opening bracket (for example: [< Err : String, Ok : Int ]).
Why does this exist? This type can be used when we want to use a variant in a pattern matching
expression, and we have a set of variants that we can handle.
The pattern matching example above is one like that. In that pattern matching expression
we could match any subset of [ Err : a, Ok : b ]. So we use an upperbounded polymorphic
variant to express that in types.
A lower-bounded polymorphic variant is open as well: it describes the minimum set of variants that could be
represented by this type, and is represented with a little > right after the opening bracket
(for example: [> Err : String, Ok : Int ]).
It can unify with any superset of the specified alternatives.
Why does this exist? There are a couple of cases:
One: this type is used when we want to pass a variant to a function that does
pattern matching on that variant and has the ability to match an unspecified number of
variants, for example with a wildcard (_) or a variable pattern. For example:
withDefault def x =
case x of
 #Ok 0 > def
 #Ok v > v
 _ > def
end
In this case, even if we passed #Foo 182, this pattern matching should work;
it will just match the wildcard pattern and return def.
But note that expressions cannot have multiple types (at least not in Giml),
therefore we must specify in our case that the #Ok must have a specific type (Int)
and we cannot pass #Ok "hello" instead. In the pattern matching above we could match
on [ Ok : Int ] or any other type, and we write this like this: [> Ok : Int ].
Another case is the type of variant "literals" themselves! This could be considered
the dual of record selection: when we specify a tagged value (say, #Ok 1),
what we really want for it is to fit anywhere that expect at least [ Ok : Int ].
Implementation
Type definition
So we add 3 new constructors to represent the three variant types:
TypeVariant [(Label, Type)] for closed variant
TypePolyVariantLB [(Constr, Type)] TypeVar for lowerbounded polymorphic variant
TypePolyVariantUB TypeVar [(Constr, Type)] for upperbounded polymorphic variant
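As a hedged sketch (Giml's actual definitions surely differ in detail; the surrounding Type constructors here are illustrative assumptions), the new constructors might look like:

```haskell
type Constr  = String  -- variant label, e.g. "Ok"
type TypeVar = String  -- type variable name

-- Assumed minimal Type definition, for illustration only.
data Type
  = TypeVarT TypeVar
  | TypeCon String
  | TypeVariant [(Constr, Type)]                -- closed variant
  | TypePolyVariantLB [(Constr, Type)] TypeVar  -- lower-bounded (open)
  | TypePolyVariantUB TypeVar [(Constr, Type)]  -- upper-bounded (open)
  deriving (Show, Eq)
```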
One thing that wasn't visible before when we talked about the syntactic
representation of variants is the row type variable.
Here, just like with records, we use the row type variable to encode hidden knowledge about our types.
For lower-bounded polymorphic variants (from now on, LB), we use the row type variable to represent
other variants (similar to how with records the row type variable was used to represent other fields).
For upper-bounded polymorphic variants (from now on, UB) the row type variable has a different role.
We use the row type variable to keep track of which UBs we want to merge together (those with the same row type variable),
how to convert a UB to an LB (constrain the UB row type variable with the merge of the UB and LB fields plus the LB row type variable),
and when to treat two UBs as normal variants (those with a different row type variable).
We'll explore each scenario soon.
Elaboration and constraint generation
Creating variants
When we create a variant, we:
Elaborate the type of the payload
Generate a type variable
Annotate the type of the AST node as TypePolyVariantLB [(<label>, <payload-type>)] <type-var>
We will see how this unify with other variants.
Pattern matching
For matching an expression with one or more patterns of polymorphic variants,
we elaborate the expression and constrain the type we got with a newly generated
type variable which is going to be our hidden row type variable and representative
of the type of the expression (and let's call it tv for now).
For each variant pattern matching we run into, we add the constraint:
Equality (TypeVar tv) (TypePolyVariantUB tv [(<label>, <payload-type>)])
By making all variant patterns equal to tv, we will eventually need
to unify all the types that arise from the patterns, and by keeping the
hidden row type variable tv the same for all of them, we annotate that
these TypePolyVariantUBs should unify by merging the unique fields
on each side and unifying the shared fields.
Another special case we have is a regular type variable pattern (capture pattern) and wildcard.
In our system, a capture pattern should capture any type, including any variant.
This case produces the opposite constraint:
Equality (TypeVar tv) (TypePolyVariantLB [] tv2)
Note that this special type, TypePolyVariantLB [] tv2, unifies with non-variant types
as if it were a type variable, but with variant types it behaves like any other TypePolyVariantLB.
Constraint solving
We have 3 new types, TypeVariant, TypePolyVariantLB and TypePolyVariantUB.
And in total we have 7 unique new cases to handle:
Equality TypeVariant TypeVariant
Equality TypeVariant TypePolyVariantLB
Equality TypeVariant TypePolyVariantUB
Equality TypePolyVariantLB TypePolyVariantLB
Equality TypePolyVariantLB TypePolyVariantUB
Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match)
Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match)
The cases are symmetrical, so we'll look at one side.
Equality TypeVariant TypeVariant
For two TypeVariants, as with two TypeRecs, we unify all the fields.
If any field appears on only one side, we fail.
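A hedged Haskell sketch of this case (not Giml's actual code; the names are mine): pair up the payload types by label, failing when a label appears on only one side:

```haskell
import qualified Data.Map as Map

-- Match two closed variant rows by label. If the label sets differ,
-- unification fails; otherwise return the payload-type pairs that
-- still need to be unified recursively.
matchClosed :: Ord k => Map.Map k t -> Map.Map k t -> Either String [(t, t)]
matchClosed l r
  | Map.keysSet l /= Map.keysSet r = Left "variant labels do not match"
  | otherwise = Right (Map.elems (Map.intersectionWith (,) l r))
```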
Equality TypeVariant TypePolyVariantLB
This case is similar to the Equality TypeRec TypeRecExt case:
all fields on the LB side must appear and unify on the TypeVariant side,
and the fields that are unique to the TypeVariant should unify with the
row type variable of the LB.
We don't want the LB to have any extra fields that are not found on the regular variant side!
Equality TypeVariant TypePolyVariantUB
When an upper-bounded polymorphic variant meets a regular variant,
we treat the UB as a regular variant as well. This can cause seemingly well-typed programs
to be rejected by the type checker
(in a manner that is similar to not having let-polymorphism),
but we do not represent a subset/subtype constraint in Giml, only equality, and the two types just aren't equal.
Equality TypePolyVariantLB TypePolyVariantLB
Two lower-bounded polymorphic variants should unify all of their shared fields,
and each row type variable should unify with a new LB that contains the fields unique to the other side (and a fresh row type variable).
Equality TypePolyVariantLB TypePolyVariantUB
LB means any superset of the mentioned fields, and UB means any subset of the mentioned fields.
Meaning, we don't want to have a field on the LB side that doesn't exist on the UB side.
This is basically the same case as Equality TypeVariant TypePolyVariantLB.
Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match)
When the two row type variables match, this means that the two types represent alternative pattern matches,
so we make sure their fields unify and merge them: we generate a constraint that unifies
the merged UB (with the same row type variable) with the row type variable.
This is so we can establish the most up-to-date type of the row type variable in the substitution.
Equality (TypePolyVariantUB tv <mergedfields>) (TypeVar tv)
This way we add to the list of alternative variants, and keep track of the row type variable
in the substitution and subsequent constraints.
Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match)
We treat these two as unrelated variants that represent the same subset of fields,
so the fields should match just like with regular variants (Equality TypeVariant TypeVariant).
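The merge step used in the UB cases above can be sketched in a couple of lines (a Python sketch; the function name and dict representation are mine):

```python
# Merge the field maps of two upper-bounded polymorphic variants that share a
# row type variable. Shared labels are paired up so the caller can emit
# Equality constraints for their payload types; unique labels are kept as-is.

def merge_variant_fields(left, right):
    pairs = [(left[label], right[label]) for label in left if label in right]
    merged = {**right, **left}  # on shared labels, keep the left payload
    return pairs, merged
```

For example, merging the fields of [> Some : t1 | (tv)] and [> Nil : t2 | Some : t3 | (tv)] pairs up t1 with t3 and yields merged fields for both Some and Nil.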
Instantiation
Instantiation is straightforward, just like all other cases: we instantiate the row type variables here as well.
Substitution
For substitution we need to handle a few cases:
For UB, if the row type variable maps to another type variable, we just switch the type variable. Otherwise we merge the variant fields we have with the type.
For LB with no variant fields (TypePolyVariantLB [] tv) we return the value mapped in the substitution (remember that this case is also used for wildcards and captures in pattern matching and not just variants).
For LB with variant fields, we try to merge the variant fields with the type.
Merging variants works the same as with records, but we don't have to worry about duplicate variants because
the constraint solving algorithm makes sure they have the same type.
Note that this is where a UB can flip and become an LB (if it's merged with an LB).
Example
As always, let's finish off with an example and infer the types of the following program:
withDefault def x =
case x of
  #Some v -> v
  #Nil _ -> def
end
one = withDefault 0 (#Some 1)
Constraints:
[ Equality t12 (t15 -> t16)
, Equality top0_i11 (Int -> t12)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
]
1. Equality t12 (t15 -> t16)
=> t12 := t15 -> t16
Constraints:
[ Equality top0_i11 (Int -> t15 -> t16)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
]
2. Equality top0_i11 (Int -> t15 -> t16)
=> top0_i11 := Int -> t15 -> t16
Constraints:
[ Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
]
3. Equality top1 t16
=> top1 := t16
Constraints:
[ Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
, top1 := t16
]
4. Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
=> [ Equality t14 Int, Equality [> Some : t14 | (t13)] t15 ]
Constraints:
[ Equality t14 Int
, Equality [> Some : t14 | (t13)] t15
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
unchanged
5. Equality t14 Int
=> t14 := Int
Constraints:
[ Equality [> Some : Int | (t13)] t15
, Equality (Int -> [> Some : Int | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
, top1 := t16
, t14 := Int
]
6. Equality [> Some : Int | (t13)] t15
=> t15 := [> Some : Int | (t13)]
Constraints:
[ Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)])
, InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [> Some : Int | (t13)] -> t16
, top0_i11 := Int -> [> Some : Int | (t13)] -> t16
, top1 := t16
, t14 := Int
, t15 := [> Some : Int | (t13)]
]
7. Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)])
=> [ Equality Int Int, Equality [> Some : Int | (t13)] [> Some : Int | (t13)] ]
Constraints:
[ Equality Int Int
, Equality [> Some : Int | (t13)] [> Some : Int | (t13)]
, InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
Substitution:
unchanged
8. The two sides are equal, so we skip this constraint
9. The two sides are equal, so we skip this constraint
10. InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
=> [ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17) ] -- (instantiation)
Constraints:
[ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17)
]
Substitution:
unchanged
11. Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17)
=> [ Equality Int t17, Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)], Equality t16 t17 ]
Constraints:
[ Equality Int t17
, Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)]
, Equality t16 t17
]
Substitution:
unchanged
12. Equality Int t17
=> t17 := Int
Constraints:
[ Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)]
, Equality t16 Int
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [> Some : Int | (t13)] -> t16
, top0_i11 := Int -> [> Some : Int | (t13)] -> t16
, top1 := t16
, t14 := Int
, t15 := [> Some : Int | (t13)]
, t17 := Int
]
13. Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)]
=> Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ] -- LB ~ UB is solved like LB and a normal variant
Constraints:
[ Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ]
, Equality t16 Int
]
Substitution:
unchanged
14. Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ]
=> [ Equality Int Int, Equality [ Nil : t18 ] t13 ]
Constraints:
[ Equality Int Int
, Equality [ Nil : t18 ] t13
, Equality t16 Int
]
Substitution:
unchanged
15. The two sides are equal, skip
16. Equality [ Nil : t18 ] t13
=> t13 := [ Nil : t18 ]
Constraints:
[ Equality t16 Int
]
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [ Nil : t18 | Some : Int ] -> t16 -- substitution of LB with a normal variant
, top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> t16 -- here too
, top1 := t16
, t14 := Int
, t15 := [ Nil : t18 | Some : Int ] -- and here
, t17 := Int
]
17. Equality t16 Int
=> t16 := Int
Constraints: []
Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [ Nil : t18 | Some : Int ] -> Int -- substitution of LB with a normal variant
, top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> Int -- here too
, top1 := Int
, t14 := Int
, t15 := [ Nil : t18 | Some : Int ] -- and here
, t17 := Int
, t16 := Int
]
... and done.
Result
Let's substitute the type variables in our program and hide the row type variables:
withDefault def x =            -- withDefault : t8 -> [< Nil : t10 | Some : t8 ] -> t8
  case x of                    -- def : t8, x : [< Nil : t10 | Some : t8 ]
    #Some v -> v
    #Nil _ -> def
  end                          -- the case expression : t8

one = withDefault 0 (#Some 1)  -- one : Int
  -- withDefault : Int -> [ Nil : t18 | Some : Int ] -> Int
  -- 0 : Int
  -- #Some : Int -> [ Nil : t18 | Some : Int ]
  -- 1 : Int
  -- #Some 1 : [ Nil : t18 | Some : Int ]
Phew, looks correct!
Alternative approach: simulation in userland
Another cool thing I'd like to note is that polymorphic variants could potentially be simulated in userland
if you have extensible records in the language (and this is what purescript-variant does, for example).
One approach I tweeted about, which also helped me
reach the polymorphic variants implementation for Giml, is to:
Make variants functions: functions that take a record mapping labels to functions that take the variant's payload
Make patterns these records from labels to functions on the payload
Make pattern matching just applying a variant to patterns
Though in this approach we can't add a "default handler" label.
Summary
Polymorphic variants are a bit complicated, implementing them is a bit tricky, and they have
several limitations. But I believe that for some use cases
they can really make a difference, and it's worth having them in the language.
If you're interested in seeing a video of me live coding polymorphic variants in Giml, check out Part 17 of my live streaming Giml development series.
The ten storytellers in The Decameron aren't all welldrawn or easy
to tell apart. In the introduction of my favorite edition, the
editor, Cormac Ó Cuilleanáin, says:
Early in the book we are given hints that we are going to get to
know these ten frame characters…. Among the Decameron
storytellers, for instance, Pampinea emerges as being bossy, while
Dioneo has a filthy mind. But little further character development
takes place.
I agree, mostly. I can see Dioneo more clearly than Ó Cuilleanáin
suggests. Dioneo reminds me of Roberto Benigni's filthy-minded
Roman taxi driver in Night on Earth. I
also get a picture of Boccaccio's character Filostrato, a whiny
emo poet boy who complains that the woman he was simping for got tired
of him and dumped him for someone else:
To be humble and obedient to her and to follow all her whims as
closely as I could, was all of no avail to me, and I was soon
abandoned for another. Thus I go from bad to worse, and believe I
shall until I die.… The person who gave me the nickname of
Filostrato [ “victim of love” ] knew what she was doing.
When it's Filostrato's turn to choose the theme for the day's stories,
he makes the others tell stories of illstarred love with unhappy
endings. They comply, but are relieved when it is over. (Dioneo, who
is excused from the required themes, tells instead a farcical story of
a woman who hides her secret lover in a chest after he unwittingly
drinks a powerful sedative.)
Ah, but Emilia. None of the characters in the Decameron is
impressed with the manners or morals of priests. But Emilia
positively despises
them. Her story on the third day
is a good example. The protagonist, Tedaldo, is meeting his longlost
mistress Ermellina; she broke off the affair with him seven years ago
on the advice of a friar who advised that she ought to remain faithful
to her husband. Tedaldo is disguised as a friar himself, and argues
that she should resume the affair. He begins by observing that modern
friars can not always be trusted:
Time was when the friars were most holy and worthy men, but those
who today take the name and claim the reputation of friars have
nothing of the friar but the costume. No, not even that,…
Modern friars, narrates Emilia, "strut about like peacocks" showing
off their fine clothes. She goes on from there, complaining about
friars' vanity, and greed, and lust, and hypocrisy, getting more and
more worked up until you can imagine her frothing at the mouth. This
goes on for about fifteen hundred words before she gets back to
Tedaldo and Ermellina, just at the same time that I get around to what
I actually meant to write about in this article: Emilia has Tedaldo
belittle the specific friar who was the original cause of his troubles,
who must without a doubt have been some soup-guzzling pie-muncher…
This was so delightful that I had to write a whole blog post just to
show it to you. I look forward to calling other people soup-guzzling
pie-munchers in the coming months.
A manicator is a gobbler; it's akin to “munch”, “manger”, and
“mandible”, and related to modern Italian mangia and French manger.
A manicator di torte is literally a gobbler of pies.
Delightful! I love Boccaccio.
While I was researching this article I ran into some
other English translations of the phrase. The translation at Brown
University's Decameron Web is by J.M. Rigg:
some broth-guzzling, pastry-gorging knave without a doubt
which I award full marks. The translation of John Payne has
must for certain have been some broth-swilling, pastry-gorger
and two revised versions of Payne, by Singleton and Ó Cuilleanáin,
translate it similarly.
But the translation of Richard Aldington only says:
who must certainly have been some fat-witted glutton.
which I find disappointing.
I often wonder why translators opt to water down their translations
like this. Why discard the vivid and specific soup and pie in favor
of the abstract "fat-witted glutton"? What could possibly be the
justification?
Translators have a tough job. A mediocre translator will capture only
the surface meaning and miss the subtle allusions, the wordplay, the
connotations. But here, Aldington hasn't even captured the surface
meaning! How hard is it to see torte and include pie in your
translation somewhere? I can't believe that his omitting it was pure
carelessness, only that Aldington thought that he was somehow
improving on the original. But how, I can't imagine.
Well, I can imagine a little. Translations can also be too
literal. Let's consider the offensive Spanish epithet pendejo.
Literally, this is a pubic hair. But to translate it in English as
"pubic hair" would be a mistake, since English doesn't use that term
in the same way. A better English translation is "asshole". This is
anatomically illogical, but linguistically correct, because the
metaphor in both languages has worn thin. When an anglophone hears
someone called an “asshole” they don't normally imagine a literal
anus, and I think similarly Spanishspeakers don't picture a literal
pubic hair for pendejo. Brodaiuolo could be similar. Would a
14th-century Florentine, hearing brodaiuolo, picture a generic
glutton, or would they imagine someone literally holding a soup bowl
up to their face? We probably don't know. But I'm inclined to think
that “soup-guzzler” is not too rich, because by this point in Emilia's
rant we can almost see the little flecks of spittle flying out of her
mouth.
I'm offended by Aldington's omission of pie-munching.
The UK holds elections on 6 May. From the gov.uk site:
Register by 11:59pm on 19 April to vote in the following elections on 6 May:
local government elections and referendums in England
Police and Crime Commissioner elections in England and Wales
Scottish Parliament elections
Senedd (Welsh Parliament) elections
Mayor of London and London Assembly elections
Register Online. It usually takes about 5 minutes. Start Now.
Registration is easy: a rare example of a well-designed web site.
You can also support your party with a donation. Mine is the Edinburgh Green Party; feel free to add yours via the comments. Current polling shows Green on track to win 10 seats in the regional lists, and Alba on track to get no seats.
In the last blog post we covered
the general structure and algorithms of Giml's type inference engine.
In this blog post we'll take a closer look at extensible records in Giml and how to infer their types.
Records
A record is a collection of values, each associated with a name (also called a label).
For example, { name = "Giml", age = 0 }.
The type of a record is very similar to its structure. For example, the type of the record above is:
{ age : Int, name : String }.
Records are used to build compound data, aggregating multiple values into one value that
contains all of the information. The different values (also called fields) inside the record
can be accessed (or selected) by their labels, like this: <record>.<label>.
Records can also be extended with additional fields:
let
  lang = { name = "Giml", age = 0 }
in
  { website = "https://gimllang.org" | lang }
The result of the expression is a new record that has all of the fields lang has and
has one extra field website as well.
Note that in Giml, we can't have multiple values with the same label. When we add a new
field with an existing label, that previous one is replaced with the new value.
There are other reasonable behaviours here, such as disallowing such an operation or
having a scope of labels, but this is the behaviour I chose for Giml :)
Records are first class and can be used anywhere a value such as an Int would be expected.
They can also be pattern matched:
case { name = "Giml" } of
  { name = "Giml" } ->
    "Yeah I heard about it, funny name."
  { name = "Haskell" } ->
    "Haskell is pretty cool!"
  other ->
    concat (concat "I'm sure " other.name) " is pretty good too!"
end
So far we've seen record literals and three operations on records: field selection, extension and pattern matching.
When we approach type inference for a feature, we need to take a few things into consideration:
What the types of the feature look like
What to do for each operation (elaboration and constraint generation)
How to unify the types (constraint solving)
How to instantiate
How to substitute
We are going to first try a naive approach: just represent the type of a record as TypeRec [(Label, Type)],
and see how it's just not enough to make this feature useful without type annotations.
Let's see how we elaborate each operation: record literals, field selection, record extension, and pattern matching:
Record literals: simple, we have an expression that looks like this: { label1 = expr1, label2 = expr2, ... }, so we elaborate each expression and annotate the node with the type TypeRec [(label1, t1), ...].
Record selection: Also simple, we have something that looks like expr.label, so we elaborate the type of the expression and constrain it to be equal to TypeRec [(label, t)].
But hold up, if we try to work with this scheme we'll see that it is not good enough.
Trying to typecheck this expression { a = 1, b = 1 }.a will create this constraint:
Equality { a : Int, b : Int } { a : t }, and while they both have a field a,
one of them is missing a field! So they are definitely not equal.
So we need to go at this differently, and there are several approaches:
one is to create a new kind of constraint that describes a "subtyping" relationship,
another is to describe the types differently.
For Giml I chose the second approach: we add extra information to the types to encode that
"this record type may have more fields". This approach is called row polymorphism.
Row Polymorphism
Row polymorphism provides us with the ability to represent extra information in an
additional type variable (called a row type variable) in the type.
So in addition to the variant we created, TypeRec [(Label, Type)], we add another one:
TypeRecExt [(Label, Type)] TypeVar. Syntactically the type will look like this: { label1 : type1, ... | ext }.
What this extra type variable does is represent additional fields that may be part of the type
that we don't know about yet.
Let's use this for record field selection:
instead of saying "this type is equal to a record with one field of type t",
we can now say "this type is equal to a record that has this field with this type, and other fields represented by this type variable".
With this scheme we can move forward. Let's try again:
Elaboration and constraint generation
Record literals
same as before
Record selection
We have something that looks like expr.label, so we:
Generate a type variable representing the return type (which is the field type), let's call this t
Elaborate the type of the expression
Generate a type variable representing the rest of the fields, let's call this ext
Constrain the type of the expression to be equal to { label : t | ext }.
Now when we try to typecheck the expression { a = 1, b = 1 }.a, which generates this constraint:
Equality { a : Int, b : Int } { a : t | ext }, we can match the fields and types that we know
(a : Int with a : t), and also match the fields that we don't ({ b : Int } with ext) -- more on that later.
And when we find out what the real type of ext is, we'll substitute it (we'll talk about that later as well).
Record extension
We have something that looks like { <field1> = <expr1>, ... | <expression> }, so we:
Elaborate the type of each field
Elaborate the type of the expression
Generate a type variable for the expression which we will call ext
Constrain the type of the expression to be equal to { | ext }
Annotate the node with the type { <field1> : <type1>, ... | ext }
Pattern matching
Pattern matching is similar to the record literals case, but instead of matching the expression in the case
with a rigid TypeRec, we generate a type variable and match with TypeRecExt.
That way a pattern can select fewer fields than might be available in the expression's type.
Constraint solving
We need to handle 3 new cases  equality between:
Two TypeRec
A TypeRec and a TypeRecExt
Two TypeRecExt
Two TypeRec
The two records should have the same labels and for each label we generate a new constraint of Equality between the types for the label on each side.
So for example, for { a : Int, b : String } and { a : t1, b : t2 }, we generate the two constraints:
Equality Int t1 and Equality String t2.
If one TypeRec has a field that the other does not have, we throw an error about a missing field.
A TypeRec and a TypeRecExt
Each field in the TypeRecExt should match with the corresponding field in the TypeRec.
If there's a field in the TypeRecExt that does not exist in the TypeRec, we throw an error.
The other side is different: all the fields missing from the TypeRecExt that exist in the TypeRec will
be matched with the row type variable of the TypeRecExt. Remember, we said that with row polymorphism
we use a type variable as a representative of fields we don't know of yet! So we treat the type variable
in the TypeRecExt as if it is a TypeVar that matches the fields that exist in the TypeRec but not in the TypeRecExt.
So for example in Equality { a : Int, b : String } { a : t1 | ext }
we generate 2 new constraints: Equality Int t1 and Equality { b : String } ext.
Two TypeRecExt
This scenario is slightly trickier: the specified fields on both sides should
be matched just like in the previous cases, but what do we do with the fields missing from each side?
Let's check a concrete example; we'll make it simpler by not including matching fields.
Equality { a : Int | t1 } { b : Int | t2 }
What this constraint means is that t2 is equal to { a : Int | t1 } without { b : Int }, and t1 is equal to { b : Int | t2 } without { a : Int }.
Since we don't have any notion of subtracting a type from a type (some other type systems do support this),
we can try to represent this differently: we could represent this subset as a new type variable,
and translate the Equality above to these two constraints:
Equality { a : Int | t' } t2
Equality { b : Int | t' } t1
Now t' represents t2 without { a : Int }, and t' also represents t1 without { b : Int }.
So more generally, when we try to unify two TypeRecExts, we match the shared fields, and also
create a new row type variable to represent all of the unspecified fields in both types, matching each row type variable
with the fields specified on the other side.
Let's describe this one more time in pseudo code, adding new constraints:
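A hedged sketch of that rule (in Python, with a representation I made up for illustration: an open record is a (fields, row_var) pair and a constraint is a tuple):

```python
# Unify { fs1 | r1 } with { fs2 | r2 }: shared fields are paired up for
# pointwise unification, and a fresh row variable stands for the fields
# missing from both sides.

def unify_rec_exts(fresh, rec1, rec2):
    fs1, r1 = rec1
    fs2, r2 = rec2
    shared = [(fs1[label], fs2[label]) for label in fs1 if label in fs2]
    only_right = {l: t for l, t in fs2.items() if l not in fs1}
    only_left = {l: t for l, t in fs1.items() if l not in fs2}
    return shared, [
        ("Equality", (only_right, fresh), r1),  # r1 covers the fields only on the right
        ("Equality", (only_left, fresh), r2),   # r2 covers the fields only on the left
    ]
```

For Equality { a : Int | t1 } { b : Int | t2 } this yields Equality { b : Int | t' } t1 and Equality { a : Int | t' } t2, matching the constraints above.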
Instantiation
Instantiation occurs as usual: the row type variable in TypeRecExt should be instantiated as well.
Substitution
What do we do if the row type variable in TypeRecExt appears in the substitution?
One, if the row type variable is mapped to a different type variable, we just replace it.
Two, if it's mapped to a TypeRec, we merge it with the existing fields and return a TypeRec with all fields,
but it's important to note that some fields in the TypeRec from the substitution may be the same as ones from our TypeRecExt.
There are a few approaches one could take here, one is to keep a scope of labels,
another is to try and unify the types of the field.
In Giml we take the type of the leftmost instance, discarding the right one.
So semantically this expression is legal: { a = "Hi!" | { a = 1 } } and its type is { a : String }.
Three, if the row type variable is mapped to a TypeRecExt, we do the same thing as in case two, but return a TypeRecExt instead of a TypeRec, with the row type variable from the mapped value as our new row type variable.
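These three cases can be sketched as follows (a Python illustration; the tuple-based representation is mine, not Giml's):

```python
# A row-extended record is ("rec_ext", fields, row_var); a closed record is
# ("rec", fields); a type variable is ("var", name).

def subst_row_var(sub, ty):
    if ty[0] != "rec_ext":
        return ty
    _, fields, row_var = ty
    mapped = sub.get(row_var)
    if mapped is None:
        return ty
    if mapped[0] == "var":                       # one: just rename the row variable
        return ("rec_ext", fields, mapped[1])
    if mapped[0] == "rec":                       # two: close the record
        return ("rec", {**mapped[1], **fields})  # the leftmost field wins on duplicates
    _, more, new_row = mapped                    # three: keep the record open
    return ("rec_ext", {**more, **fields}, new_row)
```

Note how the merge is left-biased, implementing the "leftmost instance" rule described above.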
This is basically what we need to do to infer the type of extensible records.
Example
Let's see how to infer the type of a simple record field access example.
This process is a bit much, but I've included it for those who want a
live demonstration of how this works. Feel free to skip it!
giml = { name = "Giml", age = 0 }
getName record = record.name
gimlName = getName giml
Elaboration
Through elaboration, we end up with the following AST and constraints:
Ast:
giml = { name = "Giml", age = 0 }   -- : { age : Int, name : String }
 Constraints:
[ Equality top0 {age : Int, name : String}
]
The first stage of inference is to group dependencies and order them topologically.
Since we don't have any mutual dependencies, each definition is standalone
and the order of dependencies is kept (the last definition uses the previous two).
Also note that the names of the type variables do not matter to the algorithm; they differ a bit in form so that I have an easier time knowing where each one was introduced.
Constraint solving
Now we go over group by group and solve the constraints:
First group
 Constraints: [Equality top0 {age : Int, name : String}]
 Substitution: []
1. Equality top0 {age : Int, name : String}
 => add `top0 := {age : Int, name : String}` to the substitution
 Constraints: []
 Substitution: [top0 := {age : Int, name : String}]
This looks correct! (and yay me, who did the constraint solving by hand)
Summary
I found implementing extensible records using row polymorphism to be relatively straightforward.
I hope this article makes it more approachable as well.
In the next blog post we'll discuss polymorphic variants and how to infer their types.
If the nim sum of a position is zero, then the position is lost: any move loses (and makes the nim sum nonzero). If the nim sum is nonzero, the position can be won: choose a move that reduces the nim sum to zero. (What is the computational complexity of finding such a move? Typically, how many winning moves are there? Below, we will give some positions with more than one winning move.)
Important details omitted: if the nim sum is nonzero, there always exists a move that makes it zero. If the nim sum is zero, there are no moves that keep it zero.
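Both facts are easy to check mechanically with the nim sum. A small Python sketch (function names are mine):

```python
from functools import reduce
from operator import xor

def nim_sum(heaps):
    """XOR of all heap sizes; zero exactly when the position is lost."""
    return reduce(xor, heaps, 0)

def winning_moves(heaps):
    """All moves (heap index, new size) that leave the opponent a zero nim sum.

    Reducing heap h to h ^ s zeroes the nim sum, and it is a legal move
    exactly when h ^ s < h.
    """
    s = nim_sum(heaps)
    return [(i, h ^ s) for i, h in enumerate(heaps) if h ^ s < h]
```

For example, winning_moves([1, 2, 3]) is empty (the position is lost), while [1, 3, 3] has three winning moves, as noted below.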
Call a nim position obviously lost if the following mirroring strategy is applicable: the heaps can be organized into pairs of equal-sized heaps. Whatever the first player does to one heap of a pair, the second player can do to the other heap of that pair, and thereby eventually win.
All 2-heap nim positions that are lost are obviously lost. In other words, the lost 2-heap nim positions are exactly those with 2 equal-sized heaps.
Call a position non-obviously lost if it is lost but not obviously lost.
The simplest non-obviously lost position is [1,2,3].
The positions [1,3,3], [2,2,3], and [2,3,3] are the three simplest winning positions that have more than one winning move. All can be reduced to [1,2,3] or to [a,a].
Below are the non-obviously lost 3-heap positions whose smallest heap has size 1. All entries are [1, 2n, 2n+1].
For casual play, let the initial position be a non-obviously lost position. No move is objectively better than another from a lost position, so this will tend to produce varied games from as early as the first move. (The first player explores different ways to try to swindle a win.) [3,4,7], mentioned above, is a nice small 3-heap initial position for casual play. Some nice 4-heap (lost) initial positions that are easy to remember are [2,3,4,5] (14 coins total), [4,5,6,7] (22 coins), and [6,7,8,9] (30 coins).
We propose non-obviously lost initial positions with more heaps that satisfy the following aesthetic constraints: distinct heap sizes, minimize the total number of coins, minimize the largest heap, maximize the smallest heap. We avoid heaps of equal size in the initial position to make the mirroring strategy somewhat less likely to be usable toward the beginning of the game. However, equal-sized heaps can easily appear after the first move.
The unique 3-heap satisfying these constraints is [1,2,3] (6 coins).
4-heap: [2,3,4,5] (14 coins).
5-heap: [3,4,6,8,9] (30 coins).
6-heap: [1,2,4,6,8,9] (30 coins).
The perfect triangle 7-heap [1,2,3,4,5,6,7] satisfies the constraints. It has size 28, smaller than the 5- and 6-heap positions above.
8-heap: [2,3,4,5,6,7,8,9] (44 coins).
9-heap: [2,3,4,5,7,8,9,10,12] or [2,3,4,5,6,8,9,11,12] (60 coins).
10-heap: [1,2,3,4,5,6,8,9,10,12] (60 coins).
The best 11-heap is another perfect triangle: [1,2,3,4,5,6,7,8,9,10,11] (66 coins).
13-heap: [1,2,3,4,5,6,7,9,10,12,14,16,17], [1,2,3,4,5,6,7,8,11,12,14,16,17], [1,2,3,4,5,6,7,8,10,13,14,16,17], or [1,2,3,4,5,6,7,8,10,12,15,16,17] (106 coins)
The values of n for which a perfect triangle [1,2,3,...,n] is a lost position: 0 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91 95 99 ... The values are 4k1.
There are no nonobviously lost positions with all heaps having the same number of coins. An even number of identical heaps is obviously lost. An odd number of identical heaps is won: just take an entire heap to leave your opponent with an even number of identical heaps, a position which is lost.
The following trapezoids (heap sizes of consecutive integers, at least 2 and at most 20 coins in a heap) are nonobviously lost. It appears the number of heaps must be a multiple of 4, and the smallest heap must have an even number of coins.
Future work: organize positions by their distance from being obviously lost. (Previously, we analyzed Chomp in this way.) What other classes of positions can be quickly recognized as lost? (We noted [1, 2n, 2n+1] above.)
Future work: repeat this analysis for the "subtraction game", in which the number of coins that can be removed in one move is limited (at most k). Surprisingly, optimal play can still be calculated quickly: reduce each heap size mod (k+1), then compute the nim sum of the results.
This post is essentially a longer version of this Reddit post by Ashley Yakeley, with more explanation and historical context.
Sometimes in Haskell you need to write a function that “dispatches” only on a type and not on a value of that type. Using the example from the above post, we might want to write a function that, given an input type, prints the name of that type.
Approach 1: undefined
One naive approach would be to do this:
class TypeName a where
  typeName :: a -> String

instance TypeName Bool where
  typeName _ = "Bool"

instance TypeName Int where
  typeName _ = "Int"
However, this approach does not work well because we must provide the typeName method with a concrete value of type a. Not only is this superfluous (we don’t care which value we supply) but in some cases we might not even be able to supply such a value.
For example, consider Void (from Data.Void), a type with no values. There is a perfectly valid name associated with this type, but we cannot retrieve the name without cheating because we cannot produce a (total) value of type Void. Instead, we have to use something like undefined:
>>> typeName (undefined :: Void)
"Void"
The base package uses this undefined-based approach. For example, Foreign.Storable provides a sizeOf function that works just like this:
class Storable a where
  -- | Computes the storage requirements (in bytes) of the argument.
  -- The value of the argument is not used.
  sizeOf :: a -> Int
  …
… and to this day the idiomatic way to use sizeOf is to provide a fake value using undefined:
>>> sizeOf (undefined :: Bool)
4
This works because sizeOf never evaluates its argument. It's technically safe, albeit not very appealing, to depend on undefined.
Approach 2A: Proxy
The next evolution of this approach was to use the Proxy type (now part of base in the Data.Proxy module). As the documentation notes:
Historically, Proxy :: Proxy a is a safer alternative to the undefined :: a idiom.
I’m not exactly sure what the name Proxy was originally meant to convey, but I believe the intention was that a term (the Proxy constructor) stands in as a “proxy” for a type (specifically, the type argument to the Proxy type constructor).
We can amend our original example to use the Proxy type like this:
import Data.Proxy (Proxy (..))
import Data.Void (Void)

class TypeName a where
  typeName :: Proxy a -> String

instance TypeName Bool where
  typeName _ = "Bool"

instance TypeName Int where
  typeName _ = "Int"

instance TypeName Void where
  typeName _ = "Void"
… and now we can safely get the name of a type without providing a specific value of that type. Instead we always provide a Proxy constructor and give it a type annotation which “stores” the type we wish to use:
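For example, a minimal self-contained version (restating the class and two instances from above):

```haskell
import Data.Proxy (Proxy (..))

-- Dispatch purely on the type: the Proxy value carries no data,
-- only the type annotation matters.
class TypeName a where
  typeName :: Proxy a -> String

instance TypeName Bool where
  typeName _ = "Bool"

instance TypeName Int where
  typeName _ = "Int"
```

Now `typeName (Proxy :: Proxy Bool)` evaluates to `"Bool"` without our ever producing a Bool value.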
A minor variation on the previous approach is to use proxy (with a lowercase “p”) in the typeclass definition:
import Data.Void (Void)

class TypeName a where
  typeName :: proxy a -> String
  --          ↑

instance TypeName Bool where
  typeName _ = "Bool"

instance TypeName Int where
  typeName _ = "Int"

instance TypeName Void where
  typeName _ = "Void"
Everything else works the same, but now neither the author nor the consumer of the typeclass needs to depend on the Data.Proxy module specifically. For example, the consumer could use any other type constructor just fine:
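For example, a self-contained sketch: with `proxy` left as a free type variable in the signature, any unary type constructor applied to the type of interest will do.

```haskell
-- With a lowercase `proxy` in the signature, the caller can pass any
-- unary type constructor applied to the type of interest.
class TypeName a where
  typeName :: proxy a -> String

instance TypeName Int where
  typeName _ = "Int"
```

Here both `typeName ([] :: [Int])` and `typeName (Nothing :: Maybe Int)` evaluate to `"Int"`, and no Proxy import is needed at all.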
… or (more likely) the consumer could define their own Proxy type to use instead of the one from Data.Proxy, which would also work fine:
>>> data Proxy a = Proxy
>>> typeName (Proxy :: Proxy Int)
"Int"
This trick helped back when Proxy was not a part of the base package. Even now that Proxy is in base you still see this trick when people author typeclass instances because it’s easier and there’s no downside.
Both of these Proxy-based solutions are definitely better than using undefined, but they are both still a bit unsatisfying because we have to supply a Proxy argument to specify the desired type. The ideal user experience would require the type and the type alone as input to our function.
We previously noted that we could shorten the Proxy-based solution by using TypeApplications:
>>> typeName @Bool Proxy
"Bool"
Well, what if we could shorten things even further and just drop the Proxy, like this:
>>> typeName @Bool
Actually, we can! This brings us to a more recent approach (the one summarized in the linked Reddit post), which is to use AllowAmbiguousTypes + TypeApplications, like this:
{-# LANGUAGE AllowAmbiguousTypes #-}

import Data.Void (Void)

class TypeName a where
  typeName :: String

instance TypeName Bool where
  typeName = "Bool"

instance TypeName Int where
  typeName = "Int"

instance TypeName Void where
  typeName = "Void"
The use of TypeApplications is essential, since otherwise GHC would have no way to infer which typeclass instance we meant. Even a type annotation would not work:
>>> typeName :: String  -- Clearly, this type annotation is not very helpful

<interactive>:1:1: error:
    • Ambiguous type variable ‘a0’ arising from a use of ‘typeName’
      prevents the constraint ‘(TypeName a0)’ from being solved.
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      These potential instances exist:
        instance [safe] TypeName Void -- Defined at Example.hs:14:10
        instance [safe] TypeName Bool -- Defined at Example.hs:8:10
        instance [safe] TypeName Int -- Defined at Example.hs:11:10
    • In the expression: typeName :: String
      In an equation for ‘it’: it = typeName :: String
Type applications work here because you can think of a polymorphic function as really having one extra function argument: the polymorphic type. I elaborate on this a bit in my post on Polymorphism for dummies, but the basic idea is that TypeApplications makes this extra function argument for the type explicit. This means that you can directly tell the compiler which type to use by “applying” the function to the right type instead of trying to indirectly persuade the compiler into using the right type with a type annotation.
Conclusion
My personal preference is to use the last approach with AllowAmbiguousTypes and TypeApplications. Not only is it more concise, but it also appeals to my own coding aesthetic. Specifically, guiding compiler behavior using type annotations feels more like logic programming to me, and using explicit type abstractions and TypeApplications feels more like functional programming to me (and I tend to prefer functional programming over logic programming).
However, the Proxy-based approach requires no language extensions, so that approach might appeal to you if you prefer to use the simplest subset of the language possible.
The capability library is an alternative to the venerable mtl
(see our earlier blog posts on the subject).
It features a set of
“mtl-style” type classes, representing effects, along with deriving combinators
to define interpreters as type class instances. It relies on the -XDerivingVia
extension to discharge effects declaratively, close to the definition of the
application’s monad. Business logic can be written in a familiar, idiomatic way.
As an example, consider the following computation:
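The computation itself is elided above; assuming capability's tagged HasReader/HasState classes and their ask/put operations, it might look like this sketch:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}

import Capability.Reader (HasReader, ask)
import Capability.State (HasState, put)

-- Read the "foo" Int, check its parity, and store the result in "bar".
testParity :: (HasReader "foo" Int m, HasState "bar" Bool m) => m ()
testParity = do
  num <- ask @"foo"
  put @"bar" (even num)
```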
This function assumes a Reader effect "foo" of type Int, and a State effect
"bar" of type Bool. It computes whether or not "foo" is an even number and
stores the result in "bar".
Save for the tags "foo" and "bar", used to enable multiple Reader or State
effects within the same monad (an impossible thing with mtl), this is fairly
standard Haskell: type classes are used to constrain what kind of effects the
function can perform, while decoupling the computation from any concrete
implementation. At use site, it relies on GHC’s built-in resolution mechanism to
“inject” required dependencies. Any seasoned Haskeller should feel right at
home!
Providing instances
To actually call this admittedly silly function, we need to provide interpreters
for the "foo" and "bar" effects. Following the ReaderT
design pattern, we’ll pack everything we need into a single context record, then
interpret our effects over this context in the IO monad, using the deriving
combinators provided by the library:
data Ctx = Ctx { foo :: Int, bar :: IORef Bool }
  deriving Generic

newtype M a = M { runM :: Ctx -> IO a }
  deriving (Functor, Applicative, Monad) via ReaderT Ctx IO
  -- Use DerivingVia to derive a HasReader instance.
  deriving (HasReader "foo" Int, HasSource "foo" Int) via
    -- Pick the field foo from the Ctx record in the ReaderT environment.
    Field "foo" "ctx" (MonadReader (ReaderT Ctx IO))
  -- Use DerivingVia to derive a HasState instance.
  deriving (HasState "bar" Bool, HasSource "bar" Bool, HasSink "bar" Bool) via
    -- Convert a reader of IORef to a state capability.
    ReaderIORef (Field "bar" "ctx" (MonadReader (ReaderT Ctx IO)))
Thus equipped, we can now make use of our testParity function in an actual
program:
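The program itself is elided above; a hypothetical reconstruction, assuming the Ctx and M definitions above, might look like:

```haskell
example :: IO ()
example = do
  rEven <- newIORef False
  -- Run testParity in M over a context with foo = 2.
  runM testParity Ctx { foo = 2, bar = rEven }
  readIORef rEven >>= print  -- prints True, since 2 is even
```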
How do we test a function such as testParity in isolation? In our contrived
example, this is quite easy: the example function could be easily converted
into a test case. In the Real World™, though, our context Ctx could be much
bigger, providing a pool of database connections, logging handles, etc. Surely,
we don’t want to spawn a database instance to test such a simple
function!
Ad-hoc interpreters
In previous iterations of capability, the solution to this problem would have
been to create a new monad for testing purposes, leaving out the capabilities
we don’t want. While it works, it is not always the best tool for the job:
You need to define a new monad for each combination of effects you want to
test.
Test cases are no longer self-contained; their implementation is spread
across multiple places. It makes things less readable and harder to maintain.
A solution, supported by fancier effect system libraries such as
polysemy or fused-effects, is to define ad-hoc
interpreters in the executable code itself. At first glance, it might seem like
this is not possible in capability. Indeed, since interpreters are provided as
type class instances, and type classes are an inherently static mechanism, surely
there is no way of specifying those dynamically. Or is there?
As of version 0.4.0.0, the capability library features an experimental
Capability.Reflection module, addressing this very
limitation. It is inspired by, and uses, Edward Kmett’s reflection
library, and uses similar type class wrangling magic to let you define
interpreters as explicit dictionaries.
Interpreters as reified dictionaries
Making use of those new features, the example function can be rewritten as:
import qualified Control.Monad.Reader as MTLReader

example :: IO ()
example = do
  let runTestParity :: (Int, IORef Bool) -> IO ()
      runTestParity (foo, bar) =
        -- Interpret the effects into 'ReaderT Int IO'.
        flip MTLReader.runReaderT foo
          -- Write the 'HasReader "foo" Int' dictionary in terms of mtl
          -- functions. Forward the 'MonadIO' capability.
          $ interpret @"foo" @'[MonadIO]
              ReifiedReader
                { _reader = MTLReader.reader
                , _local = MTLReader.local
                , _readerSource = ReifiedSource { _await = MTLReader.ask }
                }
          -- Use 'MonadIO' to write the 'HasState "bar" Bool' dictionary.
          -- Forward the 'HasReader "foo" Int' capability.
          -- The 'MonadIO' capability is not forwarded, and hence forgotten.
          $ interpret @"bar" @'[HasReader "foo" Int]
              ReifiedState
                { _state = \f -> do
                    b <- liftIO $ readIORef bar
                    let (a, b') = f b
                    liftIO $ writeIORef bar b'
                    pure a
                , _stateSource = ReifiedSource { _await = liftIO $ readIORef bar }
                , _stateSink = ReifiedSink { _yield = liftIO . writeIORef bar }
                }
          $ testParity
  rEven <- newIORef False
  runTestParity (2, rEven)
  readIORef rEven >>= print
Defining a test monad is no longer required: the effects are interpreted
directly in terms of the underlying ReaderT Int IO monad. Typeclass
dictionaries are passed to the interpret function as mere records of functions
and superclass dictionaries — just like GHC does under the hood as hidden
parameters when we use statically defined instances.
Omitting the extra Proxy# arguments, which are here for technical reasons, the
first two attributes, _reader and _local, correspond directly to the methods
of the HasReader t type class:
This is quite boilerplatey, though. If we’re writing a lot of test cases, we are
bound to redefine those interpreters several times. This is tedious,
error-prone, and clutters our beautiful test logic. Maybe this could all be
factored out? Sure thing!
The (type-level) cs :: [(* -> *) -> Constraint] argument is a list of
capabilities that we wish to retain in the underlying action.
Since we interpret the State effect with a mutable IORef reference, we
require that the underlying monad be an instance of MonadIO. Moreover, we
ask that our target monad also implement all the required capabilities by
adding the All cs m constraint to the context (All is a type family that
applies a list of capabilities to a monad to generate a single constraint;
for example, All '[MonadIO, HasSource "baz" Baz] m is equivalent to
(MonadIO m, HasSource "baz" Baz m)).
The IORef used to store our state is passed as a standard function
argument. This was not possible without ad-hoc interpreters: we
needed to add the IORef to the Ctx type. With ad-hoc
interpreters, on the other hand, we can write instances which
capture references in their closures.
The last argument is a monadic action that makes use of HasState "bar" Bool
along with the forwarded cs capabilities. It is required to be polymorphic
in the monad type, which guarantees that the action cannot use other effects.
Now that we have factored out the interpretation of the "foo" and "bar"
effects into dedicated functions, they can be neatly composed to provide just
the effects we need to run testParity:
Truth be told, in this example, the dictionaries we’ve been writing aren’t so different
from a custom type class with capabilities provided by deriving-via. While the extra power that comes with
dynamic dictionaries can be very useful, it isn’t always warranted.
There is a middle ground, however: we can provide capabilities
locally, but with deriving-via combinators, using a function that we call
derive. You would typically use derive to derive high-level
capabilities from lower-level capabilities. In our case, we can replace:
runTestParity :: (Int, IORef Bool) -> IO ()
runTestParity ctx =
  flip MTLReader.runReaderT ctx
    $ derive
        -- Strategy
        @(ReaderIORef :.: Rename 2 :.: Pos 2 _ :.: MonadReader)
        -- New capability
        @'[HasState "bar" Bool]
        -- Forwarded capability
        @'[MTLReader.MonadReader (Int, IORef Bool)]
    $ derive
        @(Rename 1 :.: Pos 1 _ :.: MonadReader)
        @'[HasReader "foo" Int]
        @'[HasState "bar" Bool]
    $ testParity
thus getting rid of the interpret{Foo,Bar} helpers entirely. For instance, the
HasState "bar" Bool capability is derived from the IORef Bool in the second
position of the tuple provided by the ambient MonadReader (Int, IORef Bool)
instance. Think DerivingVia, but dynamically!
Conclusion
Wrapping things up:
At its core, the capability library is just mtl on steroids, modeling
effects with type classes.
The standard way of using capability is to define interpreters
declaratively, using the provided combinators; this programming style does
not allow defining ad-hoc interpreters at runtime.
The new version of capability provides a way of overcoming this limitation
with reified dictionaries.
Standard deriving strategies can be used to provide dynamic instances with
less boilerplate, using the underlying deriving mechanism.
Writing tests is just one example. Another application might be to dynamically
select the interpretation of an effect based on a configuration parameter. All
this is still experimental: the API and ergonomics are likely to change a bit
over the next few releases, but we hope this post motivates you to give it a
try.
Giml's type inference engine uses unification-based constraint solving.
It has 5 important parts:
Topologically order definitions and group those that depend on one another
Elaboration and constraint generation
Constraint solving
Instantiation
Substitution
In summary: we sort and group definitions by their dependencies; we elaborate the program with
type variables and the types we already know, collecting constraints on those variables;
we then solve the constraints for each group using unification, creating substitutions, which
are mappings from type variables to types, and merging them together; and finally we substitute the
type variables we generated in the elaboration stage with their mapped types from the substitution
in the program.
Instantiation occurs during the elaboration stage and the constraint solving stage, when we want
to use a polymorphic type somewhere.
Group and order definitions
Before we start actually doing type inference, we figure out the dependencies between
the various definitions. This is because we want to figure out the type of a definition
before its use, and because we want definitions that depend on each other (mutual recursion)
to be solved together.
So for example if we have this file as input:
main = ffi("console.log", odd 17)
even x =
if int_equals x 0
then True
else if int_equals x 1
then False
else odd (sub x 1)
odd x =
if int_equals x 0
then False
else if int_equals x 1
then True
else even (sub x 1)
The output of the reordering and grouping algorithm will be:
[["even", "odd"], ["main"]]
This is because even and odd are mutually recursive, so we want to typecheck them together,
and because main depends on odd we want to typecheck it after we finished typechecking odd.
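Giml's actual implementation may differ, but conceptually this step computes the strongly-connected components of the dependency graph in reverse topological order, which Data.Graph provides directly (groupDefs is a name we made up):

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

-- Hypothetical helper: given each definition's name and the names it
-- references, return mutually recursive groups, dependencies first.
groupDefs :: [(String, [String])] -> [[String]]
groupDefs deps =
  [ case scc of
      AcyclicSCC v -> [v]     -- a definition that only depends on earlier groups
      CyclicSCC vs -> vs      -- a mutually recursive group
  | scc <- stronglyConnComp [ (name, name, uses) | (name, uses) <- deps ]
  ]
```

For the file above, groupDefs [("main", ["odd"]), ("even", ["odd"]), ("odd", ["even"])] returns [["even","odd"], ["main"]] (up to ordering within the cyclic group), since stronglyConnComp yields components in reverse topological order.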
Now the real work begins. We start by going over the program's AST and annotate it with the types we know.
When we don't know what the type of something is we "invent" a new fresh type variable as a placeholder
and mark that type with a constraint according to its usage.
Types
Language.Giml.Types.Types contains a datatype that can be used to represent Giml types. For example:
TypeCon "Int" represents the type Int
TypeVar "t1" represents a polymorphic type variable
TypeApp (TypeCon "Option") (TypeCon "Int") represents the type Option Int
TypeRec [("x", TypeCon "Int")] represents a record with the type { x : Int }
Constraints
A Constraint between two types describes their relationship.
The most common constraint is Equality, which means that the two types are
interchangeable and should unify to the same type. For example, from the constraint Equality (TypeVar t1) (TypeCon "Int") we learn that t1 is the same as Int.
The other, less common, constraint is InstanceOf, which means that the first type has
an Equality relationship with an instantiation of the second type.
An instantiation is a monomorphic instance of a polymorphic type.
For example, (a -> b) -> List a -> List b is polymorphic, and can be instantiated, among other examples, to (Int -> String) -> List Int -> List String by replacing the polymorphic type variables a and b with monomorphic types such as Int and String.
We will talk about solving these constraints a bit later.
Implementation
We also use State, Reader and Except in our algorithm:
A monotonically increasing integer for generating type variables
A list of constraints we generated through the elaboration phase, which we will return as output at the end
An environment of data definitions, some are builtin and others are user defined and added as part of the elaboration stage (maybe I should've put this in the Reader part...)
The currently scoped variables in the environment and their type (which might be polymorphic!)
The type of builtin functions and expressions (such as int_equals and pure, which might also be polymorphic)
The Except part can throw a few errors such as "unbound variable" error.
Most of the action happens in elaborateExpr. I'm not going to cover every use case, but let's look at a few examples:
Elaborating literals
Elaborating literals is simple, since we already know what their types are: if we see a string literal
we annotate the AST node with the type String, and if we see an integer literal we annotate it
with Int. We don't need to generate new constraints because we know exactly what the type is.
Elaborating lambda functions
Elaborating lambda expressions is interesting, because we don't know what the types of the arguments
are going to be: they might be inferred from usage, as in \n -> add 1 n, or not,
as in \x y -> x. What we do here is:
Generate a type variable for each function argument
Elaborate the body of the lambda with the extended environment containing the mapping between the function arguments and their newly generated types
Annotate the lambda AST node with the type of a function from the argument types to the type of the body, which we get from the elaborated node after (2)
And we will trust the constraint solving algorithm to unify all of the types.
Elaborating function application
To elaborate a function application we:
Elaborate the arguments
Elaborate the applied expression
Generate a fresh type variable to represent the return type
Constrain the type of the applied expression as a function from the arguments to the return type
Annotate the function application AST node with the return type (which we just generated)
Remember that the type of the applied expression may just be a generated type variable, so we
can't examine its type to see what the return type is. For example, if we had the expression
\f x -> f x, as we saw before, we generate type variables for f and x!
So we have to make sure the type inference engine knows what their type should be
(by generating a constraint) according to the usage and trust that it'll do its job.
Elaborating variables
For variables, we expect an earlier stage (such as elaborating lambdas, lets or definitions)
to have already elaborated their type and added them to the environment, so we look them up.
We first try in the scoped environment and if that fails we try the builtins environment.
If that fails as well, we throw an error that the variable is unbound.
If we do find a variable in the scoped environment, there are two scenarios we need to take care of:
The variable was bound in a place that expects it to be monomorphic, such as a lambda function argument
The variable was bound in a place that expects it to be polymorphic, such as a term definition
Monomorphic types can be used as is, but polymorphic types represent an instantiation of a type.
When we find that the variable we are elaborating is polymorphic, we will generate a new type
variable for it and constrain it as an InstanceOf the type we found in the environment.
Later, in the constraint solving stage, we will instantiate the polymorphic type we found in
the environment and say that the type variable we generated is equal to the instantiation.
We delay this because we want the polymorphic type to be solved first before we instantiate it.
If we do not find the variable in the scoped environment, we check the builtins environment.
If we find the type there, we instantiate it immediately and annotate the AST node with this type.
Here we instantiate the type immediately because it is already solved, so we don't need to wait.
Constraint solving
The heart of the type inference engine. Here we visit the constraints generated for
each group of definitions separately and generate a substitution
(a mapping from type variables to types) as output.
The solveConstraints function takes an accumulated substitution and a list of constraints as input
and tries to solve the first constraint on the list.
Solving a constraint might generate more constraints to solve, and may return a substitution as well.
We then apply this substitution (substitute all occurrences of the keys with the values) to the
rest of the constraints and the accumulated substitution itself, merge the new substitution
with the accumulated substitution, and keep going until there are no more constraints left to solve.
solveConstraint uses unification to solve a single constraint. Here are a few important cases that it handles:
Equality t1 t2 (where t1 == t2)
When the two types are the same (for example Int and Int, or t1 and t1),
the unification succeeds without extra work needed,
we don't need to create new constraints or create a substitution.
Equality (TypeVar tv) t2
When one of the types is a type variable, we substitute (replace) it with t2 in all the places we know of
(the accumulated substitution and the rest of the constraints). So the rest of the constraints that
contained it will use t2 instead, and anything the substitution previously mapped to it will now map to t2.
Equality (TypeApp f1 a1) (TypeApp f2 a2)
We generate two new constraints: Equality f1 f2 and Equality a1 a2, reducing a constraint
between complex types into something we already know how to handle: constraints between simpler types.
InstanceOf t1 t2
When we run into an InstanceOf constraint, we instantiate the second type t2
and produce an Equality constraint between t1 and the instantiation of t2.
Instantiating is basically "copying" the type signature, and producing the Equality
constraint with t1 makes the type inference engine find the right type for the instantiated
type variables.
We use InstanceOf instead of Equality when the type is polymorphic,
for example when we want to use id that we defined as id x = x.
If we were to use an Equality constraint instead, every use of id would constrain
it to have one particular type (so id 1 would fix the type of id to Int -> Int everywhere, instead of only at that use site).
There are a few more cases, but this is the gist of it.
When we fail to unify we throw an error of type mismatch.
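To make this concrete, here is a simplified, self-contained sketch of unification in this style; the real Giml code differs in details such as the monad it runs in, the extra Type cases, and the InstanceOf constraint, which we leave out here:

```haskell
import qualified Data.Map as Map

-- Simplified stand-ins for Giml's types; the real definitions have more cases.
data Type
  = TypeCon String
  | TypeVar String
  | TypeApp Type Type
  deriving (Show, Eq)

data Constraint = Equality Type Type
  deriving (Show, Eq)

type Substitution = Map.Map String Type

-- Solve constraints one at a time, threading the accumulated substitution.
solveConstraints :: Substitution -> [Constraint] -> Either String Substitution
solveConstraints sub [] = Right sub
solveConstraints sub (constraint : rest) = case constraint of
  -- Identical types: nothing to learn.
  Equality t1 t2 | t1 == t2 -> solveConstraints sub rest
  -- A type variable on either side: bind it (after an occurs check).
  Equality (TypeVar tv) t -> bind tv t sub rest
  Equality t (TypeVar tv) -> bind tv t sub rest
  -- Complex types: reduce to constraints on their components.
  Equality (TypeApp f1 a1) (TypeApp f2 a2) ->
    solveConstraints sub (Equality f1 f2 : Equality a1 a2 : rest)
  Equality t1 t2 ->
    Left ("Type mismatch: " <> show t1 <> " vs " <> show t2)

-- Add tv := t to the substitution and apply it everywhere we know of.
bind :: String -> Type -> Substitution -> [Constraint] -> Either String Substitution
bind tv t sub rest
  | occurs tv t = Left ("Infinite type: " <> tv <> " occurs in " <> show t)
  | otherwise =
      let replace = substitute tv t
          sub' = Map.insert tv t (Map.map replace sub)
          rest' = [Equality (replace a) (replace b) | Equality a b <- rest]
      in solveConstraints sub' rest'

substitute :: String -> Type -> Type -> Type
substitute tv t ty = case ty of
  TypeVar v | v == tv -> t
  TypeApp f a -> TypeApp (substitute tv t f) (substitute tv t a)
  _ -> ty

occurs :: String -> Type -> Bool
occurs tv ty = case ty of
  TypeVar v -> v == tv
  TypeApp f a -> occurs tv f || occurs tv a
  TypeCon _ -> False
```

Running this on the worked example below (Equality top0 t2 and Equality (t1 -> t1) (Int -> t2), with arrows encoded as nested TypeApps) produces [top0 := Int, t1 := Int, t2 := Int], while a constraint like Equality top0 (t1 -> top0) fails the occurs check.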
Example
Let's see an example for this (very) short program:
one = (\x -> x) 1
After the elaboration stage we generate the following AST and constraints:
The lambda expression case generated the type t1 for x, so its type is t1 -> t1
The literal case annotated 1 as Int
The function application generated the return type t2 and generated the constraint
Equality (t1 -> t1) (Int -> t2), which is "the type of the applied expression is equal to a function from the argument to the return type I just generated"
The definition of one generated the type variable top0 for it so it can be used in other definitions polymorphically, but within this definition it should be used monomorphically, so it's constrained to be equal to the return type, generating the constraint Equality top0 t2.
How do we unify them?
Let's go over them in order. We begin with an empty substitution:

   Constraints:  [Equality top0 t2, Equality (t1 -> t1) (Int -> t2)]
   Substitution: []

1. Equality top0 t2
   ==> The first type is a type variable, so add `top0 := t2` to the substitution
   Constraints:  [Equality (t1 -> t1) (Int -> t2)]
   Substitution: [top0 := t2]

2. Equality (t1 -> t1) (Int -> t2)
   ==> Replace with two new constraints:
       1. Equality ((->) t1) ((->) Int)
       2. Equality t1 t2
   Constraints:  [Equality ((->) t1) ((->) Int), Equality t1 t2]
   Substitution: [top0 := t2]

3. Equality ((->) t1) ((->) Int)
   ==> Replace again with two new constraints:
       1. Equality (->) (->)
       2. Equality t1 Int
   Constraints:  [Equality (->) (->), Equality t1 Int, Equality t1 t2]
   Substitution: [top0 := t2]

4. Equality (->) (->)
   ==> These two types are identical, so we can continue
   Constraints:  [Equality t1 Int, Equality t1 t2]
   Substitution: [top0 := t2]

5. Equality t1 Int
   ==> The first type is a type variable, so add `t1 := Int` to the substitution
   Constraints:  [Equality Int t2]
   Substitution: [top0 := t2, t1 := Int]

6. Equality Int t2
   ==> The second type is a type variable, so add `t2 := Int` to the substitution
   Constraints:  []
   Substitution: [top0 := Int, t1 := Int, t2 := Int]
And we're done! The output is the substitution [top0 := Int, t1 := Int, t2 := Int],
we can now apply it to our elaborated program and get the fully type checked program:
one = (\x -> x) 1
-- one          : Int
-- \x -> x      : Int -> Int
-- x            : Int
-- 1            : Int
-- (\x -> x) 1  : Int
Two more things we want to talk about before we wrap this up:
Instantiation
We instantiate a type by looking up all of the type variables in the type and generating
a fresh new type variable for each one
(though all occurrences of the same type variable should be mapped to the same new type variable).
So for example, in the type (a -> b) -> List a -> List b, if t1 and t2 are two
new type variables we haven't seen before, we can replace a with t1
and b with t2 everywhere in the type and get (t1 -> t2) -> List t1 -> List t2.
After that, we let the Equality constraints and the constraint solving algorithm
find the right monomorphic type for t1 and t2.
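A self-contained sketch of instantiation (with a simplified Type datatype; in Giml the fresh-variable counter would live in the inference monad's State):

```haskell
import qualified Data.Map as Map

data Type
  = TypeCon String
  | TypeVar String
  | TypeApp Type Type
  deriving (Show, Eq)

-- Hypothetical sketch: walk the type, handing out a fresh type variable
-- ("t<counter>") the first time each type variable is seen, and reusing
-- the same fresh variable for every later occurrence.
instantiate :: Int -> Type -> (Int, Type)
instantiate counter ty =
  let (counter', _, ty') = go counter Map.empty ty in (counter', ty')
  where
    go n seen t = case t of
      TypeVar v ->
        case Map.lookup v seen of
          Just fresh -> (n, seen, TypeVar fresh)
          Nothing ->
            let fresh = "t" <> show n
            in (n + 1, Map.insert v fresh seen, TypeVar fresh)
      TypeApp f a ->
        let (n', seen', f') = go n seen f
            (n'', seen'', a') = go n' seen' a
        in (n'', seen'', TypeApp f' a')
      TypeCon _ -> (n, seen, t)
```

For example, instantiating a -> a with the counter at 1 yields t1 -> t1 and leaves the counter at 2, since both occurrences of a map to the same fresh variable.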
Substitution
When we apply a substitution to a constraint or another substitution, we replace all occurrences
of a type variable tv with t if tv := t is in the substitution. But before adding tv := t to
the substitution, we also need to check that t does not contain tv inside of it (this is called an occurs check).
If t does contain tv, it means that we ran into an infinite type, and we don't support those here,
so we will throw an error.
A simple way to generate such a case is this:
x y = x
The constraint generated from this definition is Equality top0 (t1 -> top0), which will fail
the occurs check.
If you try to desugar this definition into a lambda expression, you'll quickly discover that
you've been had: any time you go to replace x with \y -> x you just add another level
of lambda.
Summary
This is the general overview of Giml's type inference engine.
If you are interested in learning more about it, check out the source code.
If this helped you, there's something you want me to clarify,
you think there's something I missed, there's something you want to know more about,
or there's something I got totally wrong,
do contact me on Twitter or via Email.
As we all know, blockchain is a disruptive technology that has helped us find new efficiencies in our lives. For example, prior to Bitcoin, creation of new currency required wasteful processes like digging gold out of the ground or printing paper currency. Bitcoin replaces that with cheap electricity. The world is better off for the reduced resource usage. Efficiency!
More industries are beginning to adopt various blockchain solutions to solve common problems. These are creating synergies in enterprises and forming new markets. But so far, blockchain has been used almost exclusively to solve industrial and financial systems issues. Today, I'm happy to introduce a new blockchain technology aimed at solving problems faced by everyday users.
The problem
It's a perennial issue, with a huge economic impact (detailed below). We estimate that in addition to the massive economic impact, the emotional and psychological damage may even be the larger factor.
I'm talking, of course, about mismatched and lost socks. While "the dryer ate my socks" is a common phrase, we have in fact identified multiple related but separable subproblems:
The standard missing sock
Similar, yet slightly different, versions of the same socks, leading to excess effort in identifying a match. This problem breaks down further into:
Socks which have been worn a different number of times and thereby stretched differently
Different manufacturing lines leading to micro-variations within construction
Despite all claims to the contrary, there are in fact left vs right variations of socks, either due to manufacture or usage. Identifying a correct pairing becomes an NP-complete problem
This is in fact just the tip of the iceberg. We fully believe there may be at least 13 further classifications of sock-related inefficiencies. Based just upon our current knowledge, we have calculated the following impacts:
7 billion sock-wearing individuals worldwide (some populations are, of course, part of the rabid anti-socker cabal).
Individuals tend to change their socks approximately every 1.2 days.
Individuals tend to launder socks approximately every 8 days.
On average, 3.4 individuals run laundry concurrently.
This leads to an average socks pair/laundry of ~23, or ~46 socks per laundry cycle.
Sock sorting is an NP-hard problem, with a single comparison taking on average 2.3 seconds.
This leads to approximately 75,000 minutes spent sorting socks per laundry cycle. While there may be some ad-hoc optimizations, we expect no greater than a 10x speedup from such optimizations, and believe each laundry cycle requires at least 7,500 minutes, just on sock sorting.
Combining this with our estimates for laundry cycles per individual, we estimate that the global economy receives an impact of 1.3 billion person years wasted per year just on sock sorting!
As you can see, sock sorting accounts for easily the greatest modern crisis known to humankind.
But further than this is the emotional and psychological harm caused by these situations. We estimate 1 in 4 divorces are caused by mismatched socks. 3 in 10 children will wear a mismatched pair of socks to school at least once per month, with a 27% chance of cyberbullying as a result.
Enter the sockchain
The sockchain is the natural, obvious, and modern solution to this problem. We leverage our novel Proof-of-Work/Proof-of-Stake hybrid technology to create a distributed application that uses Near Field Communication (NFC) to tag each pair of socks. Dryers throughout the world can easily and cheaply be outfitted with NFC detectors. Miners will be able to detect the presence of such socks in distributed drying systems, and provide a searchable, distributed ledger of sock-pair-locations (SPLs).
Our fee incentive structure will guarantee payment of at least 2 sock-pairs/block, leading to economies of scale. The fee structure will appropriately respond to the evolving and dynamic socketplace.
We envision further development as well. Already, some of our partners are embedding fungus-detection systems within our NFCs for athlete's foot elimination, addressing yet another worldwide pandemic.
Cost
It's right to question the costs of such endeavors. As mentioned above, the average labor cost per year currently is 1.3 billion person years. Even taking a modest estimate of $150/hour of labor, we end up with an average cost per year of nearly two quadrillion dollars, which is almost the market cap of Bitcoin itself.
We estimate 80 billion active pairs of socks worldwide. At an NFC cost of $20 per pair, we end up with a modest hardware cost of $1.6 trillion, an easily affordable expense given the current human impact. Outfitting the dryers will cost an additional $50 trillion. And we estimate that electricity costs for our novel Proof of Work system to be no greater than $100 trillion. Including training, marketing, and our very slim administrative expenses, the project will cost approximately $300 trillion. Compared to the current cost of $2 quadrillion, we're looking at an 85% cost savings.
It would be inhumane not to undertake such a project.
2021-03-25, Thursday: Today's #haskell problem has me FORTHing AT THE MOUTH! ... AGAIN! ... no, ... wait ...
2021-03-23, Tuesday: Let's go FORTH! and conquer! for today's #haskell problem. Today's #haskell solution gives us a little FORTHer to go FORTH and FORTH! ... and also demonstrates Gauss' sum formula ... IN FORTH! ... which is nice.
2021-03-19: pi. Not pie. Not tau. pi... you know, or: close enough, is served as today's #haskell problem (via @AnecdotesMaths). PI! for today's #haskell solution. And a pie recipe from Sumeria, if you're in a baking mood.
2021-03-04: Today's #haskell problem: you have a time-series of values, not necessarily gapless; get the latest value, then get 'yesterday's' value. Today's #haskell solution gives us some useful functions to navigate a time-series of values.
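As a sketch of the time-series puzzle in the last entry, assuming (as an illustration, not the author's actual solution) that the series is a Map keyed by day number and may have gaps:

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- A time-series keyed by day number; days may be missing (gaps).
type Series a = Map Int a

-- The latest (day, value) in the series.
latest :: Series a -> Maybe (Int, a)
latest = Map.lookupMax

-- "Yesterday's" value: the most recent entry strictly before the latest day.
yesterday :: Series a -> Maybe (Int, a)
yesterday s = do
  (d, _) <- Map.lookupMax s
  Map.lookupLT d s
```

For a series with entries on days 1, 3, and 7, `latest` returns the day-7 entry and `yesterday` returns the day-3 entry, correctly skipping over the gap.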
Two weeks back, I wrote a blog post with a proposal for unification of vector-like types in bytestring, text, and vector. This continued with a discussion on Discourse, and has been part of some brainstorming sessions at the Haskell Foundation tech track to lock down some of the concrete details. (Meeting minutes have been posted to Discourse regularly.) I've discussed with a few other interested parties, and invited feedback from people who have been working on related projects. (And I even received some such feedback!)
At this point, I wanted to summarize what's come up, and propose some concrete next steps. Also, since this is one of the first such proposals we're trying to get to happen through the Haskell Foundation, I'll take a step back with some meta comments about the process itself.
Refined goals
I threw quite a bit into the original proposal. The brainstorming since has helped make it clear what can be achieved, what will have major breaking impacts, and what will deliver the most value. Here are some outcomes:
As much as I still think a single, unified Vector type that does boxing or unboxing depending on the contained type would be a Very Good Thing, there are technical hurdles, and not everyone is bought into it. I doubt it's going to happen any time soon.
The temporary pinning of memory is likely not going to happen, since it would be too big a GC change. However, larger ByteArray#s (3kb IIRC) are already pinned, and using the copy-when-unpinned technique will not be too expensive for things under 3kb. So we have a path forward that requires no (or few) changes to GHC.
It seems like one of the biggest advantages we may get out of this proposal is to move ByteString to unpinned memory. This would be good, since it would reduce memory fragmentation. The downside is the need to copy smaller ByteStrings before performing FFI calls. But overall, as mentioned previously, we don't think that will have a major performance impact.
There are already efforts underway, and have been for a while, to rewrite text to use UTF-8. Combined with this proposal, we could be seeing some serious improvements to how textual data is consumed and worked with in Haskell, but that's a bit too far off right now.
Refined design
With that in place, a semi-coherent design is beginning to form around this proposal:
We're not ready to move types into base yet, but fortunately we already have another package that's a good candidate for shared vector types: primitive. It's already used by vector, and can be used by bytestring and text.
We have a PrimArray type present in primitive, but it doesn't support slicing. Let's add a PrimVector type with the minimal machinery necessary to get slicing in place.
The big change: let's rewrite ByteString (the strict variant) to be newtype ByteString = ByteString (PrimVector Word8).
We can recover backwards compatibility in most of the package, including in the .Unsafe module.
People directly using the .Internal module will likely be broken, though there may be some clever tricks to recover backwards compat there too.
We'll get the reduced-memory-fragmentation benefit immediately.
Not yet discussed, but putting down for future brainstorming: what should we do with ShortByteString? If we move it over to PrimVector, it will end up with a little more overhead for slicing. Do we leave it alone instead? Move it to PrimArray?
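To make the slicing idea concrete, here is a hypothetical sketch of what a PrimVector and the proposed ByteString newtype could look like. The names and the representation are assumptions for illustration only; a real PrimVector would wrap primitive's PrimArray rather than the list used as a stand-in here, and this is not the actual design under discussion:

```haskell
import Data.Word (Word8)

-- Hypothetical sketch: a slice is an underlying array plus an offset
-- and a length. The list stands in for primitive's PrimArray.
data PrimVector a = PrimVector
  { pvArray  :: [a]
  , pvOffset :: !Int
  , pvLength :: !Int
  }

-- O(1) slicing: adjust the offset and length without copying.
slice :: Int -> Int -> PrimVector a -> PrimVector a
slice off len (PrimVector arr o _) = PrimVector arr (o + off) len

-- Materialize a slice (for inspection only).
toList :: PrimVector a -> [a]
toList (PrimVector arr o l) = take l (drop o arr)

-- The proposed strict ByteString would then be a thin wrapper:
newtype ByteString = ByteString (PrimVector Word8)
```

The key property is that slicing only adjusts the offset and length fields, so substrings share the underlying storage, which is exactly what ShortByteString currently gives up.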
There are additional steps I could write around text and vector, but honestly: let's stop it right there. If we get a working bytestring package on top of primitive's PrimVector, I think that's a great time to take a break, analyze the performance impact, and see ecosystem impact, likely by compiling a Stackage snapshot with the tweaked bytestring and primitive packages.
Action items
Next steps: find out who's interested in doing the work and dive in! This is still proof of concept, so no real buyin is needed. We're exploring a possibility. There's a bunch of code that needs to be rewritten in bytestring to see if this is possible.
And while brainstorming in calls has been good, I don't think it's great. I'd like to move to discussions in a chat room to make it easier for others to engage. I'll comment a bit more on this below.
Other packages
I've reached out to some authors of other packages to get their input on this proposal. I've received some, and incorporated some of that here. For example, both Alexey Kuleshevich and Alexey Khudyakov proposed using primitive as the new home for the shared data types. Others have expressed that they'd rather first see a concrete proposal. We'll see if there is further collaboration possible in the future.
Process level comments
Let me summarize, at a high level, what the process was that was involved in this proposal:
Free-form discussions on the Haskell Foundation tech track in a live meeting. People discussing different ideas, different concerns, which ultimately triggered the idea of this proposal.
Blog post to the world and Discourse thread laying out some initial ideas and looking for feedback.
Public discussion.
A few private emails reaching out to specific interested parties to get their input.
In concert with (3) and (4): further tech track live meetings to discuss details further.
This second blog post (and associated Discourse thread) to update everyone and call to action.
Hopefully: some actual coding.
Overall, I think this is a good procedure. If I could make one change, it would be towards leveraging asynchronous communication more and live meetings less. I absolutely see huge value in live meetings of people to brainstorm, as happened in (1). But I think one of the best things we can do as the Haskell Foundation is encourage more people to easily get involved in specific topics they care about.
On the bright side, the significant usage of Discourse for collaboration and reporting on meeting minutes has been a Good Thing. I think blog posts like this one and my previous one are a Good Thing for collecting thoughts coherently.
That said, I realize I'm in the driver's seat on this proposal, and have a skewed view of how the outside world sees things. If people have concerns with how this was handled, or ideas for improvement, bring them up. I think figuring out how to foster serious discussion of complex technical issues in the Haskell ecosystem is vital to its continued success.
One silver lining to the cloud of COVID has been the development of virtual forms of participation. A SIGPLAN blog post by five early-career researchers offers their perspective on what we should do next.
We propose that SIGPLAN form a Committee on Conference Data. The committee would be made up of: one organizing-committee representative from each of the flagship SIGPLAN conferences, one early career representative, and, crucially, a professional data collection specialist hired by SIGPLAN. The group would identify and collect key data that is pertinent to conference organization, especially with respect to physical versus virtual conference formats. The committee would make data-driven recommendations to SIGPLAN organizers based on the collected data and guided by core tenets such as community building, inclusivity, research dissemination, and climate responsibility. We realize that this is not a small request, but we are confident that it is both necessary and achievable. If the committee were to form by May 1, 2021, it would be able to start collecting data at PLDI 2021 and continue through the next two years, providing enormous clarity for SIGPLAN organizers at a time when so much is unclear.
And though after my skin worms destroy this body, yet in my flesh
shall I see God:
I find this mysterious for two reasons. First, I cannot understand
the grammar. How is this supposed to be parsed? I can't come up with
any plausible way to parse this so that it is grammatically correct.
Second, how did the worms get in there? No other English translation
mentions worms and they appear to be absent from
the original Hebrew. Did
the KJV writers mistranslate something? (Probably not, there is
nothing in the original to mistranslate.) Or is it just an
interpolation?
Pretty ballsy, to decide that God left something out the first time
around, but that you can correct His omission.
I've been playing with this idea off and on for a few years now, and decided to finally write a blog post to float it. I'm considering shutting down Haskellers and turning it into a redirect to some other community resource. The reason is pretty straightforward: I originally created the Haskellers site as a kind of community/discussion/professional connection hub. But essentially all of its functionality seems better served by other resources at this point, and I'd rather promote those other services.
The simplest idea I had was to change it to a redirect to the haskell.org Discourse, though perhaps hosting a single-page site with links to common Haskell resources, together with a very prominent "edit" button for additional contributions, would be worthwhile.
I'm curious if anyone has feedback, either for or against, or ideas on what can or should replace a haskellers.com page. If so, please click through to the GitHub issue below and at least leave a 👍 or 👎.
In GHC 9.2 I have made two improvements which should make the memory usage reported by long-running applications more closely line up with the amount of memory they need.
By default a lot of extra memory was retained (up to 4 times the amount of live data). Now idle collections will start to return some of that memory, at a rate configured by the -Fd option. (!5036)
Allocation of pinned objects was causing fragmentation of the nursery which could lead to the nursery retaining a lot of memory. (!5175)
In the rest of this post I will explain how to estimate the memory usage of your program based on its amount of live data and provide some more detail about the new -Fd option.
Inspecting memory usage
There are three metrics which are important when understanding memory usage at a high level. The GHC.Stats module provides means of inspecting these statistics from inside your program, and they are also included in the eventlog.
| Name | Description | GHC.Stats | Eventlog |
|------|-------------|-----------|----------|
| Live Bytes | The amount of live heap-allocated data in your program | gcdetails_live_bytes | HeapLive |
| Heap Size | The amount of memory the RTS is currently retaining for the Haskell heap | gcdetails_mem_in_use_bytes | HeapSize |
| OS VmRSS | The amount of memory the OS thinks your program is using | Sample from /proc/pid/status | – |
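The first two metrics can be read from inside a running program via GHC.Stats, provided the program is started with +RTS -T to enable statistics collection. A minimal sketch:

```haskell
import GHC.Stats (getRTSStats, getRTSStatsEnabled, RTSStats(..), GCDetails(..))
import Data.Word (Word64)

-- Returns (live bytes, heap size in bytes) as of the most recent GC,
-- or Nothing if the program was not started with +RTS -T.
liveAndHeapSize :: IO (Maybe (Word64, Word64))
liveAndHeapSize = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then pure Nothing
    else do
      s <- getRTSStats
      let d = gc s
      pure (Just (gcdetails_live_bytes d, gcdetails_mem_in_use_bytes d))

main :: IO ()
main = liveAndHeapSize >>= print
```

Note that the gcdetails fields are only updated when a garbage collection runs, so a program that has not yet performed a GC may report zeros.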
OS VmRSS and heap size should correspond closely to each other. If they do not then there are two likely reasons.
It could mean that off-heap allocations (such as calls to mallocBytes or allocations from C libraries) are contributing significantly to the resident memory of your program. You can analyse off-heap allocations using tools such as heapcheck.
By default memory is returned lazily to the OS: returned regions are marked using MADV_FREE, so the memory is free to be reclaimed, but the OS only takes it back when it needs more memory. The RTS flag --disable-delayed-os-memory-return means that we use MADV_DONTNEED instead, which promptly returns memory.
In the rest of this post I will assume that OS VmRSS and heap size are close to each other.
Understanding how Heap Size corresponds to Live Bytes
Heap size is the amount of memory that the RTS has currently allocated for the Haskell heap. There are three main factors which determine heap size.
Collection Strategy
Firstly, the garbage collection strategy used by the oldest generation requires some overhead. By default a copying strategy is used which requires at least 2 times the amount of currently live data in order to perform a major collection. For example, if your program’s live data is 1GB then the heap size needs to be at least 2GB as each major collection will copy all data into a different block.
If instead you are using the compacting (-c) or non-moving (-xn) strategies for the oldest generation then less overhead is required as the strategy immediately reuses already allocated memory by overwriting. For a program with 1GB of live bytes, you might expect the heap size to be at minimum a small percentage above 1GB.
Nursery Size
Secondly, a certain amount of memory is reserved for the nursery. The size of the nursery (per core) can be specified using the -A flag. By default each nursery is 4MB so if there are 8 cores then 8 * 4 = 32MB will be allocated and reserved for the nursery. As you increase the -A flag the baseline memory usage will correspondingly increase.
Memory Retention Behaviour
Thirdly, after doing some allocation GHC is quite reluctant to decrease its heap size and return the memory to the OS. This is because after performing a major collection the program might still be allocating a lot and it costs to have to request more memory. Therefore the RTS keeps an extra amount to reuse which depends on the -F ⟨factor⟩ option. By default the RTS will keep up to (2 + F) * live_bytes after performing a major collection due to exhausting the available heap. The default value is F = 2, so the heap size can be as high as 4 times the amount used by your program.
Without further intervention, once your program has topped out at this high threshold, no more memory would be returned to the OS so heap size would always remain at 4 times the live data. If you had a server with 1.5G live data, then if there was a memory spike up to 6G for a short period, then heap size would never dip below 6G. This is what happened before GHC 9.2. In GHC 9.2 memory is gradually returned to the OS so OS memory usage returns closer to the theoretical minimums.
New memory return strategy in GHC 9.2
The -Fd ⟨factor⟩ option controls the rate at which the heap size is decreased and hence memory returned to the OS. On consecutive major collections which are not triggered by heap overflows, a counter (t) is increased and the -F factor is inversely scaled according to the value of t and -Fd. The factor is scaled by the equation:

F' = F * 2 ^ (-t / Fd)
For example, idle collections happen by default after 0.3 seconds of inactivity. If you are running your application and have also set -Iw30, so that the minimum period between idle GCs is 30 seconds, then if you do a small amount of work every 5 seconds, there will be about 10 idle collections every 5 minutes. With the default -Fd4, this number of consecutive idle collections (t = 10) will scale the -F factor as follows:

F' = 2 * 2 ^ (-10 / 4) ≈ 0.35
and hence we will only retain (0.35 + 2) * live_bytes rather than the original 4 times. If you have less frequent idle collections (e.g. due to increasing -I or -Iw) then you should also decrease -Fd so that more memory is returned each time a collection takes place.
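The scaling arithmetic described above can be checked with a few lines of Haskell. This is a sketch of the calculation from this post, not code from the RTS itself:

```haskell
-- Sketch of the -Fd scaling arithmetic described above (not RTS code).
-- f  = the -F factor (default 2)
-- t  = number of consecutive non-heap-overflow major collections
-- fd = the -Fd factor (default 4)
scaledF :: Double -> Double -> Double -> Double
scaledF f t fd = f * 2 ** (negate t / fd)

-- Upper bound on memory retained after a major collection:
-- (2 + F') * live_bytes.
retained :: Double -> Double -> Double
retained f' liveBytes = (2 + f') * liveBytes

main :: IO ()
main = do
  let f' = scaledF 2 10 4   -- ten idle collections, defaults otherwise
  print f'                  -- roughly 0.35
  print (retained f' 1)     -- roughly 2.35x live bytes, down from 4x
```

With t = 0 (no consecutive idle collections yet) the factor is unchanged at 2, reproducing the pre-9.2 behaviour of retaining up to 4 times the live data.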
Enabling idle collections is important if you want your program to return memory to the operating system and promptly run finalisers. In the past it has sometimes been recommended that long-running applications disable idle collections in order to avoid unnecessary work, but now it is better to keep idle collections enabled and configure the -Iw option to avoid them happening too frequently.
If you set -Fd0 then GHC will not attempt to return memory, which corresponds with the behaviour of releases prior to 9.2. You probably don’t want to do this: unless your program has idle periods, the behaviour will be similar anyway.
Analysis and Further Tweaking
These two graphs show the difference between -Fd0 and -Fd4. With -Fd4, the memory usage returns to a baseline of around 4GB after spiking at 8GB; with -Fd0, the memory usage never retreats back below 7GB.
If you want to retain a specific amount of memory then it’s better to set -H1G in order to communicate that you are happy with a heap size of 1G. If you do this then heap size will never decrease below this amount if it ever reaches this threshold.
The collecting strategy also affects the fragmentation of the heap and hence how easy it is to return memory to a theoretical baseline. David in a previous post gave a much more in-depth explanation of fragmentation.
In theory the compacting strategy has a lower memory baseline, but practically it can be hard to reach that baseline because compacting never defragments. On the other hand, the copying collector has a higher theoretical baseline, but we can often get very close to it because the act of copying leads to lower fragmentation.
Conclusion
In this post I explained the new GHC option Fd and how it can be configured to return additional memory to the OS during idle periods. Under the default settings this should result in a lower reported memory usage for most longrunning applications in GHC 9.2.
A big thank you to Hasura, an opensource GraphQL engine built in Haskell, for partnering with us and making the work presented here possible.
For the three of us, the launch of the Haskell Foundation was one of
2020’s few bright spots.
In November 2020, Simon Peyton Jones announced the
formation of the Haskell Foundation. Conceived during the first
lockdowns of early 2020, the Haskell Foundation is a nonprofit
organization aimed at promoting adoption of the Haskell programming
language and bolstering its core infrastructure. The Foundation has
now raised approximately $500K in cash and financial commitments from
sponsors, recruited a volunteer board of 14 directors and a full-time
staff of two: Andrew Boardman, Executive Director (ED); and Emily
Pillmore, Chief Technology Officer (CTO). Today, the Foundation has
its own point of view, plans, sponsors, and community of volunteers.
It has reached escape velocity. Our personal opinions now or back then
don’t matter much, but we thought it might be nice to share the story
of what sparked its inception.
In the end, the Haskell Foundation was launched with input from dozens
of friends of Haskell. What we thought might be like rolling a boulder
uphill turned out to be more like kicking a stone and starting
a rockslide. It seems the Haskell Foundation is an idea whose time had come.
Inception
In early 2020, Simon Peyton Jones got a call from Frank Rodorigo,
a US-based entrepreneur, who was in the process of reviving Skills
Matter, a community of tech creators, users and adopters. Skills
Matter had run into financial difficulties in 2019, and Frank,
together with his CTO Scott Conley, wanted to make sure Haskell was at
the center of their reboot plans. He hoped to explore some ideas with
Simon about the best ways to help Haskell. One thing led to the next:
under the impetus of the rebooted Skills Matter, we brainstormed about
what extra glue the community might need to bolster the lofty goals
that so many seemed to have.
Over the years, we have been encouraged by the inspiration Haskell has
given to other languages like Rust, Apple’s Swift and others. We
even saw Java programmers singing the praises of lazy streams and
anonymous functions. It had always seemed like just a matter of time
before more programmers start wanting to use “The Real Thing”. While
we saw some progress in the Haskell ecosystem, with the release of
GHC 8.0, and efforts to eliminate “dependency hell” on the part of the
Stack and Cabal projects, it didn’t feel like enough.
Enough for what? If you wanted to use Haskell in production at
a company, you still had to be brave and determined. The evidence was
anecdotal but hard to ignore. It ranged from conference attendees
talking about how complicated Haskell has become, to concerns such
as getting Haddock working with GHC’s new type system features, to
the trouble enabling Stack to keep working with Hackage. Pull
requests to some of the core libraries were languishing and many felt
there were problems in those libraries simply going unaddressed.
Worst of all, some of us were aware of companies that had adopted and
then later abandoned Haskell. Those would-be Haskellers faced
a confusing collection of projects and committees, none of whom
themselves felt they had a broad mandate to advance Haskell.
To be sure, there were some people thinking about remedies with
proposals like Simple Haskell and
Boring
Haskell.
But we thought more was needed. We started with some goals:
An easier on-ramp. Starting out with Haskell is harder than it should be, with
a wealth of ways of setting up a Haskell dev machine (some of them
outofdate!). Even after this first step, newcomers often land on
unmaintained wiki pages or other seldom helpful destinations.
More inclusivity. To help the community grow, we wanted the
occasional alienating post or dismissive comment to be reliably
addressed.
More progress, faster. We wanted to encourage more innovation
across the many different projects that comprise Haskell. We also
wanted to explore ways to help the committees and volunteers that
make up the Haskell community to channel resources and volunteers to
where they are needed most, and to ensure that each tool works well
with others, forming a cohesive whole.
Funding. We knew firsthand that many companies were already
investing in Haskell to ease their own pain points, but their
efforts weren’t very connected. We wanted to make all of the above
more feasible with the help of sponsors.
Gaining momentum
Having envisioned the outlines of the Haskell Foundation, what next? We wrote those ideas
down in a live shared document we called “the whitepaper” and started
gathering feedback from an ever widening group. We really wanted to
iron out the wrinkles to avoid announcing something that could fall
flat on its face.
As Simon Peyton Jones is the central architect and developer behind the
Haskell ecosystem, the rest of us thought that his visible involvement and
leadership would be essential. Happily, Simon agreed to stay
involved and eventually spent way more time than he had planned.
We still had some worries. Would others consider affiliating?
Were we stepping on anyone’s toes? We decided to take a slow and
deliberate concentric rings approach to next reach out to core
committees, then companies and large projects, then influential
community members, to socialize the idea further.
We started with the Haskell.Org Committee. Expecting pushback, we
were blown away by how eager the chairperson (Jasper van der Jeugt)
was, and then in turn the rest of the haskell.org members: Ryan
Trinkle, Emily Pillmore, Tikhon Jelvis, Rebecca Skinner, and Alex
Garcia de Oliveira. They all became early and essential participants
in the launch of the Foundation. It seems that the idea resonated
strongly with some of their own ongoing discussions.
We were still a small team at this point — fewer than ten people,
with Tim Sears as our chief day-to-day organizer. We kept adding
anyone who wanted to help to a biweekly video call and kept iterating
the whitepaper. After a short time it looked nothing like the first
draft.
Feeling a bit braver, we then decided to also reach out to companies
and key stakeholders in the community. We met skeptics along the way,
like Michael Snoyman (now on the Foundation Board),
both on the vision and the feasibility. These skeptics’ input turned
out to be extremely helpful. Among other things, we used their
feedback to sharpen the Foundation’s commitment to transparency. Ryan Trinkle
(Obsidian Systems) soon started playing the role of shadow CFO. Duncan
Coutts and Ben Gamari (WellTyped) provided valuable input, as did
Neil Mitchell, John Wiegley, Ed Kmett, Simon Marlow and others too
numerous to mention.
It was important to connect with the Core Libraries Committee, the
Hackage Trustees and the GHC Steering Committee. Emissaries were
dispatched. Those groups ranged from amenable to enthusiastic, but
they also asked some thorny questions. Did the community really need
another committee? How would the Foundation differentiate itself?
Could we actually raise money? In private, Simon Peyton Jones asked Tim why he wrote
down a 7figure sponsorship goal in an early draft of the whitepaper.
Was it realistic? Tim had to admit he wasn’t sure, but without funding
the Foundation would never have big impact. Only one way to find
out…
At some point we started calling our informal cabal the HF Working
Group. Eventually the invite list numbered in the dozens, with about
12 or so turning up regularly to our video chats. Scheduling was
rarely a problem, since nobody was traveling: the silver lining of
a terrible pandemic.
The Foundation escapes quarantine
In August, the Working Group asked itself: “Haven’t we socialized this
idea enough yet? Can’t we freeze and ratify the whitepaper? Why can’t
we just launch?". Just like that we turned a corner. We started
a semi-public outreach effort on the following basis:
The Foundation would be nonprofit. Our goal would never be to create
a consultancy or training company. Our goal would be to promote
Haskell and related technologies. We would not be selling any
services.
The Foundation would be inclusive. It would seek input from a variety of
sources and be communitydriven. A goal of the Foundation would be to look
like the community we want to become.
The Foundation would be funded. We would start with ambitious fundraising
goals. Specifically, we wanted to aim for a yearly budget of over
$1,000,000. Donations would come from industry and from the general
Haskelling public.
The Foundation would have an executive director. Acknowledging that we are all
busy people, wellmeaning volunteers simply do not have the
bandwidth to offer sustained attention to where it is needed.
Instead, the Foundation would have at least one full-time employee, whose
day job it is to manage the Haskell community and promote its
interests.
To our eyes, now informed by the community, this seemed like a winning
formula — a design that would be able to fix Haskell’s problems and
promote the language, while strengthening our community.
The Foundation quickly became the world’s worst-kept secret as the Working Group
set a public launch date for November.
Announcement and bootstrapping
Right away there was a new chickenandegg problem: we wanted the
Board of the Foundation to be drawn from the wider community, and yet we
needed to have someone in charge so we could launch quickly. The HF
Working Group landed on a two-step process. The Foundation would start
with an interim Board and launch. The Board would then both replace
itself and hire an ED to run the organization. The Working Group
identified eight prominent Haskellers pre-launch and invited them to
serve as an interim board. Thanks to Simon Peyton Jones’s
persuasive powers, and a promise that their main duty would be
limited to the above, they were quickly recruited.
Tim Sears, Emily Pillmore, Ryan Trinkle, and Alex Garcia de Oliveira
led the fundraising efforts, starting seriously in August with
a launch date slated for November. Even before launch, the Foundation quickly
landed over $200,000 in commitments. Soon we knew that we would have
enough to fund an ED who could spend enough time to
foster the sponsorship efforts. Ryan started the paperwork to
incorporate the Haskell Foundation as a nonprofit, while Emily took over the de facto
running of the Working Group. Jasper led an Affiliation Track to
coordinate transparency policies for the community.
Richard Eisenberg worked with other volunteers (Ben Gamari of
WellTyped, Davean Scies of xkcd, Emily, Moritz Angermann of IOHK,
Tikhon Jelvis from the Haskell.Org committee, and Tim Sears) to
develop an initial technical agenda, a list of the kinds of projects
the Foundation would seek to accelerate. This was to be used in our
fundraising pitches and in forming a starting place for the real work
to come: we knew that, once the Foundation was made public and we had
selected a board, the agenda could be revised in the light of the
freedom of being able to consult the public.
Rebecca Skinner from the Haskell.Org committee (helped by Tim and
Davean) volunteered to lead up the effort to create a website, in
advance of the upcoming launch, which we decided to incorporate into
the Haskell eXchange conference, hosted by Tweag partner Skills
Matter. Cardano/IOHK provided much of the on-the-ground labor in
putting the initial website draft together, handing it over to Rebecca
to push it over the line for the launch.
Finally, on 4 November, 2020 Simon Peyton Jones publicly announced the
Haskell Foundation. You can watch the video of his
announcement. Despite the feverish pace of work leading
up to that announcement, everyone knew that the real work was only
beginning, but we had a launch!
The story we wanted to share now comes to an end.
Epilogue
In December and January the interim Board recruited an outstanding
slate of 14
members
from a large group of applicants. As a special bonus they decided that
the initial funding was sufficient to hire not only an ED, but also
a CTO! Andrew Boardman joined as ED and Emily Pillmore stayed on in
a permanent role as CTO. In its first meeting, the Board elected
Richard Eisenberg as its chair.
The work of the Haskell Foundation proper is finally underway. This includes developing
a process for community input, identifying projects to fund and
otherwise support, and continuing outreach efforts. We expect that the
creation of the Foundation will mark an inflection point in the history of
Haskell. It is extremely gratifying to have played a role in helping
a community that has given so much to us personally. We’re eager to see
where it will go from here.
When working in a sufficiently large Haskell project,
one eventually faces the question of whether to
implement missing libraries in Haskell or borrow the implementation from
another language, such as Java.
For a long time now, it has been possible to combine Haskell and
Java programs using inline-java.
Calling Java functions from Haskell is easy enough.
import Language.Java (J, JType(Class))       -- from jvm
import Language.Java.Inline (imports, java)  -- from inline-java

imports "org.apache.commons.collections4.map.*"

createOrderedMap :: IO (J ('Class "org.apache.commons.collections4.OrderedMap"))
createOrderedMap = [java| new LinkedMap() |]
The ghc and javac compilers can cooperate to build this program provided
that both know where to find the program’s dependencies. Back in the day, when
inline-java was being born, there was no build tool capable of
pulling the dependencies of both Haskell and Java, or at least not
without additional customization. Since then, however, Tweag has invested effort into
enabling Bazel, a polyglot build system, to build Haskell
programs. In this blog post we go over the problems of integrating
build tools designed for different languages, and how they can be
addressed with Bazel, as an example of a single tool that builds them
all. More specifically, this post also serves as a tutorial for using
inline-java with Bazel, which is a requirement as of the latest inline-java release.
Dependencies done the hard way
Suppose we rely on cabal-install or stack to install
the Haskell dependencies. This would make the inline-java
and jvm packages available. But these tools are specialized to build Haskell
packages. If our program also depends on the commons-collections4
Java package, we can’t rely on Cabal to build it. We need help from some other Java-specific package manager.
We could rely on maven or gradle to install
commons-collections4. At that point we can build our project by
invoking ghc and telling javac where to find the Java dependencies
via environment variables (e.g. CLASSPATH).
With some extra work, we could even coordinate the build systems so
one calls to the other to collect all the necessary dependencies and
invoke the compilers. This is, in fact, what inline-java did until
recently.
But there is a severe limitation to this approach: no build system can
track changes to files in the jurisdiction of the other build system.
The Cabal-based build system is not going to notice if we change the
gradle configuration to build a different version of
commons-collections4, or if we change a source file on the Java side.
Easy enough, we could run gradle every time we want to rebuild our
program, just in case something changed on the Java side. But then,
should the Haskell build system rebuild the Haskell side? Or should it
reuse the artifacts produced in an earlier build?
We could continue to extend the integration between build systems
so one can detect if the artifacts produced by the other have changed.
Unfortunately, this is a non-trivial task and leads to reimplementing
features that build systems already implement for their respective
languages. We would be responsible for detecting changes on every
dependency crossing the language boundary.
If that didn’t sound bad enough, incremental builds are not the only
concern requiring coordination. Running tests, building deployable
artifacts, and supporting remote builds and caches also involve both build systems.
Dependencies with a polyglot build system
A straightforward answer is
to use only one build system to build all languages in the project.
We chose to turn to Bazel for our building needs.
Bazel lets us express dependencies between artifacts
written in various languages. In this respect, it is similar
to make. However, Bazel comes equipped with sets of rules, such as rules_haskell,
for many languages, which know how to invoke compilers and
linkers; these rules are distributed as libraries and are reusable across projects. With make,
the user must manually encode all of this knowledge in a Makefile
herself. It’s not the subject of this blog post, but Bazel comes with
a number of other perks, such as hermeticity of builds for
reproducibility, distributed
builds, and remote caching.
We offer a working example in the inline-java
repository.
In order to specify how to build our Haskell program, we start by importing
a rule to build Haskell binaries from rules_haskell.
The load instruction is all we need to do to invoke the
rules_haskell library and access its various rules. Here we use the
haskell_binary rule to build our hybrid program.
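To make this concrete, the BUILD file might look roughly as follows. This is a sketch: the load path, the target name, and the exact dependency labels are assumptions based on the dependencies described below, and the real example in the repository may differ in detail.

```python
# BUILD.bazel -- a sketch; labels and the load path are illustrative
load("@rules_haskell//haskell:defs.bzl", "haskell_binary")

haskell_binary(
    name = "example",
    srcs = ["Main.hs"],
    deps = [
        "//jvm",
        "//jni",
        "//:inline-java",
        "@rules_haskell//tools/runfiles",
        "@stackage//:base",
        "@stackage//:text",
        "@maven//:org_apache_commons_commons_collections4_4_1",
    ],
    # runtime dependency: the jar with the Java classes
    data = [":jar_deploy.jar"],
)
```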
Besides the fact that Main.hs is written in both Haskell and Java,
one can tell the hybrid nature of the artifact by observing the
dependencies in the deps attribute, which refers to both Haskell
and Java libraries.
//jvm, //jni and //:inline-java refer to Haskell libraries
implemented in the current repository
@rules_haskell//tools/runfiles refers to a special
Haskell library defined as part of the rules_haskell rule
set. More on this below.
@stackage//:base and @stackage//:text refer to Haskell packages
coming from stackage
@maven//:org_apache_commons_commons_collections4_4_1 is a Java
library coming from a maven repository
While the dependencies in deps are made available at build time,
the attribute data = [":jar_deploy.jar"] declares a runtime
dependency. Specifically, it makes the Java artifacts available to the
Haskell program.
The ":jar_deploy.jar" target is a jar file produced with the following
rule from the Java rule set.
A future version of rules_haskell may generate this runtime
dependency automatically, but for the time being we need to add it manually.
java_binary(
    name = "jar",
    main_class = "dummy",
    runtime_deps = java_deps,
)
Now, when there are changes in the Java dependencies, Bazel will know
to rebuild the Haskell artifacts, and only if there were changes.
Bazel makes sure that jar_deploy.jar is built, and stores it in an
appropriate location. But nothing, so far, tells the Haskell
program where to find this file. This is where the runfiles library
comes into play. Bazel lays out runtime dependencies, such as
jar_deploy.jar, following known rules; the runfiles library
abstracts over these rules. To complete our example, we need the main
function to call to the runfiles library and discover the jar
file’s location.
import qualified Bazel.Runfiles as Runfiles
import Data.String (fromString)
import Language.Java (J, JType(Class), withJVM)

...

main = do
  r <- Runfiles.create
  let jarPath = Runfiles.rlocation r "my_workspace/example/jar_deploy.jar"
  withJVM ["-Djava.class.path=" <> fromString jarPath] $
    void createOrderedMap
Closing thoughts
Mixing languages is challenging both from the perspective of the
compiler and of the build
system. In this blog post we provided an overview of the
challenges of integrating build systems, and how a unifying build
system can offer a more practical framework for reusing
language-specific knowledge.
Bazel is a materialization of this unifying build system, where rule
sets are the units of reuse. We recently moved inline-java builds
to rely on Bazel, by depending strongly on the rule sets for
Haskell and Java. This implies that end users will also have to use Bazel.
Though this means departing from stack and cabal-install,
the build tools that Haskellers are used to, we hope that it will
offer a better path for adopters to build their multi-language projects.
And since rules_haskell can still build Cabal packages and Stackage snapshots via Cabal-the-library, going
with Bazel doesn’t forgo the community effort invested in curating
the many packages in the Haskell ecosystem.
This has been my honours project for the past two years, and this year it's also been updated with contributions from Fraser Scott and Georgina Medd. (cheers!)
There's a [Help] menu with information about the various available options: hopefully you'll learn something new!
If you could take a few minutes to look around, we'd really appreciate it!
Feel free to leave feedback either through the button on the website ([Settings] > [Info] > [Feedback]) or directly: https://forms.office.com/r/uRQwkQvVpL
(This study was certified according to the Informatics Research Ethics Process, RT number 2019/22202)
Thank you!
Joao
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
My friend has a son who's graduating high school soon. He's been learning some programming and is considering it for his career. He asked me a question I hear often: what should I learn next?
When I was first learning to code, I always assumed the answer to "what should I learn next" would be a new programming technique, a new language, a new library, or something along those lines. As I've progressed through my career, and especially as I've been on the other side of the interview desk, I've changed my tune significantly. My standard response to new programmers is that, in addition to honing their coding skills and learning new languages, they should cross train in related fields (which I'll explain below).
But even that isn't a complete answer. In this blog post, I want to collect together a list of skills I would recommend programmers conquer. My goal isn't to provide the actual knowledge here, or even necessarily link to good tutorials. Instead, I want to help make newcomers aware of the vast array of tools, techniques, and skills that can help them in writing software.
And I will get back to actual coding at the end.
The command line
I guess I now count as "old" by some standards. Back in my day, using a computer meant sitting at a DOS prompt and typing in commands. Things have changed a lot since I was a kid. And I was surprised to find out just how many developers these days are able to code without ever hitting a shell.
Learning how to use a command line is vital. Many tools only expose a command line interface. You can be faster in some cases on the command line. I'm not telling you that you must live in the shell, but you should be able to do basic things like:
Directory traversal
File management (copy/move/delete)
Compile and/or run your source code
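For a concrete flavour, here is what those basics look like in a POSIX shell (the file and directory names are just for illustration):

```shell
# Directory traversal
mkdir -p demo/src      # make a small directory tree
cd demo/src
pwd                    # show where we are
cd ../..               # go back up two levels

# File management (copy/move/delete)
printf 'hello\n' > demo/note.txt
cp demo/note.txt demo/copy.txt       # copy
mv demo/copy.txt demo/src/copy.txt   # move
rm demo/note.txt                     # delete

ls demo demo/src       # list what is left
```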
If you're on Windows, I'd recommend getting familiar with Linux, possibly using WSL2 instead of a full virtual machine. If you're on Mac, you can use the shell built into the OS, but you may get a better educational experience installing a Linux VM.
Version control
You need to learn how to keep track of your source code using version control. Version control software lets you keep track of the change history of your project, which can be vital for debugging. It also makes it much easier for a team to collaborate on a codebase.
GitHub and GitLab are two popular sites for hosting open source projects, and they use Git (kind of implied by the names I guess). Git is the most popular option these days, so I would recommend getting comfortable with using Git on the command line with either GitHub or GitLab as your repository.
There are lots of great tutorials out there.
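As a taste of the workflow, a first Git session might look like this (the project name, identity, and commit message are made up):

```shell
git init my-project                  # create a new repository
cd my-project
git config user.name "Ada Lovelace"  # identity for this repo only
git config user.email "ada@example.com"
echo '# My Project' > README.md
git add README.md                    # stage the change
git commit -m "Initial commit"       # record it in history
git log --oneline                    # view the history
cd ..
```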
Networking
Most software today needs at least some level of network interaction. Understanding how networks work is important. At the very least, you should understand the basics, like IP addresses, ports, and protocols. But learning about firewalls, load balancing, proxies, and more, will all pay off. And not just in your programming career. It's great to be able to debug "why isn't the WiFi working?"
Ultimately, I would recommend learning the basics of how clouds like AWS and Azure work. Trying to set up an AWS Virtual Private Cloud with subnets, ingress rules, gateways, and more, can all be a really educational experience that nails down details of how networks operate that may otherwise take you months or years to glean.
Testing
Part and parcel of writing good software is learning how to test software. Many of us learn how to program and then "test" our code by running it. Unit and integration testing are skills you should have and use on virtually every piece of software you write. While it slows you down initially and feels tedious, a good test suite will very quickly pay off and let you work faster and with more certainty.
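To make this concrete, here is the smallest possible flavour of unit testing, sketched in plain Haskell with no framework; real projects would reach for a library such as Hspec or tasty, and the function and test names here are made up:

```haskell
-- A function under test and a few hand-rolled unit tests.
add :: Int -> Int -> Int
add x y = x + y

tests :: [(String, Bool)]
tests =
  [ ("adds positives",   add 2 3 == 5)
  , ("identity on zero", add 0 7 == 7)
  , ("commutes",         add 4 9 == add 9 4)
  ]

main :: IO ()
main = mapM_ report tests
  where report (name, ok) = putStrLn (name ++ ": " ++ if ok then "PASS" else "FAIL")
```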
Continuous Integration
Continuous Integration, or CI, combines testing and version control. The idea is that, every time you create a new iteration of your code, you have a set of tests that are run for you automatically. In the past 10 years, the tooling around CI has improved drastically. Providers like GitHub and GitLab have builtin CI solutions (GitHub Actions and GitLab CI, respectively), and they're easy to get started with.
As you get more advanced, CI can be used to run code analysis tools, produce artifacts like compiled executables, and even deploy or release new versions of applications (often termed Continuous Deployment, or CD).
Like testing, getting started with CI is a bit slow. But once you learn the basics, it's trivial to add to a project, and helps you discover issues much more quickly.
I'd recommend looking for a detailed tutorial with full examples for the programming language you're using.
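As an illustration, a minimal GitHub Actions workflow is a single YAML file checked into the repository; the job name and the test command below are placeholders to adapt to your project:

```yaml
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]        # run on every push and pull request
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # fetch the repository
      - name: Run tests
        run: make test              # replace with your test command
```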
Writing
This is potentially the area I most underappreciated when getting started as a programmer. I'm not exaggerating with this statement: I believe the strongest skill a programmer can add to their arsenal is being good at writing. Good writing means that you can clearly explain an idea, in correct language, in the fewest words necessary. When you're learning to program, you're typically working on your own, so writing isn't essential. But when you start working on teams, you'll need to write:
Documentation
Bug reports
Feature requests
Customer proposals
Requirements documents
Emails (so many emails!)
Text messaging
Potentially: blog posts
Get good at writing. Practice. Take writing courses. It will pay dividends for the rest of your life.
As a corollary: get good at reading too. When I first started professional development, I was daunted by requirements documents. At this point, I realize that spending a few hours carefully reading through such information will save months of wasted effort building the wrong thing.
Various programming languages
It's not enough these days to know just a single programming language. Virtually every programmer needs to know at least a few. Beyond the specific languages you need for a job, it's important to explore other languages for the techniques they offer. I recommend learning a few different categories of language. In descending order of priority from me:
Functional programming: Most schools are still not teaching functional programming (FP). FP is a powerful approach that makes many kinds of code easier to write. I'm biased, but I would recommend Haskell as the best FP language to learn, since it forces you to understand FP better than many other languages. It's also valuable to learn a language in the LISP family. Learning FP will help you write better code in almost any language.
Systems programming: Systems languages are lower level and allow more control of how the program operates. Learning them teaches you more about how the program runs on the system, and can be helpful in understanding and debugging problems in other languages. My biggest recommendation is to learn Rust, though C and C++ are other languages in this family.
Object oriented: Java and C# are the two most popular object oriented (OO) languages in this family. (Yes, Python and C++ are popular too, but I'm categorizing them separately.) OO introduced many new paradigms and is still likely the most popular programming approach today, though I would personally favor FP approaches most of the time. Still, there's a lot to learn from OO, and odds are you'll be writing some OO code in your career.
Scripting: Python and Ruby are two popular scripting languages in the object oriented family. Python in particular has a lot of usage in related fields like DevOps and data science. It's also a simple language, so pretty easy to get started with.
Programming techniques
Regardless of what programming language you use, it's worth becoming familiar with some additional techniques that transcend specific language, including:
Database programming. I'd strongly recommend learning SQL. SQLite and PostgreSQL are two great open source databases to learn with.
Concurrency and asynchronous programming. This is becoming ever more vital today.
Network programming, especially how to make HTTP servers and clients.
Web frontend coding with HTML/CSS/JavaScript.
Data serialization, with formats like JSON, YAML, and binary files.
Conclusion
The information above may seem daunting. "I need to learn all of that to be a programmer?" No, you don't. But you need to learn a lot of it to become a great programmer. Learning all of it takes time, and goes hand-in-hand with starting your career. If you're not yet at the point of programming professionally, I'd recommend getting started with some hobby projects. Consider contributing to open source projects. Many of them will indicate whether they are open to newbie contributions. And many programmers love to share knowledge with new programmers.
Did I miss anything in this list? Do you have any followup questions? Ping me in the comments below or on Twitter.
Marco Patrignani explains why and how to use colour in your technical papers. The guidelines for making highlighting useful even for colour-blind folk are particularly helpful.
Penrose Kite and Dart Tilings with Haskell Diagrams
Infinite non-periodic tessellations of Roger Penrose’s kite and dart tiles.
As part of a collaboration with Stephen Huggett, working on some mathematical properties of Penrose tilings, I recognised the need for quick renderings of tilings. I thought Haskell diagrams would be helpful here, and that turned out to be an excellent choice. Two-dimensional vectors were well-suited to describing tiling operations and these are included as part of the diagrams package.
This literate Haskell uses the Haskell diagrams package to draw tilings with kites and darts. It also implements the main operations of inflate and decompose which are essential for constructing tilings (explained below).
Firstly, these language extensions are needed in Haskell to use the diagrams package:

{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts          #-}
{-# LANGUAGE TypeFamilies              #-}
The red line marking here on the right-hand copies is purely to illustrate rules about how tiles can be put together for legal (non-periodic) tilings. Obviously edges can only be put together when they have the same length. If all the tiles are marked with red lines as illustrated on the right, then at each vertex where tiles meet, either all the tiles must have a red line at that vertex or none must. This prevents us from forming a simple rhombus by placing a kite top at the base of a dart and thus enabling periodic tilings.
All edges are powers of the golden section which we write as phi.
phi :: Double
phi = (1.0 + sqrt 5.0) / 2.0
So if the shorter edges are unit length, then the longer edges have length phi. We also have the interesting property of the golden section that phi^2 = phi + 1, and so phi = 1 + 1/phi, and 1/phi = phi - 1.
All angles in the figures are multiples of tt, which is 36 degrees or 1/10 turn. We use ttangle to express such angles (e.g. 180 degrees is ttangle 5).
ttangle :: Int -> Angle Double
ttangle n = fromIntegral (n `mod` 10) *^ tt
  where tt = 1/10 @@ turn
Components
In order to implement inflate and decompose, we need to work with half tiles. These can be represented with a simple 2D vector to provide orientation and scale. We will call these components : Left Dart, Right Dart, Left Kite, Right Kite and use the type
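The type might be declared along these lines. This is a sketch: V2 is the two-dimensional vector type from the diagrams/linear ecosystem, and the library version may differ in detail.

```haskell
-- A half-tile component: the vector is the join edge of the half tile.
data Component = LD (V2 Double)  -- Left Dart
               | RD (V2 Double)  -- Right Dart
               | LK (V2 Double)  -- Left Kite
               | RK (V2 Double)  -- Right Kite
```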
The vector represents the join edge of each half tile where halves come together. The origin for a dart is the tip, and the origin for a kite is the acute angle tip (marked in the figure with a red dot).
These are the only 4 components we use (oriented along the x axis)
Perhaps confusingly we regard left and right of a dart differently from left and right of a kite when viewed from the origin. The diagram shows the left dart before the right dart and the left kite before the right kite. Thus in a complete tile, going clockwise round the origin the right dart comes before left dart, but the left kite comes before right kite.
When it comes to drawing components, for the simplest case, we just want to show the two tile edges of each component (and not the join edge). These edges are calculated as a list of 2 new vectors, using the join edge vector v. They are ordered clockwise from the origin of each component
It is also useful to calculate a list of the 4 tile edges of a completed halftile component clockwise from the origin of the tile. (This is useful for colour filling a tile)
To fill whole tiles with colours, darts with dcol and kites with kcol we use fillDK. This uses only the left components to identify the whole tile and ignores right components so that a tile is not filled twice.
fillDK :: Colour Double -> Colour Double -> Component -> Diagram B
fillDK dcol kcol c = case c of
  (LD _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c) # fc dcol
  (LK _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c) # fc kcol
  _      -> mempty
To show half tiles separately, we will use drawJComp which adds the join edge of each half tile to make loops with 3 edges.
As an aid to creating patches with 5-fold rotational symmetry, we combine 5 copies of a basic patch (rotated by multiples of ttangle 2 successively).
penta :: Patch -> Patch
penta p = concatMap copy [0..4]
  where copy n = rotatePatch (ttangle (2*n)) p
This must be used with care to avoid nonsense patches. But two special cases are
sun = penta [(zero, rkite),(zero, lkite)]
star = penta [(zero, rdart),(zero, ldart)]
This figure shows some example patches, drawn with drawPatch. The first is a star and the second is a sun.
The tools so far for creating patches may seem limited (and do not help with ensuring legal tilings), but there is an even bigger problem.
Correct Tilings
Unfortunately, correct tilings – that is, tilings which can be extended to infinity – are not as simple as just legal tilings. It is not enough to have a legal tiling, because an apparent (legal) choice of placing one tile can have nonlocal consequences, causing a conflict with a choice made far away in a patch of tiles, resulting in a patch which cannot be extended. This suggests that constructing correct patches is far from trivial.
The infinite number of possible infinite tilings do have some remarkable properties. Any finite patch from one of them will occur in all the others (infinitely many times), and within a relatively small radius of any point in an infinite tiling. (For details of this see the links at the end.)
This is why we need a different approach to constructing larger patches. There are two significant processes used for creating patches, namely inflate (also called compose) and decompose.
To understand these processes, take a look at the following figure.
Here the small components have been drawn in an unusual way. The edges have been drawn with dashed lines, but long edges of kites have been emphasised with a solid line and the join edges of darts marked with a red line. From this you may be able to make out a patch of larger scale kites and darts. This is an inflated patch arising from the smaller scale patch. Conversely, the larger kites and darts decompose to the smaller scale ones.
Decomposition
Since the rule for decomposition is uniquely determined, we can express it as a simple function on patches.
where the function decompC acts on components and produces a list of the smaller components contained in the component. For example, a larger right dart will produce both a smaller right dart and a smaller left kite. Decomposing a component with offset v also takes care of the positioning, scale and rotation of the new components.
This is illustrated in the following figure for the cases of a right dart and a right kite.
The symmetric diagrams for left components are easy to work out from these, so they are not illustrated.
With the decompose operation we can start with a simple correct patch, and decompose repeatedly to get more and more detailed patches. (Each decomposition scales the tiles down by a factor of 1/phi, but we can rescale at any time.)
This figure illustrates how each component decomposes with 4 decomposition steps below each one.
experiment :: Component -> Diagram B
experiment c = emph c <> (drawComp c # dashingN [0.002,0.002] 0 # lw ultraThin)
  where emph c = case c of
          (LD v) -> (strokeLine . fromOffsets) [v] # lc red
          -- emphasise join edge of darts in red
          (RD v) -> (strokeLine . fromOffsets) [v] # lc red
          -- emphasise long edge for kites
          (LK v) -> (strokeLine . fromOffsets) [rotate (ttangle 1) v]
          (RK v) -> (strokeLine . fromOffsets) [rotate (ttangle 9) v]
Inflation
You might expect inflation (also called composition) to be a kind of inverse to decomposition, but it is a bit more complicated than that. With our current representation of components, we can only inflate single components. This amounts to embedding the component into a larger component that matches how the larger component decomposes. There is thus a choice at each inflation step as to which of several possibilities we select as the larger halftile. We represent this choice as a list of alternatives. This list should not be confused with a patch. It only makes sense to select one of the alternatives giving a new single component.
The earlier diagram illustrating how decompositions are calculated also shows the two choices for inflating a right dart into either a right kite or a larger right dart. There will be two symmetric choices for a left dart, and three choices for left and right kites.
Once again we work with offsets so we can calculate the offset for any resulting component (to ensure the larger component contains the original in its original position in a decomposition).
As the result is a list of alternatives, we need to select one to do further inflations. We can express all the alternatives after n steps as inflations n where
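Assuming a one-step function inflate :: Component -> [Component] that returns the list of alternatives (the name and signature here are assumptions, and the handling of offsets is elided), inflations can be sketched as:

```haskell
-- All alternative components reachable after n inflation steps.
inflations :: Int -> Component -> [Component]
inflations 0 c = [c]
inflations n c = concatMap (inflations (n-1)) (inflate c)
```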
(That last line could be written more succinctly with do notation)
This figure illustrates 5 consecutive choices for inflating a left dart to produce a left kite. On the left, the finishing component is shown with the starting component embedded, and on the right the 5-fold decomposition of the result is shown.
Finally, at the end of this literate Haskell program we choose which figure to draw as output.
fig :: Diagram B
fig = filledSun6

main = mainWith fig
That’s it. But what about inflating whole patches?, I hear you ask. Unfortunately we need to answer questions like which components are adjacent to a component in a patch, and whether there is a corresponding other half for a component. These cannot be answered with our simple vector representations. We would need some form of planar graph representation, which is much more involved. That is another story.
Many thanks to Stephen Huggett for his inspirations concerning the tilings. A library version of the above code is available on GitHub
There is also a very interesting article by Roger Penrose himself: Penrose R Tilings and quasicrystals; a nonlocal growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989.
More information about the diagrams package can be found from the home page Haskell diagrams
There are many blog posts and descriptions of blockchain technology on the web, but I feel that very few of them get to the heart of the matter: what is the fundamental problem that blockchains solve? In this blog post I want to take a look at that question. It is intended for a general audience and I will assume no technical knowledge. That does not mean, however, that we will need to dumb things down: we will see what the fundamental problem solved by blockchain is, and we will even see two alternative ways in which this problem is solved, known as proof-of-work (used, for example, by Bitcoin) and proof-of-stake (used, for example, by the Cardano blockchain).
Full disclosure: I was one of the lead engineers on the Cardano blockchain, where I worked on the wallet specification, researched coin selection, designed the architecture of the consensus layer and wrote the hard fork combinator. This blog post is however just my personal perspective.
Motivation
Suppose a group of people want to share information with each other. The nature of that information does not really matter; for the sake of having an example, let’s say it’s a group of companies who want to share supply availability, purchase orders, etc. How might we go about setting up a system to make this possible?
Bulletin board
As a starting point, perhaps we could just set up a server where everybody can post messages and everybody can see everybody else’s messages. For reasons that will become clear later, we will refer to the messages that users post as blocks: users can post arbitrary blocks of data.^{1} Suppose we have some suppliers A, B, C, ..; they might be posting blocks (messages) such as this:
| Number | Block contents |
|--------|----------------|
| 0 | A has 1000 sprockets available |
| 1 | B has 500 springs available |
| 2 | A looking for a supplier of 300 springs |
| 3 | C agrees to buy 100 sprockets from A at €2/sprocket |
(Block numbers are added just for clarity; they are not part of the block itself.)
Systems like this are basically just a kind of bulletin board; they have been around in digital form since before Internet times, and in non-digital form basically forever.
Signatures
Naturally, the problem with this system is that nothing is preventing people from posting messages that aren't true. For the system to be useful as a supply chain system, suppliers must be able to depend on the messages that get posted. An important and obvious first step towards addressing this problem is to require messages to be signed. Digital signatures are a well-understood problem; the exact details of how it's done don't really matter.
With the addition of signatures, the server might now look something like this:
| Block contents | Signed by |
|----------------|-----------|
| A has 1000 sprockets available | A |
| B has 500 springs available | B |
| A looking for a supplier of 300 springs | A |
| C agrees to buy 100 sprockets from A at €2/sprocket | A, C |
Replay attacks
The addition of signatures is a big improvement, but it's not yet sufficient. For example, perhaps A agreed to sell 100 sprockets to C at €2/sprocket only as a once-off; maybe it was some kind of special introductory price. The fact that A agreed to do this once does not necessarily mean that A will agree to do it again, but in our system so far nothing is stopping C from reposting the message:
| Block contents | Signed by |
|----------------|-----------|
| A has 1000 sprockets available | A |
| B has 500 springs available | B |
| A looking for a supplier of 300 springs | A |
| C agrees to buy 100 sprockets from A at €2/sprocket | A, C |
| *other messages* | … |
| C agrees to buy 100 sprockets from A at €2/sprocket | A, C |
After all, the message is signed by both A and C and so it looks legit. This is known as a replay attack, and there are many ways to prevent it. Here we will consider just one: we will introduce an explicit ordering; as we will see, this will be important later as well.
To introduce an ordering, each block will record the hash of its predecessor block. You can think of a hash of a block as a mathematical summary of the block: whenever the block changes, so does the summary.
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 5195 | none | A has 1000 sprockets available | A |
| 5843 | 5195 | B has 500 springs available | B |
| 2874 | 5843 | A looking for a supplier of 300 springs | A |
| 9325 | 2874 | C agrees to buy 100 sprockets from A at €2/sprocket | A, C |
Crucially, the signature also covers the block’s predecessor; this is what is preventing the replay attack: C cannot post the same message again, because if they did, that new block must have a new predecessor, and that would then require a new signature:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 5195 | none | A has 1000 sprockets available | A |
| 5843 | 5195 | B has 500 springs available | B |
| 2874 | 5843 | A looking for a supplier of 300 springs | A |
| 9325 | 2874 | C agrees to buy 100 sprockets from A at €2/sprocket | A, C |
| … 6839 | 9325 … | *other messages* | … |
| 7527 | 6839 | C agrees to buy 100 sprockets from A at €2/sprocket | ??? |
Notice how this is creating a chain of blocks: every block is linked to its predecessor. In other words, a block chain; although what we have described so far isn't really a blockchain in the normal sense of the word just yet.
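The chain-of-hashes construction can be sketched in a few lines of Haskell (a toy illustration: toyHash is a stand-in for a real cryptographic hash such as SHA-256, and the signatures are omitted):

```haskell
import Data.Char (ord)

-- Toy stand-in for a cryptographic hash; NOT secure, illustration only.
toyHash :: String -> Int
toyHash = foldl (\acc c -> (acc * 31 + ord c) `mod` 10000) 7

-- A block records the hash of its predecessor (Nothing for the first block).
data Block = Block
  { predHash :: Maybe Int
  , contents :: String
  } deriving (Show, Eq)

-- A block's own hash covers both its contents and its predecessor link,
-- so any change to an earlier block changes every later hash.
blockHash :: Block -> Int
blockHash (Block p c) = toyHash (show p ++ c)

-- Append a message to the chain, linking it to the current tip.
extend :: [Block] -> String -> [Block]
extend []    c = [Block Nothing c]
extend chain c = chain ++ [Block (Just (blockHash (last chain))) c]
```

With this, `extend (extend [] "A has 1000 sprockets available") "B has 500 springs available"` produces a two-block chain in which the second block's `predHash` is the hash of the first.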
Eliminating the central server
So far all messages have been stored on some kind of central server. One worry we might have here is what happens when that server goes down, suffers a hard disk failure, etc. However, server replication, data backup, etc. are again wellunderstood problems with known solutions; that won’t concern us here.
A more important question is: what happens if we cannot trust the central server? First, notice what the server cannot do: since messages are signed by users, and the server cannot recreate those signatures, it is not possible for the server to forge new messages. The server can, however, delete messages, or simply not show all messages to all users.
Let’s first consider why the server might even want to try. Suppose the server currently has the following blocks:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 2547 | none | A has 1000 sprockets available | A, B, C |
| 8861 | 2547 | A agrees to sell 800 sprockets to B | A |
In this example, A declares to have a sprocket supply, and both B and C have ratified the message. Now when A agrees to sell sprockets to B, B knows that A actually has these sprockets available, and so might transfer the money for the sprockets to A. If A tries to sell the sprockets to both B and C, C will know something is amiss:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 2547 | none | A has 1000 sprockets available | A, B, C |
| 8861 | 2547 | A agrees to sell 800 sprockets to B | A |
| 1010 | 8861 | A agrees to sell 800 sprockets to C | A |
If A only has 1000 sprockets, they can’t be selling 800 sprockets to both B and C. But suppose the server is in cahoots with A, and simply doesn’t show the transaction with B to C; in other words, the chain that the server might present to C could be
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 2547 | none | A has 1000 sprockets available | A, B, C |
| 6079 | 2547 | A agrees to sell 800 sprockets to C | A |
Now C doesn't know that anything is wrong. This way A can basically sell their stock to as many people as they want, collect the money, and disappear. This is known as a double-spending attack, and we cannot fix it if we stick with a central server: after all, the chains that B and C see are both perfectly reasonable and plausible; the attack consists of showing a different chain to different users, even though the chain that each user sees is itself valid. As we will see, this is the problem that the blockchain solves.
Revising history
Before we get to the real blockchain, however, we should discuss one more thing. Consider again the example from the previous section, where the server is untrustworthy and is cooperating with A to dupe both B and C. Unlike in the previous example, however, suppose that A's sale to B was not the last transaction on the chain; perhaps the server looks like this instead:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 2547 | none | A has 1000 sprockets available | A, B, C |
| 8861 | 2547 | A agrees to sell 800 sprockets to B | A |
| 3598 | 8861 | Some other message | D |
If the server now wants to present an alternative chain to C, in which A is selling the sprockets to C instead of to B, it cannot include the message signed by D. The message signed by D records the hash of its predecessor; if the server changes that predecessor (A selling to C instead of A selling to B), it would have a different hash, and would therefore require a new signature:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 2547 | none | A has 1000 sprockets available | A, B, C |
| 6079 | 2547 | A agrees to sell 800 sprockets to C | A |
| 4592 | 6079 | Some other message | ??? |
However, this unfortunately does not help us. First of all, notice that the server could simply truncate the chain and not include the message signed by D when presenting the chain to C. Moreover, even when the chain does include additional messages, messages earlier on the chain are still not trustworthy. The problem is that we don’t know who D is. Perhaps D is the server itself; if so, then the existence of a message signed by D tells us nothing. Indeed, in the final example above, the server could just construct a new signature in the place of the “???”.
Proof of work: solving puzzles
If we cannot trust the central server, perhaps we can simply allow everyone to produce blocks. Everyone keeps a copy of the block chain, and when someone wants to add a message to the chain they just create it themselves and send it to everybody else. If two people happen to create a block at the same time, no problem: we just pick one of the two blocks, and whoever "lost" will just create a new block on top of the block that got adopted, no big deal.
This by itself does not solve anything just yet. A could just take the block that contains the transaction with B and send it to B but not to C, and vice versa. Just like before, even if B and C would wait for additional blocks to be added on top of the block they just received, they have no guarantee that those blocks are not being produced by nodes that are cooperating with A.
To even have a chance of having a way out of this impasse we will need to impose an assumption: although some participants may be untrustworthy, we will assume that the majority is honest. In the previous section we saw that once the chain includes an honest block, untrustworthy participants can no longer revise the blockchain’s history. Therefore, if the blockchain is operating normally, every new block that comes in increases the likelihood that the chain contains an honest block. After a while the probability that the most recent k blocks on the chain (for some number k) does not contain any honest block is so small that any information that is older than k can be considered trustworthy.^{2}
Unfortunately, while the assumption that the majority of the participants is honest may seem reasonable, that does not mean that we can assume that the majority of the blocks we receive are produced by honest participants. After all, nothing is stopping untrustworthy participants from producing lots and lots of blocks. Unless… we introduce something that does stop that.
Puzzles
This was the genius insight behind Bitcoin: we can translate the assumption "the majority of participants is honest" to "the majority of blocks is honest" if blocks are costly to produce. In order to produce a block, the block producer must solve a particular mathematical puzzle. It does not really matter what the puzzle is exactly, except that (1) it takes a computer a long time to solve, and (2) we can verify that the solution is correct very quickly. The answer to this puzzle is called a proof of work (proof that you did the work that makes a block computationally expensive to produce), and must be included in every block:
| Hash | Pred | Proof of work | Block contents | Signed by |
|------|------|---------------|----------------|-----------|
| 2547 | none | … | A has 1000 sprockets available | A, B, C |
| 8861 | 2547 | … | A agrees to sell 800 sprockets to B | A |
| 3598 | 8861 | … | Some other message | D |
Notice how this prevents dishonest nodes from producing lots of different blocks: they would have to solve lots of different puzzles, which would simply be impossible in the time available. We do have to make our "honest majority" assumption a bit more precise for this to be viable: we must assume that the honest participants have more puzzle-solving power than the dishonest participants; roughly speaking, the honest participants together must have more powerful computers than the dishonest participants.
This means that provided the honest participants are producing blocks, the blocks we receive over the network have at most a 50% probability of being produced by dishonest participants, and therefore the probability that no block is produced by an honest participant diminishes quickly; after 100 blocks that probability is roughly 0.000000000000000000000000000001.
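The arithmetic behind that number is just (1/2)^k, which a one-liner makes concrete:

```haskell
-- Probability that none of the last k blocks was produced by an honest
-- participant, assuming each block is dishonest with probability at most 1/2.
pAllDishonest :: Int -> Double
pAllDishonest k = 0.5 ^ k

-- pAllDishonest 100 is about 7.9e-31, i.e. roughly 10^-30.
```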
Chain selection
We mentioned above that it's possible that two participants produce a block at the same time; when this happens, we just pick one arbitrarily and continue; the participant who "lost" just produces a new block on top of the block that got picked. It is possible that we only discover this clash a bit later. For example, suppose B and C produce a block at the same time, and D only sees the block produced by B and adds their block on top of B's block:
| Hash | Pred | Block contents | Signed by |
|------|------|----------------|-----------|
| 5285 | none | … | A |
| 7115 | 5285 | … | A |
| 1413 | 7115 | … | B |
| 5980 | 7115 | … | C |
| 6679 | 1413 | … | D |
This is known as a fork in the blockchain. If all nodes are honest, how we resolve the fork does not actually matter terribly: as long as everybody applies the same rule, we end up with the same chain, and the blocks that “lost” are just reconstructed on the new fork. However, we want to throw away as little work as possible, and so we will opt to select the longest chain. More importantly, if the fork was intentionally produced by a dishonest participant, then picking the longer fork also gives us a higher probability of picking the fork with more blocks produced by honest participants: after all, in order to produce the longer fork, more puzzles had to be solved. We say that the difficulty of producing a chain increases with its length.
Proof of stake
Although the introduction of the puzzles was a genius insight, it does have one very important drawback: it is extremely wasteful. A study in 2014 found that the global production of blocks on the Bitcoin blockchain consumed as much energy as the entire country of Ireland.^{3} In a world where energy consumption is one of the greatest challenges facing mankind, this is obviously not a sustainable solution.
The key insight is that the important feature of proof-of-work is not necessarily that it is costly to produce blocks, but rather that we tie block production to some kind of resource of which participants have a limited amount. In proof-of-work that resource happens to be puzzle-solving (compute) power, but other choices are possible as well. In particular, most blockchains themselves record some kind of resource that is tied to a real-world resource: if the blockchain is supporting a cryptocurrency, then the amount of crypto someone owns can be considered their resource; for our running example, we could consider the amount of supplies that someone offers on the blockchain to be their resource.^{4}
Let's say someone owns 10% of all resources on the chain (i.e., they own 10% of the cryptocurrency, or in our example, 10% of the available supplies are theirs); we then say that they have a 10% stake. In a proof-of-stake blockchain, the probability that someone is allowed to produce a block is proportional to their stake: users with little stake can only produce a few blocks, users with a lot of stake can produce many. Our "honest majority" assumption now becomes: provided that the honest participants have more than 50% stake in the system, the probability that any block is produced by a dishonest participant will be less than 50%, just as for proof-of-work.
Leadership schedule
There are many ways to construct a proofofstake blockchain; what I will describe here is based on the Ouroboros consensus protocol^{5}, the consensus protocol underlying the Cardano blockchain.
Here’s how a proof of stake blockchain works. Time is divided into slots; for simplicity, we can think of slots as just being equal to seconds (indeed, on the Cardano chain one slot takes one second), and slots are grouped into epochs; on the Cardano chain, one epoch contains a little under half a million slots, for a total of 5 days.^{6} At the start of each epoch, we run a lottery for each slot in the next epoch. Most of those slots will be empty^{7}, but for some slots we will assign a slot leader: the participant that is allowed to produce a block for that slot. The details of how this lottery works are not very important for our current purposes, except to say that there are clear rules on how to run the lottery that every participant on the chain will follow and cannot really tamper with.
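A stake-proportional lottery can be sketched as follows (a toy, not the actual Ouroboros leader election: slotValue stands in for the protocol's verifiable randomness, and the stake distribution is hypothetical):

```haskell
-- Toy per-slot "randomness" in [0, 1); Ouroboros derives this from a
-- verifiable random function, not from the slot number alone.
slotValue :: Int -> Double
slotValue slot = fromIntegral ((slot * 2654435761) `mod` 1000003) / 1000003

-- Pick a slot leader by walking the cumulative stake distribution:
-- a participant holding a fraction p of the stake wins a slot with
-- probability roughly p.
slotLeader :: [(String, Double)] -> Int -> String
slotLeader stakes slot = go stakes (slotValue slot)
  where
    go [(who, _)] _ = who          -- last participant absorbs rounding
    go ((who, p) : rest) r
      | r < p     = who
      | otherwise = go rest (r - p)
    go [] _ = error "empty stake distribution"
```

For example, with stakes `[("A", 0.5), ("B", 0.3), ("C", 0.2)]`, A will lead roughly half of the slots over a long run.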
The only way for a dishonest participant to increase how many blocks they can produce is to increase their stake in the system. However, this has real-life consequences: depending on how "stake" is defined exactly, it might mean they have to buy cryptocurrency, make more supply stock available, etc. Provided that the real-world consequences of increasing stake outweigh the potential benefit of attempting an attack on the system, the system is secure.
Put another way, participants with the most stake in the system are also most invested in the system: they need the system to succeed and continue to operate. Therefore, they are more trustworthy than nodes with less stake.
Chain selection
Since block production is no longer costly (the whole point was to avoid wasteful energy consumption, after all), it is not difficult for dishonest participants to construct long chains. Therefore, if we have to choose between two chains, choosing the longer one may not necessarily be the right choice. Instead, we will look at chain density. Suppose we have to choose between two chains.
We will count the number of blocks on each chain in a window of s slots; the exact value of s does not matter so much; let's say s is 100,000 slots.^{8} Recall the shape that the "honest majority" assumption takes in a proof-of-stake blockchain: the honest majority has more than 50% stake in the system. That means that within the window, the chain produced by the honest participants in the system will contain more blocks than the chain produced by dishonest participants. Therefore, whenever there is a fork in the chain, we must pick the chain that is denser (contains more blocks) in a window of slots at the intersection point. This is known as the Ouroboros Genesis rule.^{9}
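The density comparison can be sketched like this (a simplification: chains are represented just as ascending lists of occupied slot numbers, and the intersection slot is given):

```haskell
-- Number of blocks a chain has in the window of s slots just after
-- the intersection slot.
density :: Int -> Int -> [Int] -> Int
density s isect chain =
  length [slot | slot <- chain, slot > isect, slot <= isect + s]

-- Genesis-style selection: prefer the chain that is denser in the window
-- after the intersection point (ties go to the first chain).
selectChain :: Int -> Int -> [Int] -> [Int] -> [Int]
selectChain s isect a b
  | density s isect a >= density s isect b = a
  | otherwise                              = b
```

For example, with a window of 10 slots starting at the intersection slot 0, a chain with blocks in slots [1,2,3,4,5] beats one with blocks only in slots [2,8].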
There is a useful special case of this rule. The mathematical analysis of the Ouroboros protocol^{10} tells us that when all participants in the network are following the protocol, their chains might be a little different near the tip (because not all forks are resolved yet), but the intersection point between all those chains will be no more than k blocks back. Moreover, blocks for "future" slots (slots that we haven't reached yet^{11}) are considered invalid, so none of the chains can extend past "now". In this case, the densest chain in the window will also be the longest; this means that for nodes that are up to date, the longest chain rule (just like in proof-of-work) can be used. Indeed, this is the rule in Ouroboros Praos, albeit with a side condition; for the details, please see my chapter "Genesis" in the Cardano Consensus Layer Report.
Conclusions
Allowing a group of people to share information is not a difficult problem to solve; techniques such as digital signatures, as well as ways to address replay attacks, are well understood. Things get a lot trickier, however, if we drop the assumption that we can trust the server on which the data is stored. Although the server cannot manufacture data, it can selectively omit data. Detecting and guarding against this is very difficult: after all, when the server omits data, it is simply going back to an earlier state of the chain, in which that data was not yet present. Therefore, almost by definition, the version of the chain with some parts omitted must also be valid.
The solution to this is to allow all participants in the network to produce blocks. When we do that, we must have a way of resolving conflicts: when two or more nodes produce a block at the same time, how do we choose? If all participants are honest, we could just choose arbitrarily, but if dishonest participants are present, we have to be careful: if a dishonest participant can produce lots of blocks, then choosing randomly may favour the dishonest participant over honest participants.
This then is the fundamental problem solved by blockchain technology: how do we limit the number of blocks that dishonest participants can generate? One option, known as proof-of-work and first introduced by Bitcoin, is to make block production costly (requiring a lot of computer time). This works, but is extremely wasteful of energy; Bitcoin consumes as much energy as a small country. The better alternative is proof-of-stake: the blockchain itself records some finite resource, and participants are allowed to produce blocks in proportion to the amount of stake they have.
Footnotes
For the purposes of this blog post I will ignore the differences between blocks and transactions. Put another way, I will assume every block contains exactly one transaction.↩︎
This number k is known as the security parameter, and is one of the most important parameters in a blockchain.↩︎
Bitcoin Mining and its Energy Footprint, Karl J. O'Dwyer and David Malone. Available online. See also Bitcoin's energy consumption is underestimated: A market dynamics approach by Alex de Vries.↩︎
This only works if the supplies someone records on the blockchain are validated, or if listing supplies on the blockchain comes with a legal commitment.↩︎
Ouroboros: A Provably Secure Proof-of-Stake Blockchain Protocol. Aggelos Kiayias, Alexander Russell, Bernardo David, and Roman Oliynykov. Available online↩︎
The proportion of slots we expect to be non-empty on average is known as the active slot coefficient, and is usually denoted by the variable f.↩︎
There are constraints on the value of s. When s is too small, doing a density comparison is not meaningful, because the probability that adversarial nodes have more than 50% stake within the window by pure luck is too large. When s is too large, the leadership schedule within the window is no longer determined by the ledger state at the intersection point, and so the adversary can start to influence the leadership schedule. In Cardano, s is set to 3k/f slots.↩︎
Ouroboros Genesis: Composable Proof-of-Stake Blockchains with Dynamic Availability. Christian Badertscher, Peter Gaži, Aggelos Kiayias, Alexander Russell and Vassilis Zikas. Available online↩︎
Ouroboros Praos: An adaptively-secure, semi-synchronous proof-of-stake protocol. Bernardo David, Peter Gaži, Aggelos Kiayias and Alexander Russell. Available online↩︎
Although this means that the protocol relies on a shared notion of “time” between all participants (a wallclock), the resolution of this clock does not need to be particularly precise, and standard solutions such as NTP suffice.↩︎
He already had a bunch of agenda files that he wanted to keep using (I had so
few of them that I'd simply converted them to roam files). Here's the solution
he shared with me:
```elisp
(defvar roam-extra-original-org-agenda-files nil
  "Original value of `org-agenda-files'.")

(defun roam-extra:update-todo-files (&rest _)
  "Update the value of `org-agenda-files'."
  (unless roam-extra-original-org-agenda-files
    (setq roam-extra-original-org-agenda-files org-agenda-files))
  (setq org-agenda-files
        (append roam-extra-original-org-agenda-files
                (roam-extra:todo-files))))
```
It's a rather nice modification I think. Thanks to Mr Z for agreeing to let me
share it here.
After higher-kinded data (HKD) and barbies were mentioned in episode 35 of Haskell Weekly I've been wondering if they could be used in combination with aeson to do validation when implementing web services.
TL;DR: I think it'd work, but I have a feeling I'd have to spend some more time on it to get an API with nice ergonomics.
Defining a type to play with
I opted to use barbies-th to save on the typing a bit. Defining a simple type holding a name and an age can then look like this:

```haskell
declareBareB
  [d|
    data Person = Person {name :: Text, age :: Int}
    |]

deriving instance Show (Person Covered Identity)
deriving instance Show (Person Covered Maybe)
deriving instance Show (Person Covered (Either Text))
```
The two functions from the Barbies module documentation, addDefaults and check, can then be written like this:

```haskell
addDefaults :: Person Covered Maybe -> Person Covered Identity -> Person Covered Identity
addDefaults = bzipWith trans
  where
    trans m d = maybe d pure m

check :: Person Covered (Either Text) -> Either [Text] (Person Covered Identity)
check pe = case btraverse (either (const Nothing) (Just . Identity)) pe of
  Just pin -> Right pin
  Nothing -> Left $ bfoldMap (either (: []) (const [])) pe
```
I found it straightforward to define some instances and play with those functions a bit.
Adding in JSON
The bit that wasn't immediately obvious to me was how to use aeson to parse into
a type like Person Covered (Either Text).
First off I needed some data to test things out with.
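Something like the following will do; note that the exact bytes, in particular the bad age field in bs1, are an assumption on my part, chosen to match the decoding results shown below:

```haskell
import qualified Data.ByteString.Lazy.Char8 as BSL

-- bs0 is a fully valid Person object; bs1 gives age the wrong type,
-- so parsing the age as an Int will fail.
bs0, bs1 :: BSL.ByteString
bs0 = BSL.pack "{\"name\": \"the name\", \"age\": 17}"
bs1 = BSL.pack "{\"name\": \"the name\", \"age\": true}"
```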
To keep things simple I took baby steps: first I tried parsing into Person Covered Identity. It turns out that the FromJSON instance for that doesn't need much thought at all. (It's a bit of a pain to have to specify types in GHCi all the time, so I'm throwing in a specialised decoding function for each type too.)
```haskell
instance FromJSON (Person Covered Identity) where
  parseJSON = withObject "Person" $ \o ->
    Person <$> o .: "name" <*> o .: "age"

decodePI :: BSL.ByteString -> Maybe (Person Covered Identity)
decodePI = decode
```
Trying it out on the test data gives the expected results
λ> let i0 = decodePI bs0
λ> i0
Just (Person {name = Identity "the name", age = Identity 17})
λ> let i1 = decodePI bs1
λ> i1
Nothing
So far so good! Moving on to Person Covered Maybe. I spent some time trying to use the combinators in Data.Aeson for dealing with parser failures, but in the end I had to resort to using <|> from Alternative.
```haskell
instance FromJSON (Person Covered Maybe) where
  parseJSON = withObject "Person" $ \o ->
    Person
      <$> (o .: "name" <|> pure Nothing)
      <*> (o .: "age" <|> pure Nothing)

decodePM :: BSL.ByteString -> Maybe (Person Covered Maybe)
decodePM = decode
```
Trying that out I saw exactly the behaviour I expected, i.e. that parsing won't fail. (Well, at least not as long as it's a valid JSON object to begin with.)
λ> let m0 = decodePM bs0
λ> m0
Just (Person {name = Just "the name", age = Just 17})
λ> let m1 = decodePM bs1
λ> m1
Just (Person {name = Just "the name", age = Nothing})
With that done I found that the instance for Person Covered (Either Text) followed quite naturally. I had to spend a little time on getting the types right to parse the fields properly. Somewhat disappointingly I didn't get type errors when the behaviour of the code turned out to be wrong; I'm guessing aeson's Parser was a little too willing to give me parser failures. Anyway, I ended up with this instance:
```haskell
instance FromJSON (Person Covered (Either Text)) where
  parseJSON = withObject "Person" $ \o ->
    Person
      <$> ((Right <$> o .: "name") <|> pure (Left "A name is most needed"))
      <*> ((Right <$> o .: "age") <|> pure (Left "An integer age is needed"))

decodePE :: BSL.ByteString -> Maybe (Person Covered (Either Text))
decodePE = decode
```
That does exhibit the behaviour I want
λ> let e0 = decodePE bs0
λ> e0
Just (Person {name = Right "the name", age = Right 17})
λ> let e1 = decodePE bs1
λ> e1
Just (Person {name = Right "the name", age = Left "An integer age is needed"})
In closing
I think everyone will agree that the FromJSON instances are increasingly
messy. I think that can be fixed by putting some thought into what a more
pleasing API should look like.
I'd also like to mix in validation beyond what aeson offers out-of-the-box, which really is only "is the field present?" and "does the value have the correct type?". For instance, once we know there is a field called age, and that it's an Int, we might want to make sure it's non-negative, or that the person is at least 18. I'm guessing that wouldn't be too difficult.
Finally, I'd love to see examples of using HKDs for parsing/validation in the
wild. It's probably easiest to reach me at @magthe@mastodon.technology.
first, introduce a special keyword LAMBDAVARIABLE so that an expression containing LAMBDAVARIABLE gets rewritten by the compiler to
\x -> expression
where x is a fresh symbol, and every instance of LAMBDAVARIABLE in the expression is replaced by x. (we have used Haskell syntax for lambda functions.)
for example, (LAMBDAVARIABLE + 10) rewrites to (\x -> x + 10).
the benefits are not having to type "x" that one extra time at the beginning ("Don't Repeat Yourself"), and not having to think of a fresh name x. the latter may be useful for generated code.
problems:
the construct cannot be nested.
you do not know that an expression is a lambda function until you've parsed into the expression deeply, encountering the special keyword LAMBDAVARIABLE.
does (10 * (5 + LAMBDAVARIABLE)) mean (\x -> 10 * (5+x)) or (10 * (\x -> 5+x))?
solution: augment with another construct LAMBDA(...) which delimits a lambda expression and allows nesting. for example, LAMBDA(LAMBDAVARIABLE + LAMBDA(LAMBDAVARIABLE^2)(LAMBDAVARIABLE + 1))
rewrites to (\x -> x + (\x -> x^2)(x+1)). note that the inner x (and inner LAMBDAVARIABLE) masks the outer one via static scoping. if you need access to an outer variable in nested lambdas, don't use this construct.
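in Haskell, the desugared form of the nested example can be run directly, which also demonstrates the shadowing:

```haskell
-- Desugared form of the nested example above: the inner \x shadows
-- the outer one, so the inner body squares (x + 1), not x itself.
nested :: Int -> Int
nested = \x -> x + (\x -> x ^ 2) (x + 1)

-- nested 2 = 2 + (2 + 1)^2 = 11
```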
Mathematica calls lambda functions pure anonymous functions. LAMBDAVARIABLE is # and LAMBDA is postfix & .
previously, lambda with more than one variable. consider keywords BINARYLAMBDA, LAMBDAVARIABLE1, LAMBDAVARIABLE2. beyond two arguments, it's probably best to force the programmer to explicitly name them.
there's a parallel here with lambda expressions themselves being shortcuts:
(\x -> expression) rewrites to (let { f x = expression } in f)
with the benefits of not having to think of a fresh identifier f and not having to Repeat Yourself in f.
In the past I've always used scotty when writing web services. This was mostly due to laziness: I found working out how to use scotty a lot easier than servant. Fairly quickly I bumped into some limitations in scotty, but at first the workarounds didn't add too much complexity and were acceptable. A few weeks ago they started weighing on me though, so I decided to look into servant, and since I really liked what I found I've started moving all projects to use it.
In several of the projects I've used tagless final style and defined a type based on ReaderT holding configuration over IO, that is, something like:

```haskell
newtype AppM a = AppM {unAppM :: ReaderT Config IO a}
  deriving
    ( Functor,
      Applicative,
      Monad,
      MonadIO,
      MonadReader Config
    )

runAppM :: AppM a -> Config -> IO a
runAppM app = runReaderT (unAppM app)
```
I found that servant is very well suited to this style through hoistServer, and there are several examples of how to use it with a ReaderT-based type like the one above. The first one I found is in the servant cookbook. However, as I realised a bit later, using a simple type like this doesn't make it easy to trigger responses with a status other than 200 OK. When I looked at the definition of Handler, the type for writing handlers that ships with servant, I decided to try to use the following type in my service:
```haskell
newtype AppM a = AppM {unAppM :: ReaderT Config (ExceptT ServerError IO) a}
  deriving
    ( Functor,
      Applicative,
      Monad,
      MonadIO,
      MonadReader Config
    )

runAppM :: AppM a -> Config -> IO (Either ServerError a)
runAppM app = runExceptT . runReaderT (unAppM app)
```
The natural transformation required by hoistServer can then be written like this:

```haskell
nt :: AppM a -> Handler a
nt x =
  liftIO (runAppM x cfg) >>= \case
    Right v -> pure v
    Left err -> throwError err
```
I particularly like how clearly this suggests a way to add custom errors if I want that:

1. Swap out ServerError for my custom error type in AppM.
2. Write a function to transform my custom error type into a ServerError, transformCustomError :: CustomError -> ServerError.
3. Use throwError $ transformCustomError err in the Left branch of nt.
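A minimal sketch of that transformation function, with stand-in types (this ServerError is a simplified stand-in, not servant's real record, and CustomError is made up for illustration):

```haskell
-- Stand-in for servant's ServerError; the real one lives in Servant.Server
-- and carries more fields (reason phrase, headers, ...).
data ServerError = ServerError {errHTTPCode :: Int, errBody :: String}
  deriving (Show, Eq)

-- A hypothetical custom error type for the service.
data CustomError = NotFound String | InvalidInput String

-- Map domain errors to HTTP-level errors; this is what the Left branch
-- of the natural transformation would feed to throwError.
transformCustomError :: CustomError -> ServerError
transformCustomError (NotFound what) =
  ServerError 404 ("not found: " ++ what)
transformCustomError (InvalidInput why) =
  ServerError 400 ("invalid input: " ++ why)
```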
A slight complication with MonadUnliftIO
I was using unliftio in my service, and as long as I based my monad stack only on ReaderT that worked fine. I even got the MonadUnliftIO instance for free through automatic deriving. ExceptT isn't a stateless monad though, so using unliftio is out of the question; instead I had to switch to MonadBaseControl and the packages that work with it. Defining an instance of MonadBaseControl looked a bit daunting, but luckily Handler has an instance of it that I used as inspiration.
First off MonadBaseControl requires the type to also be an instance of
MonadBase. There's an explicit implementation for Handler, but I found that
it can be derived automatically, so I took the lazy route.
The instance of MonadBaseControl for AppM ended up looking like this
```haskell
instance MonadBaseControl IO AppM where
  type StM AppM a = Either ServerError a
  liftBaseWith f = AppM (liftBaseWith (\g -> f (g . unAppM)))
  restoreM = AppM . restoreM
```
I can't claim to really understand what's going on in that definition, but I
have Alexis King's article on Demystifying MonadBaseControl on my list of things
to read.
This post is the third one of a series on Nickel, a new
configuration language that I’ve been working on. In this post, I explore the
design of Nickel’s type system, which mixes static and dynamic typing, and the
reasons behind this choice.
When other constraints allow it (the right tool for the job and all that), I personally go for a statically typed language.
But the Nickel language is a tad different, for it is a configuration language.
You usually run a terminating program once on fixed inputs to generate a static
text file. In this context, any type error will most likely either be triggered
at evaluation anyway, typechecker or not, or be irrelevant (dead code). Even
more so if you add data validation, which typing can seldom totally replace:
statically enforcing that a string is a valid URL, for example, would require a
powerful type system. If you have to validate anyway, checking at runtime that a
value is a number, on the other hand, is trivial.
Nickel also aims at being as interoperable with JSON as possible, and dealing with
JSON values in a typed manner may be painful. For all these reasons, being
untyped^{1} in configuration code is appealing.
But this is not true of all code. Library code is written to be reused many
times in many different settings. Although specialised in configuration, Nickel
is a proper programming language, and one of its value propositions is precisely
to provide abstractions to avoid repeating yourself. For reusable code, static
typing sounds like the natural choice, bringing in all the usual benefits.
How to get out of this dilemma?
Gradual typing
Gradual typing reconciles the two belligerents by allowing
both typed code and untyped code to coexist. Not only to coexist, but most
importantly, to interact.
One common use case of gradual typing is to retrofit static typing on top of an
existing dynamically typed language, allowing one to gradually (hence the name)
type an existing codebase. In the case of Nickel, gradual typing is used on its
own, because optional typing makes sense. In both situations, gradual typing
provides a formal and principled framework to have both typed and untyped code
living in relative harmony.
Promises, promises!
Since configuration code is to be untyped, and makes up the majority of Nickel
code, untyped is the default. A basic configuration looks like JSON, up to
minor syntactic differences:
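A minimal made-up configuration (not the example from the original post) to illustrate the point:

```nickel
{
  name = "example",
  port = 80,
  enableTls = false,
  hosts = ["a.example.com", "b.example.com"],
}
```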
Typechecking is triggered by a type annotation, introduced by :. Annotations
can be attached either to a variable name or to an expression:
let makePort : Str -> Num = fun protocol =>
if protocol == "http" then
80
else if protocol == "ftp" then
21
else
null in
let unusedBad = 10 ++ "a" in
{
port = makePort protocol,
protocol = ("ht" ++ "tp" : Str),
}
In this example, makePort is a function taking a string and returning a
number. It is annotated, causing the typechecker to kick in. It makes sure that
each subexpression is well-typed. Notice that subterms don't need any other
annotation: Nickel is able to guess most of the types using unification-based
type inference.
Such a static type annotation is also called a promise, as you make a firm
promise to the typechecker about the type of an expression.
Static typechecking ends with makePort, and although unusedBad is clearly
ill-typed (concatenating a number and a string), it won't cause any typechecking
error.
Can you guess the result of trying to run this program?
error: Incompatible types
┌─ repl-input-1:7:5
│
7 │ null in
│ ^^^^ this expression
│
= The type of the expression was expected to be `Num`
= The type of the expression was inferred to be `Dyn`
= These types are not compatible
The typechecker rightly complains that null is not a number. If we fix this
(for now, substituting it with 1), the program runs correctly:
unusedBad doesn’t cause any error at runtime. Due to laziness, it is never
evaluated. If we were to add a type annotation for it though, the typechecker
would reject our program.
To recap, the typechecker is a lurker by default, letting us do pretty much what
we want. It is triggered by a type annotation exp : Type, in which case it
switches on and statically typechecks the expression.
Who’s to be blamed
So far, so good. Now, consider the following example:
let add : Num -> Num -> Num = fun x y => x + y in
add "a" 0
As far as typechecking goes, only the body of add is to be checked, and it is
well-typed. However, add is called with a parameter of the wrong type by an
untyped chunk. Without an additional safety mechanism, one would get this
runtime type error:
error: Type error
┌─ repl-input-0:1:26
│
1 │ let add : Num -> Num -> Num = fun x y => x + y
│ ^ This expression has type Str, but Num was expected
│
[..]
The error first points to a location inside the body of add. It doesn't feel
right, and kind of defeats the purpose of typing: while our function should be
guaranteed to be well-behaved, any untyped code calling it can sneak in
ill-typed terms via the parameters. In turn, this raises errors that are located
in well-behaved code. In this case, which is deliberately trivial, the end of
the error message elided as [..] turns out to give us enough information to
diagnose the actual issue. This is not necessarily the case for more complex
real-life functions.
There's not much we can do statically. The whole point of gradual typing is to
accommodate untyped code, so we can't require the call site to be statically
typed. Otherwise, types would contaminate everything and we might as well make
our language fully statically typed.
We can do something at runtime, though. Assuming type soundness,
no type error should arise in the body of a well-typed function at evaluation.
The only sources of type errors are the parameters provided by the caller.
Thus, we just have to control the boundary between typed and untyped blocks by
checking the validity of the parameters provided by the caller. If we actually
input the previous example in the Nickel REPL, we don’t get the above error
message, but this one instead:
nickel> let add : Num -> Num -> Num = fun x y => x + y in
add 5 "a"
error: Blame error: contract broken by the caller.
┌─ :1:8
│
1 │ Num -> Num -> Num
│  expected type of the argument provided by the caller
│
┌─ repl-input-6:1:31
│
1 │ let add : Num -> Num -> Num = fun x y => x + y
│ ^^^^^^^^^^^^^^^^ applied to this expression
│
[..]
note:
┌─ repl-input-7:1:1
│
1 │ add 5 "a"
│  (1) calling <func>
[..]
This error happens before the body of add is even entered. The Nickel
interpreter wraps add in a function that first checks that the parameters are of
the required type before actually calling add. Sounds familiar? This is
exactly what we described in the post on contracts. That is,
typed functions are automatically guarded by a higher-order contract. This
ensures that type errors are caught before entering well-typed land, which
greatly improves error locality.
In summary, to avoid sneaking ill-typed values into well-typed blocks, the Nickel
interpreter automatically protects typed functions by inserting appropriate
contracts.
A contract with the devil
We have dealt with the issue of calling typed code from untyped code. A natural
follow-up is to examine the dual case: how can one use definitions living in
untyped code inside a statically typed context? Consider the following example:
// this example does NOT typecheck
let f = fun x => if x then 10 else "a" in
let doStuffToNum : Num -> Num = fun arg =>
arg + (f true) in
doStuffToNum 1
The typed function doStuffToNum calls the untyped function f. f true turns out to be a
number indeed, but f itself is not well-typed, because the types of the if
branch and the else branch don't match. No amount of additional type annotations can
make this program accepted.
See what happens in practice:
error: Incompatible types
┌─ repl-input-1:3:10
│
3 │ arg + (f true) in
│ ^ this expression
│
= The type of the expression was expected to be `_a -> Num`
= The type of the expression was inferred to be `Dyn`
= These types are not compatible
Since f is not annotated, the typechecker can't do much better than to give f
the dynamic type Dyn (although in some trivial cases, it can infer a better
type). Since it was expecting a function returning Num, it complains. It seems
we are doomed to restrict our usage of untyped variables to trivial expressions,
or to type them all.
Or are we? One more time, contracts come to the rescue. Going back to the post on contracts
again, contracts are enforced similarly to types, but using | instead of :. Let us fix our example:
let doStuffToNum : Num -> Num = fun arg =>
  arg + (f true | Num) in
We just applied a Num contract to f true, and surprise, this code
typechecks! Our typechecker didn't get magically smarter. By adding this
contract check, we ensure the fundamental property that f true will either
evaluate to a number or fail at runtime with an appropriate contract error. In
particular, no value of type Str, for example, can ever enter our well-typed
paradise doStuffToNum. When writing exp | Type in a typed block, two things happen:
The typechecker switches back to the default mode inside exp, where it
doesn’t enforce anything until the next promise (annotation).
The typechecker blindly assumes that the type of exp | Type is Type.
Hence, contract checks are also called assume.
Put differently, a contract check is considered a type cast by the static
type system, whose correctness check is delayed to runtime.
Behold: this implies that something like (5 | Bool) : Bool typechecks. How
outrageous, for a typed functional language proponent. But even languages like
Haskell have some side conditions to type soundness: b :: Bool doesn't
guarantee that b evaluates to a boolean, for it can loop, or raise an
exception. Minimizing the amount of such possibilities is surely for the
better, but the important point remains that b never silently evaluates to a
string.
To conclude, we can use contracts as delayed type casts, to make the typechecker
accept untyped terms in a typed context. This is useful to import values from
untyped code, or to write expressions that we know are correct but that the
typechecker wouldn’t accept.
Conclusion
There is more to say about Nickel's type system, which features parametric
polymorphism and structural typing with row polymorphism for records, to cite
some interesting aspects. The purpose of this post is rather to explain the
essence of gradual typing, why it makes sense in the context of Nickel, and how
it is implemented. We’ve seen that contracts are a fundamental ingredient for
the interaction between typed and untyped code in both ways.
^{1} I will use typed to mean statically typed, and untyped to mean dynamically
typed.
This post is an investigation of persistent issue #1199, where an asynchronous exception caused a database connection to be improperly returned to the pool.
The linked issue contains some debugging notes, along with the PR that fixes the problem.
While I was able to identify the problem and provide a fix, I don't really understand what happened: it's a complex bit of work.
So I’m going to write this up as an exploration into the exact code paths that are happening.
Data.Pool
resource-pool is how persistent manages concurrent pooling and sharing of database connections.
When you create a Pool, you specify how to create resources, destroy them, and then some information around resource management: how long to keep an unused resource open, how many subpools to maintain, and how many resources per subpool (aka stripe).
persistent calls createPool here.
The database client libraries provide a LogFunc -> IO SqlBackend that is used to create new database connections, and the close' delegates to the connClose field on the SqlBackend record.
While resource-pool isn't seeing much maintenance activity, it's relatively well tested and reliable.
Once you’ve got a Pool a from createPool, the recommended way to use it is withResource:
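The withResource referenced here looks roughly like the following. This is lightly paraphrased from resource-pool (imports and some details omitted), not a verbatim copy:

```haskell
withResource :: MonadBaseControl IO m => Pool a -> (a -> m b) -> m b
withResource pool act = control $ \runInIO -> mask $ \restore -> do
  -- take a resource from the pool (may block if the pool is exhausted)
  (resource, local) <- takeResource pool
  -- run the action unmasked; destroy the resource on exception
  ret <- restore (runInIO (act resource))
    `onException` destroyResource pool local resource
  -- on normal exit, return the resource to its local pool (stripe)
  putResource local resource
  return ret
```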
Now, in persistent-2.10.5, a new API based on the resourcet package's Data.Acquire was introduced, and this API became the underlying implementation for the runSqlPool family of functions.
The underlying functionality is in the new function unsafeAcquireSqlConnFromPool, which was later factored out into resourcet-pool.
This change was introduced because resource-pool operates in MonadBaseControl, which is incompatible with many other monad transformers, specifically ConduitT.
Acquire is based on MonadUnliftIO, which is compatible.
In hindsight, we could have just changed the code to use MonadUnliftIO; it's relatively straightforward to do.
A term with a single constraint like MonadBaseControl IO m => m a can be specialized to IO a, and we can then run that using withRunInIO from unliftio.
I didn’t realize this at the time, but Data.Acquire is inherently a weaker tool than Data.Pool.
Data.Acquire provides a means of creating a new resource, and also freeing it automatically when a scope is exited.
Data.Pool keeps track of resources, resource counts, and occasionally destroys them if they're unused.
with is aliased to withAcquire, which I’ll use from here on out to disambiguate.
You may notice that withAcquire and withResource are implemented nearly identically.
withResource uses MonadBaseControl and withAcquire uses MonadUnliftIO, and that’s the whole of the difference.
They have the same async exception handling with mask and use the same onException functions.
All the exception handling stuff is from Control.Exception, so we’re not using UnliftIO.Exception or Control.Monad.Catch or Control.Exception.Safe here.
These are really similar.
When we look at how the unsafeSqlConnFromPool works, it should provide identical behavior.
For free, we case on ReleaseType and do destroyResource on exception and putResource on any other exit.
We're not handling ReleaseEarly specially; this constructor is only used when we use ResourceT's release function on a value.
Using withAcquire, we’ll only ever pass ReleaseNormal and ReleaseException.
So this is locally safe.
Weirdly enough, resourcet doesn't really depend on the Acquire type at all, at least not directly: the ReleaseMap type contains a function ReleaseType -> IO () for freeing resources, but doesn't mention anything else about it.
Anyway, let’s get back on track.
Since withAcquire and withResource are nearly identical, it may be our translating code that is the problem.
We can use algebraic substitution to check this out.
Let’s look at mkAcquireType:
mkAcquireType
  :: IO a -- ^ acquire the resource
  -> (a -> ReleaseType -> IO ()) -- ^ free the resource
  -> Acquire a
mkAcquireType create free = Acquire $ \_ -> do
  x <- create
  return $! Allocated x (free x)
The ignored parameter in that lambda is a function that looks like restore, and we're throwing it away.
So, this action gets run when we unpack the Acquire in withAcquire.
Let’s plug in our create and free parameters:
mkAcquireType
  :: IO a -- ^ acquire the resource
  -> (a -> ReleaseType -> IO ()) -- ^ free the resource
  -> Acquire a
mkAcquireType (create = P.takeResource pool) (free = freeConn) =
  Acquire $ \_ -> do
    x <- P.takeResource pool
    return $! Allocated x (freeConn x)
  where
    freeConn (res, localPool) relType = case relType of
      ReleaseException -> P.destroyResource pool localPool res
      _ -> P.putResource localPool res
The pool variable is captured in the closure.
Now we can look at withAcquire, and plug in our behavior:
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    -- `f` ignores the `restore` argument: possible bug?
    Allocated x free <- f restore
    -- so `x` here comes from `P.takeResource pool`
    -- free = freeConn
    ret <- restore (run (g x)) `E.onException` free ReleaseException
    free ReleaseNormal
    return ret
Let’s plug in the specific case for free:
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    -- `f` ignores the `restore` argument: possible bug?
    Allocated x free <- f restore
    -- so `x` here comes from `P.takeResource pool`
    -- free = freeConn
    ret <- restore (run (g x))
      `E.onException` P.destroyResource pool localPool res
    P.putResource localPool res
    return ret
Closer, closer… Let’s unpack the Allocated stuff:
withAcquire (Acquire _) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    -- `f` ignores the `restore` argument: possible bug?
    Allocated x free <- f restore
    -- so `x` here comes from `P.takeResource pool`
    -- free = freeConn
    ret <- restore (run (g x))
      `E.onException` P.destroyResource pool localPool res
    P.putResource localPool res
    return ret
  where
    f _ = do
      x <- P.takeResource pool
      return $! Allocated x (freeConn x)

-- OK, let's splice in the definition of `f`:
withAcquire (Acquire _) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    Allocated x free <- do
      x <- P.takeResource pool
      return $! Allocated x (freeConn x)
    -- so `x` here comes from `P.takeResource pool`
    -- free = freeConn
    ret <- restore (run (g x))
      `E.onException` P.destroyResource pool localPool res
    P.putResource localPool res
    return ret

-- Now let's remove the `Allocated` constructor:
withAcquire (Acquire _) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    x@(res, localPool) <- P.takeResource pool
    ret <- restore (run (g x))
      `E.onException` P.destroyResource pool localPool res
    P.putResource localPool res
    return ret
With this, we’re now nearly identical with withResource (copied again):
The only difference here is that Acquire also passes the LocalPool to the given action.
In the persistent code, we use fmap fst so that it only passes the resource to the callback.
So, I’m not sure this function is at fault.
Let’s see how we call this function.
That >>= is weird. What’s going on here?
We have return :: a -> m a, and then f >>= g.
f :: Acquire backend, so then g must have the type g :: backend -> Acquire backend, meaning that we're using the >>= of Acquire at type Acquire a -> (a -> Acquire b) -> Acquire b.
So, in the investigation, the exception (libpq: failed (another command is already in progress)) would happen (as best as I can tell) when we try to call connRollback.
The problem is somewhere around here.
Um excuse me what? This is also operating in m (Acquire backend), not Acquire backend.
How is it possibly being used on the RHS of a >>=?
… Oh, right.
Just like MonadBaseControl IO m => m a can be concretized to IO a, we can concretize MonadReader r m => m a to r -> a.
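That MonadReader r m => m a really can be concretized to r -> a can be checked in a small, self-contained sketch (greeting/asFunction are made-up names), since mtl gives the function type (->) r a MonadReader r instance:

```haskell
import Control.Monad.Reader (MonadReader, ask)

-- Polymorphic in the reader monad:
greeting :: MonadReader String m => m String
greeting = do
  name <- ask
  pure ("hello, " <> name)

-- Specialized at m = ((->) String), the very same definition
-- is just a plain function:
asFunction :: String -> String
asFunction = greeting
```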
So what's happening here is we're picking the specialized type:
Now, this next transformation feels a bit tricky.
I’m going to float beginTransaction up and put the E.onException only on it.
Note that we're not actually running the free2 action here, just preparing it.
Then I’ll assign it with a let.
Acquire $ \restore -> do
  x <- P.takeResource pool
  let free1 = freeConn x
  let (sqlBackend, localPool) = x
  _ <- beginTransaction sqlBackend getter isolation
    `E.onException` free1 ReleaseException
  let free2 = \case
        ReleaseException -> connRollback sqlBackend
        _ -> connCommit sqlBackend
  return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt)

-- Inline free1 and free2
Acquire $ \restore -> do
  x <- P.takeResource pool
  let (sqlBackend, localPool) = x
  _ <- beginTransaction sqlBackend getter isolation
    `E.onException` freeConn x ReleaseException
  return $! Allocated y $ \rt ->
    (\case
      ReleaseException -> connRollback sqlBackend
      _ -> connCommit sqlBackend) rt
    `E.finally` freeConn x rt

-- Inline freeConn
Acquire $ \restore -> do
  x <- P.takeResource pool
  let (sqlBackend, localPool) = x
  _ <- beginTransaction sqlBackend getter isolation
    `E.onException` P.destroyResource pool localPool sqlBackend
  return $! Allocated y $ \rt ->
    (case rt of
      ReleaseException -> connRollback sqlBackend
      _ -> connCommit sqlBackend)
    `E.finally` do
      case rt of
        ReleaseException -> P.destroyResource pool localPool sqlBackend
        _ -> P.putResource localPool sqlBackend
I think it's important to note that, again, we don't ever actually call restore.
So the masking state is inherited and not ever changed.
It feels important but I’m not sure if it actually is.
Let’s plug this into withAcquire now.
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    Allocated x free <- f restore
    res <- restore (run (g x))
      `E.onException` free ReleaseException
    free ReleaseNormal
    return res

-- Inline `f`. Since `restore` is never called, we can omit passing it as
-- a parameter.
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    Allocated x free <- do
      x <- P.takeResource pool
      let (sqlBackend, localPool) = x
      _ <- beginTransaction sqlBackend getter isolation
        `E.onException` free1 ReleaseException
      return $! Allocated x $ \rt ->
        (case rt of
          ReleaseException -> connRollback sqlBackend
          _ -> connCommit sqlBackend)
        `E.finally` do
          case rt of
            ReleaseException -> P.destroyResource pool localPool sqlBackend
            _ -> P.putResource localPool sqlBackend
    res <- restore (run (g x))
      `E.onException` free ReleaseException
    free ReleaseNormal
    return res

-- float `x <- P.takeResource pool` to the top, and define `free` using `let`
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    x <- P.takeResource pool
    let free1 = freeConn x
    let (sqlBackend, localPool) = x
    _ <- beginTransaction sqlBackend getter isolation
      `E.onException` free1 ReleaseException
    let free rt =
          (case rt of
            ReleaseException -> connRollback sqlBackend
            _ -> connCommit sqlBackend)
          `E.finally` do
            case rt of
              ReleaseException -> P.destroyResource pool localPool sqlBackend
              _ -> P.putResource localPool sqlBackend
    res <- restore (run (g x))
      `E.onException` free ReleaseException
    free ReleaseNormal
    return res

-- inline `free` for each case:
withAcquire (Acquire f) g = withRunInIO $ \run ->
  E.mask $ \restore -> do
    x <- P.takeResource pool
    let (sqlBackend, localPool) = x
    _ <- beginTransaction sqlBackend getter isolation
      `E.onException` P.destroyResource pool localPool sqlBackend
    res <- restore (run (g x))
      `E.onException` do
        connRollback sqlBackend
          `E.finally` P.destroyResource pool localPool sqlBackend
    -- ReleaseNormal
    connCommit sqlBackend
      `E.finally` P.putResource localPool sqlBackend
    return res
Let’s consider our masking state.
We're masked for everything except the restore (run (g x)) call.
Including beginning the transaction and committing the transaction.
But we can still receive asynchronous exceptions during interruptible operations.
Interruptible operations include “anything that can block or perform IO,” which seems very likely to include the Postgres code here.
The Original
Let’s compare this with the original code.
The original code delegated to runSqlConn after acquiring a SqlBackend from the Pool in MonadUnliftIO.
Kind of a lot of withStuff going on, including two withRunInIOs lol.
Let’s make it even worse by inlining withResource:
-- abstract action to a variable
runSqlPool r pconn = withRunInIO $ \run ->
  withResource pconn $ \conn ->
    let act = run $ withRunInIO $ \runInIO ->
          mask $ \restore -> do
            let conn' = projectBackend conn
                getter = getStmtConn conn'
            restore $ connBegin conn' getter Nothing
            x <- onException
              (restore $ runInIO $ runReaderT r conn)
              (restore $ connRollback conn' getter)
            restore $ connCommit conn' getter
            return x
    in act

-- inline withResource
runSqlPool r pconn = withRunInIO $ \run ->
  -- withResource pconn $ \conn ->
  control $ \runInIO0 ->
    mask $ \restore0 -> do
      let act conn = run $ withRunInIO $ \runInIO1 ->
            mask $ \restore1 -> do
              let conn' = projectBackend conn
                  getter = getStmtConn conn'
              restore1 $ connBegin conn' getter Nothing
              x <- onException
                (restore1 $ runInIO1 $ runReaderT r conn)
                (restore1 $ connRollback conn' getter)
              restore1 $ connCommit conn' getter
              return x
      (resource, local) <- takeResource pool
      ret <- restore0 (runInIO0 (act resource))
        `onException` destroyResource pool local resource
      putResource local resource
      return ret

-- inline `act`
runSqlPool r pconn = withRunInIO $ \run ->
  control $ \runInIO0 ->
    mask $ \restore0 -> do
      (resource, local) <- takeResource pool
      ret <- restore0 (runInIO0 (run $ withRunInIO $ \runInIO1 ->
        mask $ \restore1 -> do
          let conn = resource
          let conn' = projectBackend conn
              getter = getStmtConn conn'
          restore1 $ connBegin conn' getter Nothing
          x <- onException
            (restore1 $ runInIO1 $ runReaderT r conn)
            (restore1 $ connRollback conn' getter)
          restore1 $ connCommit conn' getter
          return x))
        `onException` destroyResource pool local resource
      putResource local resource
      return ret
The restore parameter in mask doesn't unmask completely; it restores the masking state that was in effect before the mask was entered.
So mask $ \restore -> mask $ \restore -> restore (print 10) doesn't run print 10 in an unmasked state, but in the same masking state as before.
However, here, we have this pattern:
Let’s compare with withAcquire which was all inlined above:
Masked:
takeResource
beginTransaction
destroyResource
connRollback
destroyResource again
connCommit
putResource
Unmasked
run (g x) – the action passed to withAcquire and runSqlConn.
So withAcquire actually has quite a bit more masking going on!
Interesting.
Remember: the problem occurs when the thread-killed exception happens and the connRollback function is called, causing libpq to die with the "command in progress" error.
So, we throw a killThread at our withAcquire function.
It’ll land as soon as we’re unmasked, or an interruptible action occurs.
Since almost all of it is masked, we need to determine what the interruptible operations are.
takeResource might be interruptible: it has an STM transaction, which does call retry.
I don't know if any code containing retry triggers an interrupt, or if only actually calling retry can trigger an interruptible state.
Based on a quick and bewildering look at the GHC source, I think it’s just that retry itself can be interrupted.
retry occurs when there are no available entries in the local pool and we’re at max resources for the pool.
This is exactly the scenario this test is exercising: a single stripe with a single resource that’s constantly in use.
beginTransaction kicks off an IO action to postgres, so it is almost definitely interruptible.
Same for connRollback and connCommit.
So the masked state for these items in withAcquire is probably not a big deal, but we could check by using uninterruptibleMask on them.
To be continued?
I wish I had a more satisfying conclusion here, but I’m all out of time to write on this for now.
Please comment on the relevant GitHub issues if you’re interested or have some insight!
My fascination for types rivals my inability to define the concept.
Even though I don’t know what a type is, I can still recognize when a paper “is about types”: the paper usually contains many occurrences of formulas of the form “t : T”, where “t” is some piece of code, some program, and “T” is one of these mysterious types. The colon in the middle is of no technical significance, but it signals some cultural awareness on the part of authors.
Hypothesis: Types are a meme.
My experience is also that things are very very bad when “t : T” does not hold. Types are a way to tell right from wrong, for some domainspecific definitions of correctness.
Hypothesis: Types are specifications.
One idea that really narrows it down for me is that programming languages have types. You can assign types to everything in a language: static types to source code, dynamic types to runtime values.
Another way to look at this is to compare with other forms of specifications. How do you prove a specification? A priori, you could use any method, and all you care about is that it is somehow “sound”, but otherwise the proof method is a black box.
Automated theorem proving, with an SMT solver as a black box.
Interactive theorem proving, with PhD students as a black box.
Testing, with the compiler/interpreter as a black box.^{1}
Approaches using “types” seem different. To prove a specification, a typing judgement “t : T”, the way forward is to prove more typing judgements by following the rules of some “type system”. Types tell me both “how to specify things” and “how to verify things”—by specifying and verifying subthings.
Hypothesis: Types are compositional specifications.
I originally posted this question on the recently created TYPES Zulip server, a spin-off of the TYPES mailing list. Those are nice hangouts for people who like types, whatever you believe they are (and also for getting spammed with Calls for Papers).
Whereas most formal methods are sound for proving the absence of certain bugs, testing is a sound method of finding bugs.
Before Nix builds a package, it will ask the binary cache if it contains the binary for a given package it wants to build. For a typical invocation of Nix there can be hundreds or even thousands of packages that need to be checked. However, if a binary is present in the cache, building the package is no longer required, potentially saving a lot of time and CPU.
It is crucial for the backend of such a binary cache service to respond in a timely manner so that the optimisation of skipping builds actually pays off.
Monitoring GC pauses
The easiest way to monitor and graph how long GC pauses last is via ekg-statsd, by exposing the rts.gc.gc_wall_ms metric. For Cachix, a typical plot of this metric used to look like this:
From the picture, we can see that under load, we experience GC pause times of up to nearly 800 ms. Having 800 ms pauses stopping the world is far from ideal (I’ve even observed some pauses that last over a second under really heavy load), since the endpoint for checking if a certain binary exists normally takes only about 2–4 ms.
Switching to the low-latency GC
If you want to try the low-latency GC in your own code, please make sure to use GHC 8.10.3 or later, since it fixes a few crashing bugs that you don't want to encounter. Then, to enable the low-latency GC, append the following flags when invoking an executable built by GHC:
my-executable +RTS --nonmoving-gc
For Cachix, a typical picture of GC pauses plotted over time then looks as follows:
While this is comparing apples to oranges, since the load is not exactly the same between the two pictures, you can see that the distribution is now significantly different. Most pauses are now in the range of just a few milliseconds.
Unfortunately, the nonmoving GC can occasionally still cause relatively long pauses in the worst case (measured at 150–200 ms). We believe that this is due to the workload spawning many threads. There is still work to be done to further reduce pause times of the low-latency collector under such circumstances.
The throughput impact of the lowlatency GC hasn’t been measured for this case. The response time of a “does this binary exist?” request is still within the 2–4 ms range most of the time.
Conclusion
Monitor GC pauses to understand how they impact your application response times.
Nonmoving GC has been running in production for over a month, reducing worst-case response times for a performance-sensitive endpoint without any issues.
Being able to monitor the total number of threads in the RTS would improve production insights, but that is yet to be implemented.
Inspired by Kowainik's Knowledge Map
that tries to cover many Haskell concepts
a Haskeller will probably run into through their journey, as well as libraries and tools,
I've decided to make a more modest list of things one should be familiar with
to be productive with Haskell (the bottom of The Haskell Pyramid, if you will),
hoping it will help focus learners of the language who might get lost in the vast amount of information.
Installing and running a Haskell toolchain
This can be a bit overwhelming for beginners, as there are many choices:
should they install the toolchain from their package manager? Download ghc binaries?
Install stack? ghcup? Maybe even nix?
In my opinion, installing Haskell using stack or ghcup probably leads to the best results.
A new user should be able to run ghc and ghci when starting, the rest can come later.
Defining values and functions
The basis of everything. A Haskell module is made of definitions.
Expressions, operators and applying functions
What goes in the righthand side of the equals sign in a definition?
Things like "Hello world", 1 + 1, min (max 43212 12345) 43251.
Indentation rules
How does Haskell know where a definition begins and where it ends?
Which arguments are applied to which functions? Haskell uses
indentation to figure that out, so it's best to learn how it works
early to avoid subtle bugs and cryptic errors.
Compiler errors
The compiler will notify us when our program doesn't make sense.
The fact that Haskell provides a lot of feedback about our program saves us
from having to figure out everything ourselves. We'd better get used to Haskell errors
because they are here to help, but it also means we have to learn what they mean
to be effective.
First, try to read them carefully so you get used to them. At some point you won't even need to
read them, just know they are there.
If expressions & Recursion
Recursion is very foundational to Haskell.
It is used for iteration, managing state, and more.
It is best to get good control over it as it will appear many times.
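A tiny illustration of iteration expressed as recursion (sumTo is a made-up example): the "loop state" is just the function's argument.

```haskell
-- Sum the integers from 1 to n with plain recursion.
sumTo :: Int -> Int
sumTo 0 = 0                  -- base case: nothing left to add
sumTo n = n + sumTo (n - 1)  -- recursive case: peel off one step
```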
The evaluation model, substitution, equational reasoning and referential transparency
A lot of big words that describe how to "understand Haskell code".
Understanding how Haskell evaluates our program is probably the most important
tool available when we need to debug our programs (and when writing them, of course).
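For example, equational reasoning lets us evaluate a call on paper by substituting equals for equals (double is a made-up example):

```haskell
double :: Int -> Int
double x = x + x

-- Reasoning by substitution; referential transparency guarantees
-- each step preserves the meaning of the program:
--
--   double (1 + 2)
-- = (1 + 2) + (1 + 2)   -- substitute the definition of double
-- = 3 + 3
-- = 6
```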
Debug tracing, stack traces
... but often programs can be quite big and we need debug traces and stack traces
to help guide us in the right direction of the bug.
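For instance, Debug.Trace lets us peek into pure code while debugging (mystery is a made-up function; trace prints its message to stderr when the value is demanded):

```haskell
import Debug.Trace (trace)

-- Logs its argument to stderr as a side effect of evaluation,
-- then returns x * 2. Handy for debugging, removed afterwards.
mystery :: Int -> Int
mystery x = trace ("mystery called with " <> show x) (x * 2)
```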
Total vs partial functions
How do we make sure all input cases are taken into account? What does it mean when they aren't?
How can Haskell help us make sure our program won't crash during runtime because
we didn't think of a certain input? How can it guide our API design to make it more robust?
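One classic illustration (a sketch, not from the article): the standard head crashes on an empty list, while a total variant makes the missing case visible in the type:

```haskell
-- 'head' is partial: it crashes at runtime on [].
-- A total alternative returns Nothing instead, so the caller
-- is forced to handle the empty case.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x
```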
Parametric Polymorphism
Also known as "generics" in other circles (but not in Haskell, where Generics means something else).
It helps us write functions that are more reusable, composable, and precise.
Higher order functions
Higher-order functions are also foundational: they help us avoid repeating ourselves,
and they let us extract the essence of algorithms such as traversing or sorting a structure
without going into the details of "what to do with the elements when traversing"
or "how to compare two elements when sorting".
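For instance (a sketch using standard library functions, not the article's own code), map and sortBy capture the traversal and the sort, while we only supply the per-element behaviour:

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- 'map' owns the traversal; we only say what to do with each element.
doubled :: [Int]
doubled = map (* 2) [1, 2, 3]

-- 'sortBy' owns the sorting; we only say how to compare two elements.
descending :: [Int]
descending = sortBy (comparing Down) [3, 1, 2]
```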
Lists and tuples
We use them all the time, maybe too much to be honest.
Annotating types
In Haskell, annotating types isn't mandatory in most cases,
but a type annotation can sometimes serve as a tiny summary of the code,
and it also provides a safety net that checks that our summary matches our code,
so we are more likely to implement the thing we wanted to implement.
Using and chaining IO operations
To create a useful program that interacts with the world we often need to use IO.
What does that mean for referential transparency? (spoiler: nothing)
Functional core imperative shell
This is a pattern for designing programs that interact with the world.
In Haskell we like to keep our functions uneffectful because they are more composable, reusable,
and easier to test, but we still need some effects. One way to keep them to a minimum
is to design the program in layers: one thin, imperative, effectful layer does
the interacting, and the core layer, the logic of the program, does no effects.
An example of mixing effects and logic: when making a fizzbuzz program,
we might write a loop that checks which bucket an integer falls into,
prints the result, increments the number,
and repeats the process until we reach the final number to check.
To write the same program with "functional core, imperative shell" in mind,
we could write core logic that generates a list of results,
and then write a function that writes those results to the screen.
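The layered fizzbuzz described above could be sketched like this (one possible rendering, with illustrative names):

```haskell
-- Pure core: all the decision logic, no effects.
fizzbuzz :: Int -> String
fizzbuzz n
  | n `mod` 15 == 0 = "FizzBuzz"
  | n `mod` 3  == 0 = "Fizz"
  | n `mod` 5  == 0 = "Buzz"
  | otherwise       = show n

-- Pure core: generate the whole list of results.
results :: Int -> [String]
results upTo = map fizzbuzz [1 .. upTo]

-- Thin imperative shell: the only effectful part of the program.
runFizzBuzz :: Int -> IO ()
runFizzBuzz upTo = mapM_ putStrLn (results upTo)
```

The core is trivially testable because it is just a function from a bound to a list of strings.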
Modules
Learn about organizing code, and how to import functions and other things from other modules.
Using hoogle
A wonderful tool for finding functions and types that many other languages lack.
ADTs
Let's make our programs safer, easier to understand and more modular.
Pattern matching
Let's write our own control flow over ADTs.
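A small sketch combining the two (the Shape type is illustrative, not from the article):

```haskell
-- An ADT: a Shape is exactly one of these two things.
data Shape
  = Circle Double           -- radius
  | Rectangle Double Double -- width, height

-- Pattern matching as control flow: one equation per constructor,
-- and the compiler can warn us if we forget a case.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```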
Installing editor tools & Ghcid
In my opinion, fast feedback dramatically increases a programmer's productivity.
Some people will say Ghcid is all you need, but I think editor features
like error-at-point provide faster feedback than Ghcid when working on a single module,
so those are preferable. Though when making a big change across many modules,
it's hard to beat Ghcid at the moment.
newtypes
They are like a special case of data, but have some benefits over it, so
they're worth knowing as they are widely used.
Common data types: Maybe, Either
We use them (and often overuse them) a lot.
Either especially can be used for handling errors in uneffectful code
instead of IO exceptions.
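A common sketch of Either-based error handling in pure code (illustrative example, not from the article):

```haskell
-- Left describes the failure, Right carries the result.
-- No IO exceptions involved: the possibility of failure is in the type.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)
```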
A package manager
How can I generate random numbers? How do I parse a JSON string?
Send a file over the network? Write a graphical snake game?
The easiest way to accomplish these goals is to use an external library (a package, in Haskell),
and the easiest way to import a library, build it, and use it with ghc is
through a package manager.
There are several package managers; stack and cabal-install are the most common ones.
Make sure to read the docs. If you go the cabal-install path,
make sure you use the v2 version of the commands. Note that you are not tied to one
package manager: whichever you choose, you can always switch, or even use both if you'd like.
Finding packages
Finding the proper Haskell package for your use case can be tricky.
Most packages can be found on Hackage, and some can be found on Stackage.
It's possible to google the use case plus "Haskell", for example "Haskell json",
click on a few links, and see if the documentation makes sense and if the package is popular,
but sometimes it's easier to ask in Haskell circles.
Using the right data structures: text, containers, bytestring, vectors
A lot of Haskell programs are slower than one would expect because of choosing
the wrong algorithm or the wrong data structure: using String instead of Text,
using lists instead of HashMap, Map, or Vector when one needs fast lookup or indexing,
or instead of Set when one needs fast insertion (and order doesn't matter), or using nub to
remove duplicate elements from a list instead of Set.toList . Set.fromList.
Remember that using the right data structure is also useful for understanding the code,
as the choice of data structure reveals which algorithms are important for the program.
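The nub-versus-Set point above can be sketched like this (Data.Set comes from the containers package, which ships with GHC):

```haskell
import qualified Data.Set as Set

-- nub keeps first-occurrence order but is O(n^2);
-- going through a Set is O(n log n), at the cost of
-- returning the elements in ascending order instead.
dedup :: Ord a => [a] -> [a]
dedup = Set.toList . Set.fromList
```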
Parse, don't validate
"Parse, don't validate"
is an important pattern that helps us write code that is safer and more predictable.
Typeclasses and deriving typeclasses instances
One of the defining features of Haskell. Typeclasses add ad-hoc polymorphism and give us the ability
to create interfaces and abstractions over groups of types.
With typeclasses we can write functions like sort :: Ord a => [a] -> [a],
which can sort a list of any type as long as its internal order is defined.
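For example (an illustrative sketch; the Priority type is made up), deriving Ord gives a type that "internal order", and sort then works on it for free:

```haskell
import Data.List (sort)

-- Deriving Eq and Ord gives Priority an order
-- (constructors compare in declaration order).
data Priority = Low | Medium | High
  deriving (Show, Eq, Ord)

sorted :: [Priority]
sorted = sort [High, Low, Medium]
```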
Numbers typeclasses
Working with numbers can be a bit complicated, because groups of numbers (such as integers, floating points, etc.)
are both similar and different. The numeric typeclass hierarchy tries to capture
their similarities and differences.
Common Typeclasses: Show, Read, Eq, Ord, Semigroup, Monoid
How do we display a value for debugging purposes as a String? How do we convert a
string to a value? How do we compare values? How do we append two values?
Kinds
Kinds are to types what types are to values. They create a sort of
separation between different types, and define where different kinds of types can be used.
Common Typeclasses: Foldable, Traversable, Functor, Applicative, Monad
These are the most common typeclasses that
define abstractions over certain types and their selected operations.
You will find them at the core of the APIs of many data structures, concurrency libraries,
parsers, and more.
Reader, State, Except
These modules help reduce boilerplate by emulating imperative-style programming
using their various typeclass interfaces (such as Functor and Monad).
They are frequently used in Haskell code in the wild so they're worth knowing.
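A minimal State sketch (using Control.Monad.Trans.State from the transformers package, which ships with GHC; the counter example is illustrative):

```haskell
import Control.Monad.Trans.State (State, execState, get, put)

-- Emulating a mutable counter: get reads the state, put replaces it.
tick :: State Int ()
tick = do
  n <- get
  put (n + 1)

-- Run three ticks starting from 0 and keep the final state.
three :: Int
three = execState (tick >> tick >> tick) 0
```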
forkIO, async, STM, TVar
Want to do some concurrency? Haskell has you covered.
Language extensions
The GHC compiler provides extra (often experimental) language features beyond what
standard Haskell provides. Some are nice to have, such as LambdaCase and NumericUnderscores;
some are very useful but can make valid Haskell code invalid, such as OverloadedStrings;
some provide really important functionality, such as FlexibleInstances and BangPatterns;
some are occasionally incredibly useful, such as TypeFamilies, but often aren't;
and some should probably not be used, such as NPlusKPatterns.
It's probably better to run into language extensions organically and learn them when you need to
rather than try to learn them all.
The GHC2021 proposal classifies Haskell extensions; it might be useful to go over it at some point.
Tests: tasty, hspec, hunit, doctest, etc
Ever heard that you don't need to write tests for Haskell?
Forget that: choose a unit testing framework, write tests, and enjoy
the guarantees that come from both strong static typing and tests!
Also learn about property-based testing and QuickCheck; it's beautiful.
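To show the idea without pulling in QuickCheck itself, here is a hand-rolled sketch of property-based testing: state a law, then check it over many inputs (QuickCheck generates the inputs for you instead of this fixed list):

```haskell
import Data.List (sort)

-- Two laws about sort, stated as properties.
prop_sortIdempotent :: [Int] -> Bool
prop_sortIdempotent xs = sort (sort xs) == sort xs

prop_sortKeepsLength :: [Int] -> Bool
prop_sortKeepsLength xs = length (sort xs) == length xs

-- Check the properties over a handful of sample inputs.
checkAll :: Bool
checkAll = all prop_sortIdempotent inputs && all prop_sortKeepsLength inputs
  where inputs = [[], [1], [3,1,2], [5,5,5], [9,8..1]]
```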
Monad transformers and mtl
While not strictly necessary, many programs and libraries in the wild use
monad transformers (types that provide a monadic interface and can be combined with one another),
and mtl, which is a library of interfaces to such capabilities.
Extras
Simple lenses
It took me maybe 3-4 years to finally think about learning lenses,
and even then I've only ever scratched the surface. While very powerful and expressive,
they aren't really necessary for building programs, but many programs and libraries
use basic lenses, so if (or when) you run into them in the wild, come back and learn them.
In my experience, view, get, over, ., lens, and makeLenses are all you need to know for most cases.
Profiling
At some point, the average Haskeller will write a program that doesn't perform well enough.
It's a good idea to learn how to figure out what makes a program slow,
but this can be postponed until later.
Template Haskell
Template Haskell (a Haskell metaprogramming framework) might creep
up on you sometimes as a user, but learning to write Template Haskell isn't usually necessary.
Lambda Calculus
You might have heard about the lambda calculus:
it is a minimalistic functional language that is at the heart of Haskell.
Learning the lambda calculus means learning about the power of functions,
how functions can be used to represent data, and how to evaluate
expressions using pen and paper.
I found it very interesting and eye-opening to see the core of
Haskell naked and exposed, without syntactic sugar, but it's not required to use or understand Haskell.
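The "functions representing data" idea can be sketched directly in Haskell with Church numerals (an illustrative example under the RankNTypes extension):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral is a number represented purely as a function:
-- n is "apply f, n times, to x".
type Church = forall a. (a -> a) -> a -> a

zero, one, two :: Church
zero _ x = x
one  f x = f x
two  f x = f (f x)

-- Addition: apply f, m times, on top of applying it n times.
add :: Church -> Church -> Church
add m n f x = m f (n f x)

-- Convert back to an ordinary Int by counting applications.
toInt :: Church -> Int
toInt n = n (+ 1) 0
```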
Summary
Haskell has a really high ceiling of concepts,
but one doesn't need to master Haskell to be productive with it.
Yes, there might be additional concepts to learn in order to use certain libraries,
but many applications and libraries can be (and have been) built on these concepts alone.
Don't believe me? Here are a few applications that (as far as I could tell; corrections are welcome!)
don't use anything more sophisticated than what I mentioned in this article:
Aura: A secure, multilingual package manager for Arch Linux and the AUR.
Elm compiler: Compiler for Elm, a functional language for reliable web apps.
Haskellweekly.news: Publishes curated news about the Haskell programming language.
Patat: Terminal-based presentations using Pandoc.
vado: A demo web browser engine written in Haskell.
We spent a few months last year building the groundwork for Haskellings in this YouTube series. Now, after some more hard work, we're happy to announce that Haskellings is available in beta. This program is meant to be an interactive tutorial for learning the Haskell language. If you've never written a line of Haskell in your life, this program is designed to help you take those first steps! You can take a look at the GitHub repository to learn all the details of using it, but here's a quick overview.
Overview
Haskellings gives you the chance to write Haskell code starting from the very basics with a quick evaluation loop. It currently has 50 different exercises for you to work with. In each exercise, you can read about a Haskell concept, make appropriate modifications, and then run the code with a single command to check your work!
You can do exercises individually, but the easiest way to do everything in order is to run the program in Watcher mode. In this mode, it will automatically tell you which exercise is next. It will also rerun each exercise every time you save your work.
Haskellings covers a decent amount of ground on basic language concepts. It starts with the very basics of expressions, types and functions, and goes through the basic usage of monads.
Haskellings is an open source project! If you want to report issues or contribute, learn how by reading this document! So go ahead, give it a try!
Don't Forget: Haskell From Scratch
Haskellings is a quick and easy way to learn the language basics, but it only touches on the surface of a lot of elements. To get a more in-depth look at the language, you should consider our Haskell From Scratch video course. This course includes:
Hours of video lectures explaining core language concepts and syntax
Dozens of practice problems to help you hone your skills
Access to our Slack channel, so you can get help and have your questions answered
A final mini-project, to help you put the pieces together
This course will help you build a rock-solid foundation for your future Haskell learnings. And even better, we've now cut the price in half! So don't miss out!
This is a hyperfunction (Launchbury, Krstic, and Sauerwein 2013; 2000; 2000), and I think it's one of the weirdest and most interesting newtypes you can write in Haskell.
The first thing to notice is that the recursion pattern is weird. For a type to refer to itself recursively on the left of a function arrow is pretty unusual, but on top of that the recursion is non-regular. That means that the recursive reference has different type parameters to its parent: a -&> b is on the left-hand side of the equals sign, but on the right we refer to b -&> a.
Being weird is reason enough to write about them, but what's really shocking about hyperfunctions is that they're useful. Once I saw the definition I realised that a bunch of optimisation code I had written (to fuse away zips in particular) was actually using hyperfunctions (Ghani et al. 2005). After that, I saw them all over the place: in coroutine implementations, queues, breadth-first traversals, etc.
Anyway, since coming across hyperfunctions a few months ago I thought I'd do a write-up on them. I'm kind of surprised they're not more well-known, to be honest: they're like a slightly more enigmatic Cont monad, with a far cooler name. Let's get into it!
What Are Hyperfunctions?
The newtype noise kind of hides what's going on with hyperfunctions: expanding the definition out might make things slightly clearer.
type a -&> b = (b -&> a) -> b
            = ((a -&> b) -> a) -> b
            = (((b -&> a) -> b) -> a) -> b
            = ((((... -> b) -> a) -> b) -> a) -> b
So a value of type a -&> b is kind of an infinitely left-nested function type. One thing worth noticing is that all the as are in negative positions and all the bs in positive. This negative and positive business basically refers to the position of arguments in relation to a function arrow: to the left are negatives, and to the right are positives, but two negatives cancel out.
((((... -> b) -> a) -> b) -> a) -> b
         +      -      +      -      +
All the things in negative positions are kind of like the things a function "consumes", and positive positions are the things it "produces". It's worth fiddling around with very nested function types to get a feel for this notion. For hyperfunctions, though, it's enough to know that a -&> b does indeed (kind of) take in a bunch of as, and it kind of produces bs.
By the way, one of the ways to get to grips with polarity in this sense is to play around with the Cont monad, codensity monad, or selection monad (Hedges 2015). If you do, you may notice one of the interesting parallels with hyperfunctions: the type a -&> a is in fact the fixpoint of the continuation monad (Fix (Cont a)). Suspicious!
Hyperfunctions Are Everywhere
Before diving further into the properties of the type itself, I'd like to give some examples of how it can show up in pretty standard optimisation code.
Zips
Let's say you wanted to write zip with foldr (I have already described this particular algorithm in a previous post). Not foldr on the left argument, mind you, but foldr on both. If you proceed mechanically, replacing every recursive function with foldr, you can actually arrive at a definition:
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (foldr yf yb ys)
  where
    xf x xk yk = yk x xk
    xb _ = []
    yf y yk x xk = (x,y) : xk yk
    yb _ _ = []
In an untyped language, or a language with recursive types, such a definition would be totally fine. In Haskell, though, the compiler will complain with the following:
• Occurs check: cannot construct the infinite type:
    t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]
Seasoned Haskellers will know, though, that this is not a type error: no, this is a type recipe. The compiler is telling you what parameters it wants you to stick in the newtype:
newtype Zip a b = Zip { runZip :: a -> (Zip a b -> [(a,b)]) -> [(a,b)] }

zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = xz yz
  where
    xz :: Zip a b -> [(a,b)]
    xz = foldr f b xs
      where
        f x xk yk = runZip yk x xk
        b _ = []
    yz :: Zip a b
    yz = foldr f b ys
      where
        f y yk = Zip (\x xk -> (x,y) : xk yk)
        b = Zip (\_ _ -> [])
And here we see the elusive hyperfunction: hidden behind a slight change of parameter order, Zip a b is in fact the same as [(a,b)] -&> (a -> [(a,b)]).
zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = invoke xz yz
  where
    xz :: (a -> [(a,b)]) -&> [(a,b)]
    xz = foldr f b xs
      where
        f x xk = Hyp (\yk -> invoke yk xk x)
        b = Hyp (\_ -> [])
    yz :: [(a,b)] -&> (a -> [(a,b)])
    yz = foldr f b ys
      where
        f y yk = Hyp (\xk x -> (x,y) : invoke xk yk)
        b = Hyp (\_ _ -> [])
BFTs
In another previous post I derived the following function to do a breadth-first traversal of a tree:
data Tree a = a :& [Tree a]

newtype Q a = Q { q :: (Q a -> [a]) -> [a] }

bfe :: Tree a -> [a]
bfe t = q (f t b) e
  where
    f :: Tree a -> Q a -> Q a
    f (x :& xs) fw = Q (\bw -> x : q fw (bw . flip (foldr f) xs))
    b :: Q a
    b = Q (\k -> k b)
    e :: Q a -> [a]
    e (Q q) = q e
That Q type there is another hyperfunction.
bfe :: Tree a -> [a]
bfe t = invoke (f t e) e
  where
    f :: Tree a -> ([a] -&> [a]) -> ([a] -&> [a])
    f (x :& xs) fw = Hyp (\bw -> x : invoke fw (Hyp (invoke bw . flip (foldr f) xs)))
    e :: [a] -&> [a]
    e = Hyp (\k -> invoke k e)
One of the problems I had with the above function was that it didn't terminate: it could enumerate all the elements of the tree, but it didn't know when to stop. A similar program (Allison 2006; described and translated to Haskell in Smith 2009) manages to solve the problem with a counter. Will it shock you to find out that this solution can also be encoded with a hyperfunction?
bfe t = invoke (f t e) e 1
  where
    f :: Tree a -> ((Int -> [a]) -&> (Int -> [a])) -> ((Int -> [a]) -&> (Int -> [a]))
    f (x :& xs) fw = Hyp (\bw n -> x : invoke fw (Hyp (invoke bw . flip (foldr f) xs)) (n+1))
    e :: (Int -> [a]) -&> (Int -> [a])
    e = Hyp b
      where
        b x 0 = []
        b x n = invoke x (Hyp b) (n-1)
(My version here is actually a good bit different from the one in Smith 2009, but the basic idea is the same.)
Coroutines
Hyperfunctions seem to me to be quite deeply related to coroutines. At the very least, several of the types involved in coroutine implementations are actual hyperfunctions. The ProdPar and ConsPar types from Pieters and Schrijvers (2019) are good examples:
newtype ProdPar a b = ProdPar (ConsPar a b -> b)
newtype ConsPar a b = ConsPar (a -> ProdPar a b -> b)
ProdPar a b is isomorphic to (a -> b) -&> b, and ConsPar a b to b -&> (a -> b), as witnessed by the following functions:
Conversion functions between ProdPar, ConsPar and hyperfunctions
fromP :: ProdPar a b -> (a -> b) -&> b
fromP (ProdPar x) = Hyp (x . toC)

toC :: b -&> (a -> b) -> ConsPar a b
toC (Hyp h) = ConsPar (\x p -> h (fromP p) x)

toP :: (a -> b) -&> b -> ProdPar a b
toP (Hyp x) = ProdPar (x . fromC)

fromC :: ConsPar a b -> b -&> (a -> b)
fromC (ConsPar p) = Hyp (\h x -> p x (toP h))
In fact this reveals a little about what was happening in the zip function: we convert the left-hand list to a ProdPar (producer) and the right-hand one to a consumer, and apply them to each other.
Hyperfunctions Are Weird
Aside from just being kind of weird intuitively, hyperfunctions are weird in theory. Set-theoretically, for instance, you cannot form the set of a -&> b: if you tried, you'd run into those pesky size restrictions which stop us from making things like "the set of all sets". Haskell types, however, are not sets, precisely because we can define things like a -&> b.
For slightly different reasons than the set-theory restrictions, we can't define the type of hyperfunctions in Agda. The following will get an error:
record _↬_ (A : Type a) (B : Type b) : Type (a ℓ⊔ b) where
  inductive; constructor hyp
  field invoke : (B ↬ A) → B
And for good reason! Agda doesn't allow recursive types where the recursive call is in a negative position. If we turn off the positivity checker, we can write Curry's paradox (example proof taken from here):
yes? : ⊥ ↬ ⊥
yes? .invoke h = h .invoke h

no! : (⊥ ↬ ⊥) → ⊥
no! h = h .invoke h

boom : ⊥
boom = no! yes?
Note that this isn't an issue with the termination checker: the above example passes all the normal termination conditions without issue (yes, even if ↬ is marked as coinductive). It's directly because the type itself is not positive.
Interestingly, there is a slightly different, and nearly equivalent, definition of hyperfunctions which doesn't allow us to write the above proof:
record _↬_ (A : Type a) (B : Type b) : Type (a ℓ⊔ b) where
  inductive; constructor hyp
  field invoke : ((A ↬ B) → A) → B
This is basically a slightly expanded-out version of the hyperfunction type, and importantly it's positive. Not strictly positive, however, since the recursive call does occur to the left of a function arrow: it's just positive, in that it's to the left of an even number of function arrows.
I found, in a blog post by Sjöberg (2015), some interesting discussion regarding this extra strictness: in Coq, allowing certain positive but not strictly positive types does indeed introduce an inconsistency (Coquand and Paulin 1990). However, this inconsistency relies on an impredicative universe, which Agda doesn't have. As far as I understand it, it would likely be safe to allow types like ↬ above in Agda (Coquand 2013), although I'm not certain that with all of Agda's newer features that's still the case.
The connection between non-strictly-positive types and breadth-first traversals has been noticed before: Berger, Matthes, and Setzer (2019) make the argument for their inclusion in Agda and Coq using a breadth-first traversal algorithm by Hofmann (1993), which uses the following type:
data Rou = Over | Next ((Rou -> [Int]) -> [Int])
Now this type isn't a hyperfunction (but it's close); we'll see soon what kind of thing it is.
Hyperfunctions Are a Category
So we've seen that hyperfunctions show up kind of incidentally through certain optimisations, and we've seen that they occupy a strange space in terms of their theoretical interpretation: we haven't yet seen much about the type itself in isolation. Luckily, Ed Kmett has already written the hyperfunctions package (2015), where a laundry list of instances is provided, which can tell us a little more about what hyperfunctions can actually do on their own.
The Category instance gives us the following:
instance Category (-&>) where
  id = Hyp (\k -> invoke k id)
  f . g = Hyp (\k -> invoke f (g . k))
We've actually seen the identity hyperfunction a few times: we used it as the base case for recursion in the breadth-first traversal algorithms.
We have actually used composition as well, but it's more obscured. An analogy to help clear things up is to think of hyperfunctions as a kind of stack: id is the empty stack, and we can use the following function to push items onto the stack:
push :: (a -> b) -> (a -&> b) -> (a -&> b)
push f q = Hyp (\k -> f (invoke k q))
Understood in this sense, composition acts like a zipping operation on stacks, since we have the following law:
push f p . push g q ≡ push (f . g) (p . q)
While we can't really pop elements off the top of the stack directly, we can get close with invoke, since it satisfies the following law:
invoke (push f p) q ≡ f (invoke q p)
Along with the id implementation above, this will let us run a hyperfunction, basically folding over the contents of the stack:
run :: (a -&> a) -> a
run f = invoke f id
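To make the stack analogy concrete, here is a small self-contained sketch: it uses a standalone Hyp newtype instead of the package's -&> operator, and puts a constant function at the bottom of the stack (rather than id) so that running the fold terminates:

```haskell
newtype Hyp a b = Hyp { invoke :: Hyp b a -> b }

-- The identity hyperfunction: the "empty stack".
self :: Hyp a a
self = Hyp (\k -> invoke k self)

-- Push a function onto the stack.
push :: (a -> b) -> Hyp a b -> Hyp a b
push f q = Hyp (\k -> f (invoke k q))

-- Run a hyperfunction: fold over the stack's contents.
run :: Hyp a a -> a
run f = invoke f self

-- A stack [ (+1), (*2) ] over a constant base of 5:
-- running it computes (+1) ((*2) 5).
example :: Int
example = run (push (+ 1) (push (* 2) (Hyp (const 5))))
```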
This analogy helps us understand how the breadth-first traversals worked: hyperfunctions are kind of like stacks with O(1) push and zip, which is precisely what you need for an efficient breadth-first traversal.
bfe :: Tree a -> [a]
bfe = run . f
  where
    f (x :& xs) = push (x:) (zips (map f xs))
    zips = foldr (.) id
Finally, hyperfunctions are of course monads:
instance Monad ((-&>) a) where
  m >>= f = Hyp (\k -> invoke (f (invoke m (Hyp (invoke k . (>>= f))))) k)
I won't pretend to understand what's going on here, but it looks a little like a nested reader monad. Perhaps there's some intuition to be gained from noticing that a -&> a ~ Fix (Cont a).
Hyper Arrows Are…?
As I said in the introduction, I'm kind of surprised there's not more research out there on hyperfunctions. Aside from the excellent papers by Launchbury, Krstic, and Sauerwein (2013) there's just not much out there. Maybe there's not that much theoretical depth to them, but all the same there are some clear questions worth looking into.
For example: is there a hyperfunction monad transformer? Or, failing that, can you thread a monad through the type at any point, and do you get anything interesting out?
I have made a little headway on this question while fiddling with one of the bfe definitions above. Basically, I wanted to remove the Int counter from the terminating bfe and use a Maybe somewhere instead. I ended up generalising from Maybe to any m, yielding the following type:
newtype HypM m a b = HypM { invokeM :: m ((HypM m a b -> a) -> b) }
This does the job for the breadthfirst traversal:
bfe t = r (f t e)
  where
    f :: Tree a -> HypM Maybe [a] [a] -> HypM Maybe [a] [a]
    f (x :& xs) fw =
      HypM (Just (\bw -> x : fromMaybe (\k -> k e) (invokeM fw) (bw . flip (foldr f) xs)))
    e :: HypM Maybe [a] [a]
    e = HypM Nothing
    r :: HypM Maybe [a] [a] -> [a]
    r = maybe [] (\k -> k r) . invokeM
(In fact, when m is specialised to Maybe, we have the same type as Rou.)
This type has a very practical use, as it happens, which is related to the Church-encoded list monad transformer:
newtype ListT m a = ListT { runListT :: forall b. (a -> m b -> m b) -> m b -> m b }
Just like -&> allowed us to write zip on folds (i.e. using foldr), HypM will allow us to write zipM on ListT:
zipM :: Monad m => ListT m a -> ListT m b -> ListT m (a,b)
zipM xs ys = ListT (\c n ->
  let xf x xk = pure (\yk -> yk (HypM xk) x)
      xb = pure (\_ -> n)
      yf y yk = pure (\xk x -> c (x, y) (join (invokeM xk <*> yk)))
      yb = pure (\_ _ -> n)
  in join (runListT xs xf xb <*> runListT ys yf yb))
I actually think this function could be used to seriously improve the running time of several of the functions on LogicT: my reading of them suggests that interleave is O(n²) (or worse), but the zip above could be trivially repurposed to give an O(n) interleave. This would also have knock-on effects on, for instance, >>- and so on.
Another question regards the arrows in the hyperfunction. We've seen that a hyperfunction kind of adds "stacking" to functions; can it do the same for other arrows? Basically, does the following type do anything useful?
newtype HypP p a b = HypP { invokeP :: p (HypP p b a) b }
Along a similar vein, many of the breadth-first enumeration algorithms seem to use "hyperfunctions over the endomorphism monoid". Basically, they all produce hyperfunctions of the type [a] -&> [a], and use them quite similarly to how we would use difference lists. But we know that there are Cayley transforms in other monoidal categories, for instance the applicative monoidal category: can we construct the "hyperfunction" version of those?
Coquand, Thierry, and Christine Paulin. 1990. "Inductively Defined Types." In COLOG-88, ed. by Per Martin-Löf and Grigori Mints, 50–66. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. doi:10.1007/3-540-52335-9_47.
Ghani, Neil, Patricia Johann, Tarmo Uustalu, and Varmo Vene. 2005. "Monadic Augment and Generalised Short Cut Fusion." In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming, 294–305. ICFP '05. New York, NY, USA: Association for Computing Machinery. doi:10.1145/1086365.1086403.
Hedges, Jules. 2015. "The Selection Monad as a CPS Transformation." arXiv:1503.06061 [cs] (March). http://arxiv.org/abs/1503.06061.
Launchbury, J., S. Krstic, and T. E. Sauerwein. 2013. "Coroutining Folds with Hyperfunctions." Electron. Proc. Theor. Comput. Sci. 129 (September): 121–135. doi:10.4204/EPTCS.129.9. http://arxiv.org/abs/1309.5135.
For the past few weeks I've been working on Strema, a hobby programming language I developed live on stream.
It's been a great experience and I'm proud of what I was able to accomplish in the short time I've worked on it.
I've decided to change Strema a bit and rename it to Giml (pronounced like the Hebrew letter Gimel).
I've been wanting to make Giml for a long time, and I had Giml in mind when I initially designed Strema. Building Strema was also kind of a prototype for Giml.
I feel the prototype was a success and Strema has grown sufficiently close to what I had in mind for Giml, so I've decided to shift gears and focus my interest on building Giml instead.
Giml resembles Haskell in terms of syntax and semantics more closely than Strema (for example, Giml is purely functional where Strema is not),
though Giml is also strict where Haskell is non-strict, and Giml is much leaner in terms of features than Haskell.
I will continue to stream work on Giml, at least for the next few weeks. I hope to make it a hobby programming language I can be proud of, and I hope people who've been interested in Strema will find Giml just as interesting, if not more :)
This is the story of an interesting flight of fancy with mathematics. I found it intriguing, and hope you do, as well.
The question
Here’s a fact that comes up in high school mathematics: you can demote multiplication into addition by using logarithms. That is:
That is, you can compute the log of a product, given only the logs of the factors.
To students today, this might seem like just another algebraic identity. But in the age before calculators, it was actually the main reason for a typical high school student to be interested in logarithms at all! Multiplication is more difficult than addition, so if you have a way to represent numbers that turns multiplication into addition, that helps. This is the whole principle behind doing multiplication with a slide rule, for example: one just converts to logarithms, adds the resulting distances, and then converts back.
Similarly, one can use logarithms to demote powers into multiplication:

log(x^y) = y · log(x)
But if we’re imagining a world where we work entirely with logarithms, it’s not entirely fair to just multiply by y, so I’m going to rewrite this (let’s agree that all logarithms are natural) as:
There’s an additional exponential function there, but if we take that as given, we can now compute the log of a power using only multiplication, the exponential function, and the logs of the inputs.
An interesting question to ask is: what about addition? The following does not work, although math teachers will recognize it as a very common mistake!

log(x + y) ≠ log(x) + log(y)
So, can we complete this equation?

log(x + y) = ?
At first glance, thinking of the logarithm as translating operations down one order (multiplication into addition, and exponents into multiplication), this seems to call for an operation an order lower than addition. What could fit in such a place?
Partial answer
We can start to answer this question using simple algebra and our existing identities. Let's assume x is not zero (since then it would have no logarithm anyway!), and then we can factor:

x + y = x · (1 + y/x)

So by applying the log rule for multiplication, we get this nifty little formula:

log(x + y) = log(x) + log(1 + y/x)
Notice that although the presentation here doesn’t look symmetric, it actually is. Swapping the x and y values doesn’t change the result.
Again, imagining that we have only the logarithms and not the actual values, that fraction at the end is sort of cheating. Just as I did with the multiplication formula, I'll introduce an explicit exponential, and it simplifies nicely.

log(x + y) = log(x) + log(1 + e^(log y − log x))
In order to write this more clearly, I'll name a new function, h, and define the sum in terms of that:

h(t) = log(1 + e^t)

log(x + y) = log(x) + h(log y − log x)
It’s true that we haven’t succeeded in getting rid of addition, but this is leading somewhere interesting. But what is this mysterious function h?
h: The soft rectified linear function
We can start to explore h by looking at a graph.
At first glance, it looks like h(x) is approximately zero for any input less than -2, and approximately x for any input greater than 2. This sounds like the so-called “rectified linear” function:

relu(x) = max(0, x)
Indeed, we can graph the two functions on the same axes, and see that they agree except near zero. (You can also verify this by reasoning about the formula. For inputs far less than zero, the exponential term becomes insignificant, while for inputs far greater than zero, the constant term becomes insignificant. This is the basis of a not-too-hard proof that these are asymptotes.)
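To back up that reasoning, here is a quick numerical check (a Python sketch; the name h follows the text):

```python
import math

def h(t):
    # the function from the text: h(t) = log(1 + e^t)
    return math.log1p(math.exp(t))

# Far below zero, h is nearly 0; far above zero, h(x) is nearly x.
# The two errors are in fact equal, since h(x) - x = h(-x) exactly.
print(h(-10))      # a tiny positive number
print(h(10) - 10)  # the same tiny number
```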
We can, therefore, think of h as a soft rectified linear function; what you get by just rounding out the rectified linear function around its sharp corner.
(This rectified linear function, incidentally, has been popularized in machine learning, where for reasons that depend on who you ask, it has turned out to be wildly successful as an activation function for artificial neural networks. Part of the reason for that success is that it is so simple it can be computed quickly. But that’s not enough to explain all of its success! I suspect another part of the reason is that it’s closely related to sums exactly in the sense of the very investigation we’re doing now.)
Back to the sum
So if h is so similar to the rectified linear function, what happens when you (inaccurately) use the rectified linear function itself in the sum formula above? Remarkably, you get this:

log(x) + max(0, log(y) - log(x)) = max(log(x), log(y))
In other words, in terms of logarithms, adding numbers is approximately the same as just taking the maximum! At least, it is when the difference between the numbers is large. That sort of makes sense, actually. If you add a very large number to a very small number, the result will indeed be approximately the same as the large number. (Remember that since we’re only thinking about numbers with logarithms, both inputs must be positive. We need not worry about the case where both numbers are large but with opposite signs.)
We can pull out of this pattern a sort of “soft” maximum function, which is almost like just giving the greater of its two arguments, but if the arguments are close then it rounds off the curve. Unfortunately, the phrase softmax already means something different and somewhat more complex to the machine learning community mentioned above, so perhaps we ought to call this something like smoothmax instead.
Then we have our answer:

log(x + y) = smoothmax(log(x), log(y))

where smoothmax(a, b) = a + h(b - a) = log(e^a + e^b).
It’s not easily computable, really, in the sense that products and powers were, but this still gives some intuition for the function that does compute the log of a sum, given the logs of the summands. Anyway, I’m satisfied enough with that answer.
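For the skeptical, the identity is easy to check numerically. A small Python sketch (the names h and smoothmax follow the text):

```python
import math

def h(t):
    # the soft rectified linear function: h(t) = log(1 + e^t)
    return math.log1p(math.exp(t))

def smoothmax(a, b):
    # the log of a sum, computed from the logs of the summands:
    # smoothmax(log x, log y) = log(x + y)
    return a + h(b - a)

x, y = 100.0, 3.0
print(smoothmax(math.log(x), math.log(y)))  # log(103)
print(math.log(x + y))                      # log(103) again
print(max(math.log(x), math.log(y)))        # close, because x and y are far apart
```

Note how the plain max is a good approximation here precisely because the two inputs are far apart, as the text observes.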
What about the algebra?
This tells us that this smoothmax function can play the role of addition in mathematical expressions. That implies that all of the algebraic properties of addition ought to hold for smoothmax, as well. That’s interesting!
For example, smoothmax ought to be commutative. That is:

smoothmax(a, b) = smoothmax(b, a)
Indeed, this is true. I made that observation above when first introducing the formula. One can also expect that smoothmax is associative. That is:

smoothmax(smoothmax(a, b), c) = smoothmax(a, smoothmax(b, c))
And, indeed, although the algebra is a little more complex, this turns out to be true, as well. In fact, we need not really show each of these with complicated algebra. We’ve already shown that smoothmax is addition, just using the logarithms to represent the numbers.
I think things get even more interesting when we consider the distributive property. Remember that when we work with logs, multiplication gets replaced with addition, so we have this:

x + smoothmax(a, b) = smoothmax(x + a, x + b)
Thinking of this as a softened maximum, this works out to be some kind of translation invariance property of the maximum: if you take the maximum of two numbers and then add x, that’s the same as adding x to each one and then taking the maximum! That intuitively checks out.
There are some things that don’t work, though.
You might also hope for something like an identity property, since for addition we have x + 0 = x. This one doesn’t turn out so well, because we cannot take the logarithm of zero! We end up wanting to write something like:

smoothmax(x, -∞) = x
This would make sense given the asymptotic behavior of the smoothmax function, but we’re playing sort of fast and loose with infinities there, so I wouldn’t call it a true identity. To say that correctly, you need limits.
You also need to be careful with expecting smoothmax to act like a maximum! For example:

smoothmax(x, x) = x + log(2) ≠ x
That’s weird… but not if you remember that smoothmax is at its least accurate when its two inputs are close together, so both inputs being the same is a worst case scenario. Indeed, that’s where the true max function has a nondifferentiable sharp corner that needed to be smoothed out. And, indeed, the exact behavior is given by addition, rather than maximums, and addition is not idempotent (i.e., adding a number to itself doesn’t give the same number back).
In fact, speaking of smooth maxing a number with itself:

smoothmax(x, smoothmax(x, ... smoothmax(x, x) ...)) = x + log(n)    (with n copies of x)
which resembles a sort of definition of addition of log-naturals as “repeated smoothmax of a number with itself”, in very much the same sense that multiplication by naturals can be defined as repeated addition of a number with itself, strengthening the notion that this operation is sort of one order lower than addition.
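That closing observation can also be checked numerically (again a Python sketch, reusing the smoothmax definition from the text):

```python
import math

def smoothmax(a, b):
    # smoothmax(a, b) = log(e^a + e^b)
    return a + math.log1p(math.exp(b - a))

x = 1.5

# smoothmaxing a number with itself adds log 2 ...
print(smoothmax(x, x) - x)  # log 2

# ... and folding smoothmax over n copies of x gives x + log n,
# just as repeated addition of x gives n * x.
acc = x
for n in range(2, 6):
    acc = smoothmax(acc, x)
    print(n, acc - x, math.log(n))  # the last two columns agree
```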
So there you have it. That’s as far as my flight of fancy goes. I found it interesting enough to share.
This blog post is the second in the Rust quickies series. In my training sessions, we often come up with quick examples to demonstrate some point. Instead of forgetting about them, I want to put short blog posts together focusing on these examples. Hopefully these will be helpful, enjoy!
FP Complete is looking for Rust and DevOps engineers. Interested in working with us? Check out our jobs page.
Hello Hyper!
For those not familiar, Hyper is an HTTP implementation for Rust, built on top of Tokio. It's a low level library powering frameworks like Warp and Rocket, as well as the reqwest client library. For most people, most of the time, using a higher level wrapper like these is the right thing to do.
But sometimes we like to get our hands dirty, and sometimes working directly with Hyper is the right choice. And definitely from a learning perspective, it's worth doing so at least once. And what could be easier than following the example from Hyper's homepage? To do so, cargo new a new project, add the following dependencies:
hyper = { version = "0.14", features = ["full"] }
tokio = { version = "1", features = ["full"] }
And add the following to main.rs:
use std::convert::Infallible;
use std::net::SocketAddr;
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};

async fn hello_world(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new("Hello, World".into()))
}

#[tokio::main]
async fn main() {
    // We'll bind to 127.0.0.1:3000
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));

    // A `Service` is needed for every connection, so this
    // creates one from our `hello_world` function.
    let make_svc = make_service_fn(|_conn| async {
        // service_fn converts our function into a `Service`
        Ok::<_, Infallible>(service_fn(hello_world))
    });

    let server = Server::bind(&addr).serve(make_svc);

    // Run this server for... forever!
    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}
If you're interested, there's a quick explanation of this code available on Hyper's website. But our focus will be on making an ever-so-minor modification to this code. Let's go!
Counter
Remember the good old days of Geocities websites, where every page had to have a visitor counter? I want that. Let's modify our hello_world function to do just that:
use std::sync::{Arc, Mutex};

type Counter = Arc<Mutex<usize>>; // Bonus points: use an AtomicUsize instead

async fn hello_world(counter: Counter, _req: Request<Body>) -> Result<Response<Body>, Infallible> {
    let mut guard = counter.lock().unwrap(); // unwrap poisoned Mutexes
    *guard += 1;
    let message = format!("You are visitor number {}", guard);
    Ok(Response::new(message.into()))
}
That's easy enough, and now we're done with hello_world. The only problem is rewriting main to pass in a Counter value to it. Let's take a first, naive stab at the problem:
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let counter: Counter = Arc::new(Mutex::new(0));
let make_svc = make_service_fn(|_conn| async {
    Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
});
let server = Server::bind(&addr).serve(make_svc);
if let Err(e) = server.await {
    eprintln!("server error: {}", e);
}
Unfortunately, this fails due to moving out of captured variables. (That's a topic we cover in detail in our closure training module.)
error[E0507]: cannot move out of `counter`, a captured variable in an `FnMut` closure
  --> src\main.rs:21:58
   |
18 |     let counter: Counter = Arc::new(Mutex::new(0));
   |         ------- captured outer variable
...
21 |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
   |                                                          ^^^^^^^ move occurs because `counter` has type `Arc<std::sync::Mutex<usize>>`, which does not implement the `Copy` trait

error[E0507]: cannot move out of `counter`, a captured variable in an `FnMut` closure
  --> src\main.rs:20:50
   |
18 |     let counter: Counter = Arc::new(Mutex::new(0));
   |         ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(|_conn| async {
   |  __________________________________________________^
21 |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
   |                                                          -------
   |                                                          |
   |                                                          move occurs because `counter` has type `Arc<std::sync::Mutex<usize>>`, which does not implement the `Copy` trait
   |                                                          move occurs due to use in generator
22 |     });
   | |_____^ move out of `counter` occurs here
Clone
That error isn't terribly surprising. We put our Mutex inside an Arc for a reason: we'll need to make multiple clones of it and pass those around to each new request handler. But we haven't called clone once yet! Again, let's do the most naive thing possible, and change hello_world(counter, req) to hello_world(counter.clone(), req).
This is where the error messages begin to get more interesting:
error[E0597]: `counter` does not live long enough
  --> src\main.rs:21:58
   |
20 |       let make_svc = make_service_fn(|_conn| async {
   |  ____________________________________-
   | |                                    value captured here
21 |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter.clone(), req)))
   |                                                          ^^^^^^^ borrowed value does not live long enough
22 |     });
   | |_____- returning this value requires that `counter` is borrowed for `'static`
...
29 | }
   |   - `counter` dropped here while still borrowed
Both async blocks and closures will, by default, capture variables from their environment by reference, instead of taking ownership. Our closure needs to have a 'static lifetime, and therefore can't hold onto a reference to data in our main function.
move all the things!
The standard solution to this is to simply sprinkle moves on each async block and closure. This will force each closure to own the Arc itself, not a reference to it. Doing so looks simple:

let make_svc = make_service_fn(move |_conn| async move {
    Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
});
And this does in fact fix the error above. But it gives us a new error instead:
error[E0507]: cannot move out of `counter`, a captured variable in an `FnMut` closure
  --> src\main.rs:20:60
   |
18 |     let counter: Counter = Arc::new(Mutex::new(0));
   |         ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
   |  ____________________________________________________________^
21 |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
   |                                                                -------
   |                                                                |
   |                                                                move occurs because `counter` has type `Arc<std::sync::Mutex<usize>>`, which does not implement the `Copy` trait
   |                                                                move occurs due to use in generator
22 |     });
   | |_____^ move out of `counter` occurs here
Double the closure, double the clone!
Well, even this error makes a lot of sense. Let's understand better what our code is doing:
1. Creates a closure to pass to make_service_fn, which will be called for each new incoming connection
2. Within that closure, creates a new closure to pass to service_fn, which will be called for each new incoming request on an existing connection
This is where the trickiness of working directly with Hyper comes into play. Each of those layers of closure needs to own its own clone of the Arc. And in our code above, we're trying to move the Arc from the outer closure's captured variable into the inner closure's captured variable. If you squint hard enough, that's what the error message above is saying. Our outer closure is an FnMut, which must be callable multiple times. Therefore, we cannot move out of its captured variable.
It seems like this should be an easy fix: just clone again!
let make_svc = make_service_fn(move |_conn| async move {
    let counter_clone = counter.clone();
    Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
});
And this is the point at which we hit a real head scratcher: we get almost exactly the same error message:
error[E0507]: cannot move out of `counter`, a captured variable in an `FnMut` closure
  --> src\main.rs:20:60
   |
18 |     let counter: Counter = Arc::new(Mutex::new(0));
   |         ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
   |  ____________________________________________________________^
21 |         let counter_clone = counter.clone();
   |                             -------
   |                             |
   |                             move occurs because `counter` has type `Arc<std::sync::Mutex<usize>>`, which does not implement the `Copy` trait
   |                             move occurs due to use in generator
22 |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
23 |     });
   | |_____^ move out of `counter` occurs here
The paradigm shift
What we need to do is to rewrite our code ever so slightly to reveal what the problem is. Let's add a bunch of unnecessary braces. We'll convert the code above into:

let make_svc = make_service_fn(move |_conn| { // outer closure
    async move { // async block
        let counter_clone = counter.clone();
        Ok::<_, Infallible>(service_fn(move |req| { // inner closure
            hello_world(counter_clone.clone(), req)
        }))
    }
});
The error message is basically identical, just slightly different source locations. But now I can walk through the ownership of counter more correctly. I've added comments to highlight three different entities in the code above that can take ownership of values via some kind of environment:
1. The outer closure, which handles each connection
2. An async block, which forms the body of the outer closure
3. The inner closure, which handles each request
In the original structuring of the code, we put move |_conn| async move next to each other on one line, which—at least for me—obfuscated the fact that the closure and async block were two completely separate entities. With that change in place, let's track the ownership of counter:
1. We create the Arc in the main function; it's owned by the counter variable.
2. We move the Arc from the main function's counter variable into the outer closure's captured variables.
3. We move the counter variable out of the outer closure and into the async block's captured variables.
4. Within the body of the async block, we create a clone of counter, called counter_clone. This does not move out of the async block, since the clone method only requires a reference to the Arc.
5. We move the Arc out of the counter_clone variable and into the inner closure.
6. Within the body of the inner closure, we clone the Arc (which, as explained in (4), doesn't move) and pass it into the hello_world function.
Based on this breakdown, can you see where the problem is? It's at step (3). We don't want to move out of the outer closure's captured variables. We try to avoid that move by cloning counter. But we clone too late! By using counter from inside an async move block, we're forcing the compiler to move. Hurray, we've identified the problem!
Non-solution: non-move async
It seems like we were simply overambitious with our "sprinkling move" attempt above. The problem is that the async block is taking ownership of counter. Let's try simply removing the move keyword there:
error: captured variable cannot escape `FnMut` closure body
  --> src\main.rs:21:9
   |
18 |     let counter: Counter = Arc::new(Mutex::new(0));
   |         ------- variable defined here
19 |
20 |       let make_svc = make_service_fn(move |_conn| {
   |                                                 - inferred to be a `FnMut` closure
21 | /         async {
22 | |             let counter_clone = counter.clone();
   | |                                 ------- variable captured here
23 | |             Ok::<_, Infallible>(service_fn(move |req| {
24 | |                 hello_world(counter_clone.clone(), req)
25 | |             }))
26 | |         }
   | |_________^ returns an `async` block that contains a reference to a captured variable, which then escapes the closure body
   |
   = note: `FnMut` closures only have access to their captured variables while they are executing...
   = note: ...therefore, they cannot allow references to captured variables to escape
The problem here is that the outer closure will return the Future generated by the async block. And if the async block doesn't move the counter, it will be holding a reference to the outer closure's captured variables. And that's not allowed.
Real solution: clone early, clone often
OK, undo the async move to async transformation, it's a dead end. It turns out that all we've got to do is clone the counter before we start the async move block, like so:
let make_svc = make_service_fn(move |_conn| {
    let counter_clone = counter.clone(); // this moved one line earlier
    async move {
        Ok::<_, Infallible>(service_fn(move |req| {
            hello_world(counter_clone.clone(), req)
        }))
    }
});
Now, we create a temporary counter_clone within the outer closure. This works by reference, and therefore doesn't move anything. We then move the new, temporary counter_clone into the async move block via a capture, and from there move it into the inner closure. With this, all of our closure captured variables remain unmoved, and therefore the requirements of FnMut are satisfied.
And with that, we can finally enjoy the glory days of Geocities visitor counters!
Async closures
The formatting recommended by rustfmt hides away the fact that there are two different environments at play between the outer closure and the async block, by moving the two onto a single line with move |_conn| async move. That makes it feel like the two entities are somehow one and the same. But as we've demonstrated, they aren't.
Theoretically this could be solved by having an async closure. I tested with #![feature(async_closure)] on nightly-2021-03-02, but couldn't figure out a way to use an async closure to solve this problem differently than I solved it above. But that may be my own lack of familiarity with async_closure.
For now, the main takeaway is that closures and async blocks are two different entities, each with their own environment.
If you liked this post you may also be interested in:
2021-02-19: Okay, we have our ~120k wine reviews uploaded to a graph store, great! Did we do it correctly? Hm. Today's Haskell problem. Today's #haskell solution asks: "Duplicate wine reviews? What duplicate wine reviews? Blink, blink."
2021-02-17: Today's #haskell problem is yesterday's Haskell problem: saving out a CSV file, but this time, we're going to save the unicode points: 'good'! Today's #haskell solution: Yay! Properly encoded unicode!
2021-02-04: For today's #haskell problem we look at data duplication in the graph store (particularly: duplicated wine labels). Today's #haskell solution shows which wines are duplicates (and triplicates, and ... megalons?)
In October 2020, we got our new house. Built in 2013, it’s not just new to us but genuinely new, and this is a major upside. While I definitely can appreciate many aspects of old houses, the comfort of a modern house fits us first-time parents very well.
The site is pretty large for a regular house, at about 10 500 m². Eight years ago when the house was built, all the existing old trees were cut down. The weeds took over, as they often seem to do after deforestation if nothing else is planted. The two dominant species were blackberries (I’m not sure which kind) and “Cytisus scoparius.” Birch trees also managed to cover some ground.
It didn’t take long before we made other plans for this place. Specifically, I decided to try restoring the forest, in a way that makes sense in this area. That means a majority of oak and beech, some sections of birch, and a sprinkle of cherry, pine, and spruce.
Of course, I’ll be long gone before the grand result of this project arrives. Future generations can hopefully bask in the glory of its success. In the meantime, I’ll enjoy watching these trees grow up.
Please note that I’m a clumsy hobbyist. I do get solid advice from my sister who is a trained gardener, and from various other sources, but you shouldn’t trust anything I say or do.
Autumn Colors, All Around
When we arrived, nature was showing its autumn palette of yellow, red, and brown. The following photos are from walks in the area surrounding our house.
Time To Work
Alright, enough about the surrounding nature. Around mid-November I started executing on our plan. At first, I had no power tools or protective gear. I think that lasted about a day, after which I came in covered in scars.
The blackberries were everywhere. I mean it. They often grew up to a height of two meters by growing inside and across the Cytisus scoparius branches, creating what I can only describe as a jungle of barbed wire.
These two have become my mortal enemies, the main targets of this operation. Blackberries, while tasty, are a weed that is notoriously hard to get rid of. Cytisus scoparius is classified as invasive in some countries, though not in Sweden.
I cleared out a small portion, including some promising oaklings. Scarred though I was, my motivation was now at its peak. I ordered a clearing saw and a full set of protective gear.
While waiting for the delivery, I collected acorns down by the road and planted them in old planting boxes. Finally, I covered them in a thick layer of oak leaves, which sadly didn’t stop birds from enjoying the smörgåsbord. We’ll soon see how many of the 200-300 acorns make it. My hopes are low on this one.
When the saw arrived, I did a first pass on the area closest to the house. It took a while to find a good technique and to figure out which blade to use.
I soon realized that a more efficient workflow would be to do it in two separate phases:
1. Get rid of as much blackberry as possible
2. Cut down all Cytisus scoparius and other excess trees
And so it went.
Finally, the north side of the road was cleared. There remains a bunch of birch trees to be cut, but that can wait a bit.
At this point, I had some space to work with. There’s road construction planned at the corner of our lot, right through a little pond of self-seeded oaklings. Instead of them getting covered by gravel, I decided to move them to the newly cleared grounds. The first batch was around 25-30 trees, I think.
The Other Side
Awkwardly, the south side of the road remained uncleared. There, the Cytisus scoparius grew even taller and more dense.
Speaking of the multisport court – it’s perhaps not the most attractive part of this place, in my opinion. But I’m guessing our toddler might have different opinions on this matter, and so it could come in handy.
An audiobook and some sweat later, the south side was cleared.
We got help from relatives to pile up the remaining branches. There’s a little bit left to do but the main chunk of work is done.
Planting Oak and Beech
In a handful of spots there grew small oak and beech plants that needed moving. Either they grew too densely, below larger trees where they’d only get shade, or in areas that needed to be kept clear (the sides of the road, for example). I started moving them into newly cleared areas to get a nicer spread of trees across the entire area.
I’ve replanted these trees with marker sticks. This makes it easy to spot them. Also, I can support the plants by tying them to the stick, and use them to fix the net cages.
Pruning
Even with the fierce competition going on after the deforestation, a bunch of oak trees managed to break through. Some are already growing 4-5 m tall. Unfortunately they’ve grown very close to other plants and gotten intertwined, so I’ve spent some time pruning.
Pruning trees like these is usually done between July and September, but I wanted to give these a full season of unhindered growth. Hopefully I won’t regret that decision.
Springtime!
Last week, spring was declared in southern Sweden. Finally! I spent the weekend caging in small trees to (hopefully) protect them from wildlife, and to give them a more focused upwards growth. I’ve used cable ties to straighten and support the smaller trees, but I’m not sure if that’s a bad idea. Maybe I need to swap them for some softer rope.
Some of the larger trees had grown a bit sideways due to competition, so I bought heavy poles and used them for support.
And that’s where I am now, eagerly awaiting the warmth of spring and summer, and the blossoming of trees and flowers. We really don’t know what might show up!
If you’ve read this far, I hope you have enjoyed the first part of our journey. If so, let me know, and I might post more pictures and words on this subject. You can also follow me on Twitter, where I nowadays torture my followers with both tech and gardening tweets.
here is some surprising behavior seen with ghc 8.6.5 (Haskell compiler).
in the function "answer" at the end:
infix dollar sign works
using ($$) which is equal to ($) fails
defining ($$$) exactly as ($) is defined in Prelude fails
prefix dollar sign fails
in failures, always same two error messages, abbreviated here:
* Couldn't match type `m' with `m0'
* No instance for (KnownNat m0) arising from a use of `five'
i suspect infix $ is not a normal function; it seems to be getting special treatment by ghc.
{-# LANGUAGE RankNTypes #-}
-- next two needed for type signature of ($$$)
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
module Main where {
import qualified Numeric.Modular; -- https://hackage.haskell.org/package/modular
import GHC.TypeNats(KnownNat);
import GHC.Exts(TYPE);
main :: IO(); main = undefined;
five :: forall m . KnownNat m => Numeric.Modular.Mod m; five = Numeric.Modular.mkMod 5;
mod3 :: (forall m . KnownNat m => Numeric.Modular.Mod m) -> Integer; mod3 = Numeric.Modular.withMod 3;
($$) = ($);
-- copy the definition of ($) from Prelude
infixr 0 $$$;
($$$) :: forall r a (b :: TYPE r). (a -> b) -> a -> b;
f $$$ x = f x;
{-# INLINE ($$$) #-}; -- the semicolon is necessary!
-- compute 5 mod 3 = 2
answer :: Integer;
answer = mod3 five; -- works
-- answer = mod3 $ five; -- works
-- answer = mod3 $$ five; -- fails to compile
-- answer = mod3 $$$ five; -- fails to compile
-- answer = ($) mod3 five; -- fails to compile
}
As of this writing, I am the only person who has solved Origami on Open Kattis (see the problem statistics here). This is a shame since it is a very nice problem! Of course, I solved it in Haskell.
I challenge you to solve it; bonus points for using a fold in your solution!
Three years ago, I met fellow Haskeller Brian Hurt while working with Obsidian Systems on a Reflex project. Not long after, he started telling me his ideas about how to fix social media. These ideas intrigued me, one thing led to another, and we began building ChatWisely. We thought other Haskellers might like to hear about what we’re doing and how we use Haskell to do it.
ChatWisely is a membersupported miniblogging social network currently in open beta. We envision a place where people connect in a spirit of comradery. We see an elevated discourse and a way to show bullies the door by providing a platform to debate safely. Here’s how we’re doing it.
Safety First
Brian and I built several mechanisms to filter out people looking for a fight or to harm others. First and foremost will be the monthly subscription fee of one dollar. We think that will discourage a large portion of toxic people. Another is a sharable mute list that we believe will help mitigate the rest. And finally, in the event of a serious Terms of Service violation, we ban payment methods rather than just accounts.
MiniBlogging
The big idea here is that sometimes, a short post isn’t enough. Brian and I made a way to link that short post to a longer one. So the timeline looks something like Twitter’s, but with some posts that can expand to something more detailed. These can be connected to other people’s posts to create a continuity in conversation hard to come by on other platforms. So when another member’s post inspires you to write a longer one about your experience with the ghcjs ffi (for example), you can link your post to theirs.
Ownership of Your Timeline
Members can organize their timeline and choose to what extent they follow other people’s posts. The typical mainstream social network requires that when you follow someone you must follow everything they post, or nothing. Sure, there are filtered word lists in some cases. But none of it seems to work quite right. Instead, we have groups called caboodles that members can use to decide where other people’s posts fit, and how to share their own. So say someone likes their uncle’s cookie recipes but not his political posts. They can follow one but not the other.
Geolocated Messaging
One day this pandemic will be over and we’ll be there to meet that day. At that time, when a member’s movie caboodle wants to organize local screenings of the latest blockbuster from the House of Mouse they can make posts visible to people in their proximity. Perhaps you want to target local Haskellers to organize a meetup, or leave them a message that pops up when they’ve found the meeting place. Also, I think running a scavenger hunt with geolocated clues sounds like a hoot.
How It’s Built
Brian and I rely heavily on the Haskell ecosystem to build ChatWisely. Haskell’s type system reduces errors and cognitive load. GHC is our pair programming partner that tightens the delivery cycle and lets us learn how to build ChatWisely while we’re building it. Refactoring is a breeze, and unit testing is constrained to the I/O edges of the system, which means we spend less time on that and more time building the product. Here’s the principal tools we rely on to get the job done.
Ghcjs
The fact is we all hate javascript. The problem is, we can’t build web apps without it. Ghcjs lets us deal with the realities of building a product that uses a web browser for a client. The FFI lets us wrap our handwritten javascript in Haskell, which helps keep that part of the codebase pretty small. We especially love what is built on top of that: Reflex-Dom.
Reflex-Dom
Reflex helps us build a system that needs to adapt to changes in data, requirements and platform. We’re learning how to build ChatWisely as we build it, and reflex keeps up with our changing ideas on how to do that. Our first app store product will be a PWA, delivered with reflex.
Servant
Servant delivers the API, and requires us to separate definition from implementation. This helps us keep the backend from turning into a big ball of mud. We can autogenerate clients, which we currently use in unit testing. They even have a way to generate a reflex client, and we’ll be adding that in the near future.
Yesod
We use Yesod for marketing tasks, which largely require static pages. Currently it handles our landing page and unsubscribe mechanism for our emails. The Yesod Widget is a monoid, and therefore composable, which makes structuring html simple.
Three Reasons Why People Should Use ChatWisely.
We are member supported
We won’t have ads, which means we have no need to manipulate people’s timelines in order to serve those ads. Their timeline should be about conversations of interest.
We solve the tweet thread problem
Brian and I find tweet threads hard to follow. Our miniblog looks like twitter in the sense that you get a timeline of short posts. However if a post is the beginning of something more developed, that message can open up to access it.
Keep the real-world conversation going
We have delayed the development of these features for obvious reasons. But one day we’ll be together again. By then we’ll have useful geolocation tools for conference attendees and speakers to continue the conversation.
What makes a weekend conference fun for me are the conversations in between formal talks. I get all caught up in compelling conversations, and want to keep that going. We’ll have a way to do that, without having to know or remember anyone’s email address or phone number.
Conference speakers often want to build on the momentum gathered after a successful talk. Brian and I think Twitter hashtags are a terrible, but often the only, way to do this. We’ll have a way to use proximity and common interests to help build that momentum and keep everyone engaged.
We built ChatWisely as a response to the unpleasantness all too common on mainstream social networks. Depending on our membership for support creates the place we want to meet, because ad revenue and data mining motivate engagement, not conversation and connection. No one is fooled by what mainstream social networks call engagement: to us it looks like derailed conversations, confusing timelines we have only shallow control over, and unsafe situations.
Brian and I love the daily experience of building ChatWisely; the Haskell ecosystem brings joy to running our startup. You can support us on Patreon, and you should come by and test the beta. We look forward to hearing about any ideas or questions you may have.
We're back again with some more site improvements, this time to our Advanced Content. All of these series now have improved syntax highlighting and code blocks for better readability. In addition, we've revised three of them with updated companion code! Here's a summary.
Once you've mastered the foundations of the language, this series should be your first stop! It will walk you through several different libraries, demonstrating how you can perform some real-world tasks with Haskell, like connecting to a database and running a web server. You can follow along with all the code here on GitHub.
As a functional language, Haskell thrives on being able to seamlessly compose smaller pieces of code together into a large, coherent whole. Parsing libraries are one area where this paradigm fits very well. In this series, we go through a few different parsing libraries and compare them. The code is available in this repository if you want to try it out yourself!
A lot of coding projects involve connecting with outside services and APIs. Luckily, Haskell has a few libraries for interacting with these services! In this series, we'll explore integrations with Twilio and Mailgun so that we can send text messages and emails to your users! You can get a detailed breakdown of the code on GitHub. You can even fork the repository and run the code for yourself!
What's Coming Up?
Our next area of focus will be working on a first release of Haskellings, an interactive beginner tutorial for the language. We built this over the course of the last few months of 2020 in an extended video series that you can catch here on YouTube. The project is Open Source and currently available for contributions! Stay tuned for more updates on it!
This blog post is the first in a planned series I'm calling "Rust quickies." In my training sessions, we often come up with quick examples to demonstrate some point. Instead of forgetting about them, I want to put short blog posts together focusing on these examples. Hopefully these will be helpful, enjoy!
FP Complete is looking for Rust and DevOps engineers. Interested in working with us? Check out our jobs page.
Short circuiting a for loop
Let's say I've got an Iterator of u32s. I want to double each value and print it. Easy enough:
fn weird_function(iter: impl IntoIterator<Item=u32>) {
    for x in iter.into_iter().map(|x| x * 2) {
        println!("{}", x);
    }
}

fn main() {
    weird_function(1..10);
}
And now let's say we hate the number 8, and want to stop when we hit it. That's a simple one-line change:
fn weird_function(iter: impl IntoIterator<Item=u32>) {
    for x in iter.into_iter().map(|x| x * 2) {
        if x == 8 { return } // added this line
        println!("{}", x);
    }
}
Easy, done, end of story. For this reason, I recommend using for loops when possible, even though, coming from a functional programming background, they feel overly imperative. However, some people out there want to be more functional, so let's explore that.
for_each vs map
Let's forget about the shortcircuiting for a moment. And now we want to go back to the original version of the program, but without using a for loop. Easy enough with the method for_each. It takes a closure, which it runs for each value in the Iterator. Let's check it out:
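The code listing appears to have been dropped here; a minimal sketch of the for_each version, reconstructed from the surrounding description:

```rust
fn weird_function(iter: impl IntoIterator<Item = u32>) {
    // for_each drains the Iterator, running the closure on each value
    iter.into_iter().map(|x| x * 2).for_each(|x| println!("{}", x));
}

fn main() {
    weird_function(1..10);
}
```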
But why, exactly, do we need for_each? That seems awfully similar to map, which also applies a function over every value in an Iterator. Trying to make that change, however, demonstrates the problem. With this code:
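The listing seems to be elided here; the map version described would look something like the following sketch, which intentionally fails to compile:

```rust
fn weird_function(iter: impl IntoIterator<Item = u32>) {
    // error[E0308]: mismatched types: this expression evaluates to a
    // `Map` iterator, but the function's return type is `()`
    iter.into_iter().map(|x| x * 2).map(|x| println!("{}", x))
}
```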
Undaunted, I fix this error by sticking a semicolon at the end of that expression. That generates a warning about an unused `Map` that must be used. And sure enough, running this program produces no output.
The problem is that map doesn't drain the Iterator. Said another way, map is lazy. It adapts one Iterator into a new Iterator. But unless something comes along and drains or forces the Iterator, no actions will occur. By contrast, for_each will always drain an Iterator.
One easy trick to force draining of an Iterator is with the count() method. This will perform some unnecessary work of counting how many values are in the Iterator, but it's not that expensive. Another approach would be to use collect. This one is a little trickier, since collect typically needs some type annotations. But thanks to a fun trick of how FromIterator is implemented for the unit type, we can collect a stream of ()s into a single () value. Meaning, this code works:
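The code block referenced here appears to be missing; a sketch of the collect-into-unit version:

```rust
fn weird_function(iter: impl IntoIterator<Item = u32>) {
    iter.into_iter()
        .map(|x| x * 2)
        .map(|x| println!("{}", x)) // each call of the closure yields ()
        .collect() // () implements FromIterator<()>, so this drains the Iterator
}

fn main() {
    weird_function(1..10);
}
```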
Note the lack of a semicolon at the end there. What do you think will happen if we add in the semicolon?
Short circuiting
EDIT: Enough people have asked "why not use take_while?" that I thought I'd address it. Yes, below, take_while will work for short circuiting. It's probably even a good idea. But the main goal in this post is to explore some funny implementation approaches, not recommend a best practice. And overall, despite some good arguments for take_while being a good choice here, I still stand by the overall recommendation to prefer for loops for simplicity.
With the for loop approach, stopping at the first 8 was a trivial, one-line addition. Let's do the same thing here:
fn weird_function(iter: impl IntoIterator<Item=u32>) {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return }
        println!("{}", x);
    }).collect()
}
Take a guess at what the output will be. Ready? OK, here's the real thing:
2
4
6
10
12
14
16
18
We skipped 8, but we didn't stop. It's the difference between a continue and a break inside the for loop. Why did this happen?
It's important to think about the scope of a return. It will exit the current function. And in this case, the current function isn't weird_function, but the closure inside the map call. This is what makes short-circuiting inside map so difficult.
The same exact comment will apply to for_each. The only way to stop a for_each from continuing is to panic (or abort the program, if you want to get really aggressive).
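As an aside not in the original text: the standard library also provides a sibling method, try_for_each, which does short-circuit when its closure returns None (or Err). A minimal sketch:

```rust
fn weird_function(iter: impl IntoIterator<Item = u32>) -> Option<()> {
    iter.into_iter().map(|x| x * 2).try_for_each(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(()) // keep going!
    })
}

fn main() {
    // prints 2, 4, 6, then stops and returns None
    println!("{:?}", weird_function(1..10));
}
```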
But with map, we have some ingenious ways of working around this and short-circuiting. Let's see it in action.
collect an Option
map needs some draining method to drive it. We've been using collect. I've previously discussed the intricacies of this method. One cool feature of collect is that, for Option and Result, it provides short-circuit capabilities. We can modify our program to take advantage of that:
fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<()> {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(()) // keep going!
    }).collect()
}
I put a return type on weird_function, though we could also use turbofish on collect and throw away the result. We just need some type annotation to say what we're trying to collect. Since collecting the underlying () values doesn't take up extra memory, this is even pretty efficient! The only cost is the extra Option. But that extra Option is (arguably) useful; it lets us know whether we short-circuited or not.
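For reference, the turbofish variant mentioned above might look like this sketch, with the result thrown away:

```rust
fn weird_function(iter: impl IntoIterator<Item = u32>) {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(()) // keep going!
    }).collect::<Option<()>>(); // turbofish supplies the annotation; result discarded
}
```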
But the story isn't so rosy with other types. Let's say our closure within map returns the x value. In other words, replace the last line with Some(x) instead of Some(()). Now we need to somehow collect up those u32s. Something like this would work:
fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<Vec<u32>> {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(x) // keep going!
    }).collect()
}
But that incurs a heap allocation that we don't want! And using count() from before is useless too, since it won't even short circuit.
But we do have one other trick.
sum
It turns out there's another draining method on Iterator that performs short circuiting: sum. This program works perfectly well:
fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<u32> {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(x) // keep going!
    }).sum()
}
The downside is that it's unnecessarily summing up the values. And maybe that could be a real problem if some kind of overflow occurs. Still, this mostly works. But is there some way we can stay functional, short circuit, and get no performance overhead? Sure!
Short
The final trick here is to create a new helper type for summing up an Iterator. But this thing won't really sum. Instead, it will throw away all of the values, and stop as soon as it sees a None. Let's see it in practice:
#[derive(Debug)]
enum Short {
    Stopped,
    Completed,
}

impl<T> std::iter::Sum<Option<T>> for Short {
    fn sum<I: Iterator<Item = Option<T>>>(iter: I) -> Self {
        for x in iter {
            if let None = x { return Short::Stopped }
        }
        Short::Completed
    }
}

fn weird_function(iter: impl IntoIterator<Item=u32>) -> Short {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(x) // keep going!
    }).sum()
}

fn main() {
    println!("{:?}", weird_function(1..10));
}
And voila! We're done!
Exercise It's pretty cheeky to use sum here. collect makes more sense. Replace sum with collect, and then change the Sum implementation into something else. Solution at the end.
Conclusion
That's a lot of work to be functional. Rust has a great story around short circuiting. And it's not just with return, break, and continue. It's with the ? try operator, which forms the basis of error handling in Rust. There are times when you'll want to use Iterator adapters, async streaming adapters, and functional-style code. But unless you have a pressing need, my recommendation is to stick to for loops.
If you liked this post, and would like to see more Rust quickies, let me know. You may also like these other pages:
use std::iter::FromIterator;

#[derive(Debug)]
enum Short {
    Stopped,
    Completed,
}

impl<T> FromIterator<Option<T>> for Short {
    fn from_iter<I: IntoIterator<Item = Option<T>>>(iter: I) -> Self {
        for x in iter {
            if let None = x { return Short::Stopped }
        }
        Short::Completed
    }
}

fn weird_function(iter: impl IntoIterator<Item=u32>) -> Short {
    iter.into_iter().map(|x| x * 2).map(|x| {
        if x == 8 { return None } // short circuit!
        println!("{}", x);
        Some(x) // keep going!
    }).collect()
}

fn main() {
    println!("{:?}", weird_function(1..10));
}