Planet Haskell
http://planet.haskell.org/

Michael Snoyman: What Makes Haskell Unique
https://www.snoyman.com/blog/2017/12/what-makes-haskell-unique
<p>I gave a talk today at the <a href="https://fby.by/">F(by) 2017 conference</a> in
Minsk, Belarus. The conference was great; I would definitely recommend
it in the future. Thank you very much to the organizers for the
opportunity to present on Haskell.</p><p>I prepared for this talk differently than I've prepared for other
talks in the past. I'm very comfortable writing up blog posts, but
have always found slide preparation difficult. This time around, I
wrote up the content in mostly-blog-post form first, and only created
the slides after that was complete. Overall, this worked very well for
me, and I'll try it again in the future. (If others want to share
their approaches to preparing talks, I'd definitely be happy to hear
them.)</p><p>As a result: I'm able to share the original write-up I did as
well. For those who saw the live talk (or the video): you may want to
skip towards the end, which covers some material that there wasn't
time for in the talk itself.</p><p>If you'd like to follow with
<a href="https://www.snoyman.com/reveal/what-makes-haskell-unique">the slides</a>,
they're also available.</p><hr /><p>My name is Michael Snoyman. I work at a company called FP
Complete. One of the things we do is help individuals and companies
adopt Haskell, and functional programming in general. And that leads
right into the topic of my talk today:</p><p><b>What makes Haskell unique</b></p><p>Programmers today have a large number of languages to choose from when
deciding what they will learn and use in their day to day coding. In
order to make intelligent decisions about which languages to pursue,
people need to be able to quickly learn and understand what
distinguishes one language from another.</p><p>Given that this is a functional programming conference, it's probably
no surprise to you that Haskell can be called a functional programming
language. But there are lots of languages out there that can be called
functional. Definitions vary, but let's take a particularly lax
version of functional programming: first class functions, and higher
order functions. Well, by this definition, even a language like C
counts! You may want to limit the definition further to include
syntactic support for closures, or some other features. Regardless,
the same point remains:</p><p><b>Haskell may be functional, but that doesn't make it unique</b></p><p>In fact, there's a long list of features I could rattle off that could
be used to describe Haskell.</p><ul><li>Functional</li><li>Statically typed</li><li>Pure</li><li>Lazy</li><li>Strongly typed</li><li>Green threads</li><li>Native executables</li><li>Garbage collected</li><li>Immutability</li></ul><p>Some of these features, like being pure and lazy, are relatively rare
in mainstream languages. Others, however, are commonplace. What I'm
going to claim is that not one of these features is enough to motivate
new people to Haskell—including people in this audience—to
start using it. Instead:</p><p><b>It's the combination of these features that makes Haskell unique</b></p><p>As an example: the intersection of purity, strong typing, and
functional programming style lends itself to a high
level form of expression which is simultaneously easy to write, easy
to read, easy to modify, and efficient. I want to share some code
examples in Haskell that demonstrate how the language
encourages you to write code differently from other languages. And I'm
going to try to claim that this "different" style is awesome, though
it also has some downsides.</p><h2 id="async-io-and-concurrency">Async I/O and Concurrency</h2><p>Let's start off with a use case that's pretty popular today. Look at
this pseudocode and tell me what's wrong with it:</p><pre><code>json1 := httpGet(url1)
json2 := httpGet(url2)
useJsonBodies(json1, json2)</code></pre><p>Given the heading of this slide, you may have guessed it: this is
blocking code. It will tie up an entire thread waiting for the
response body from each of these requests to come back. Instead, we
should be using asynchronous I/O calls to allow more efficient usage
of system resources. One common approach is to use callbacks:</p><pre><code>httpGetA(url1, |json1| =>
  httpGetA(url2, |json2| =>
    useJsonBodies(json1, json2)
  )
)</code></pre><p>You may recognize this coding style as "callback hell." There are
plenty of techniques in common languages to work around that, usually
around the idea of promises or futures. And you may have heard
something about how JavaScript promises are a monad, and expect me to
be talking about how Haskell does monads better. But I'm not going to
do that at all. Instead, I want to show you what the asynchronous
version of the code looks like in Haskell:</p><pre><code class="haskell">json1 <- httpGet url1
json2 <- httpGet url2
useJsonBodies json1 json2</code></pre><p>This may surprise you, since this looks exactly like the blocking
pseudocode I showed above. It turns out that Haskell has a powerful
runtime system. It will automatically convert your blocking-style code
into asynchronous system calls, and automatically handle all of the
work of scheduling threads and waking them up when data is available.</p><p>This is pretty great, but it's hardly unique to Haskell. Erlang and
Go, as two popular examples, both have this as well. If we want to see
what makes Haskell different...</p><p>we have to go deeper.</p><h3 id="concurrency">Concurrency</h3><p>It's pretty lame that we need to wait for our first HTTP request to
complete before even starting our second. What we'd like to do is kick
off both requests at the same time. You may be imagining some really
hairy APIs with threads, and mutable variables, and locks. But here's
how you do this in Haskell:</p><pre><code class="haskell">(json1, json2) <- concurrently
    (httpGet url1)
    (httpGet url2)
useJsonBodies json1 json2</code></pre><p>Haskell has a green thread implementation which makes forking threads
cheap. The <code>async</code> library provides a powerful, high-level interface
for performing actions in parallel without bothering with the low-level
aspects of locking primitives and mutable variables. And this builds
naturally on top of the async I/O system already described, keeping
system resource usage low.</p><h3 id="canceling">Canceling</h3><p>What we've seen already is elegant in Haskell, but it's not terribly
difficult to achieve in other languages. Let's take it to the next
level. Instead of needing both JSON response bodies, we only need one:
whichever one comes back first. In pseudocode, this might look like:</p><pre><code>promise1 := httpGet(url1)
promise2 := httpGet(url2)
result := newMutex()
promise1.andThen(|json1| =>
    result.set(json1)
    promise2.cancel())
promise2.andThen(|json2| =>
    result.set(json2)
    promise1.cancel())
useJsonBody(result.get())</code></pre><p>This code is tedious and error prone, but it gets the job done. As you
can probably guess, there's a simple API for this in Haskell:</p><pre><code class="haskell">eitherJson <- race
    (httpGet url1)
    (httpGet url2)
case eitherJson of
    Left json1 -> useJsonBody1 json1
    Right json2 -> useJsonBody2 json2</code></pre><p>At first, this may seem like it's just a well designed API. But
there's quite a bit more going on under the surface. The Haskell
runtime system itself supports the idea of an asynchronous exception,
which allows us to cancel any other running thread. This feature is
vital to making <code>race</code> work.</p><p>And here's the final piece in the puzzle. All of the thread scheduling
and canceling logic I've described doesn't just apply to async I/O
calls. It works for CPU-intensive tasks as well. That means you can
fork thousands of threads, and even if one of them is busy performing
computation, other threads will not be starved. Plus, you can
interrupt these long-running computations:</p><pre><code class="haskell">let tenSeconds = 10 * 1000 * 1000
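-- timeout :: Int -> IO a -> IO (Maybe a)   (from System.Timeout)
-- The argument is a number of microseconds, hence the multiplication
-- above; Nothing is returned if the time limit fires before the
-- computation finishes.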
timeout tenSeconds expensiveComputation</code></pre><h3 id="summary:-concurrency-and-async-io">Summary: concurrency and async I/O</h3><p><b>Advantages</b></p><ul><li>Cheap threads</li><li>Simple API</li><li>Highly responsive</li></ul><p><b>Disadvantages</b></p><ul><li>Complicated runtime system</li><li>Need to be aware of async exceptions when writing code</li></ul><h2 id="immutability-and-purity">Immutability and purity</h2><p>Most programming languages out there default to mutability: a variable
or field in a data structure can be changed at any time. Haskell is
different in two ways:</p><ol><li>Values are immutable by default, and mutability must be explicitly
indicated with a variable type</li><li>Mutating a mutable variable is considered a side effect, and that
mutation is tracked by the type system</li></ol><p>For example, the following Haskell-like code is impossible:</p><pre><code class="haskell">let mut total = 0
    loop i =
      if i > 1000000
        then total
        else total += i; loop (i + 1)
 in loop 1</code></pre><p>From pure code, we cannot create, read, or modify a mutable
variable. We also need to say what kind of mutable variable we want:</p><pre><code class="haskell">total <- newIORef 0
let loop i =
      if i > 1000000
        then readIORef total
        else do
          modifyIORef total (+ i)
          loop (i + 1)
loop 1</code></pre><p>This is a lot of ceremony for a simple algorithm. Of course, the
recommended Haskell way of doing this would be to avoid mutable
variables, and use a more natural functional style.</p><pre><code class="haskell">let loop i total =
      if i > 1000000
        then total
        else loop (i + 1) (total + i)
 in loop 1 0</code></pre><p>Besides pushing us towards this supposedly better functional approach,
why is immutable, pure code such a nice thing?</p><h3 id="reasoning-about-code">Reasoning about code</h3><p>You'll often hear Haskellers throw around a phrase "reasoning about
code." Personally, I think the phrase is used to mean too many
different things. But let me give you an example that I think is
accurate. Let's look at some pseudocode:</p><pre><code>// results.txt
Alice,32
Bob,55
Charlie,22

func main() {
    results := readResultsFromFile("results.txt")
    printScoreRange(results)
    print("First result was by: " + results[0].name)
}

func printScoreRange(results: Vector<TestResult>) {
    ...
}</code></pre><p>If you look at the code above, what do you expect the output to be? I
think it would be reasonable to guess something like:</p><pre><code>Lowest: 22
Highest: 55
First result was by: Alice</code></pre><p>However, now let's throw in another piece of information: the
definition of <code>printScoreRange</code>:</p><pre><code>func printScoreRange(results: Vector<TestResult>) {
    results.sortBy(|result| => result.score)
    print("Lowest: " + results[0].score)
    print("Highest: " + results[results.len() - 1].score)
}</code></pre><p>Suddenly our assumptions change. We can see that this function mutates
the <code>results</code> value passed to it. If we're passing mutable references
to vectors in this made up language, then our output is going to look
more like:</p><pre><code>Lowest: 22
Highest: 55
First result was by: Charlie</code></pre><p>This is because the original <code>results</code> value in our <code>main</code> function has been
modified. This is what I mean by hurting our ability to reason about
the code: it's no longer sufficient to look at just the <code>main</code>
function to understand what will be happening. Instead, we're required
to understand what may possibly be occurring in the rest of our
program to mutate our variables.</p><p>In Haskell, the code would instead look like:</p><pre><code class="haskell">main :: IO ()
main = do
    results <- readResultsFromFile "results.txt"
    printScoreRange results
    putStrLn $ "First result was by: " ++ name (head results)

printScoreRange :: [TestResult] -> IO ()
printScoreRange results = do
    let results' = sortOn score results
    putStrLn $ "Lowest: " ++ show (score (head results'))
    putStrLn $ "Highest: " ++ show (score (last results'))</code></pre><p>We know that it's impossible for <code>printScoreRange</code> to modify the
<code>results</code> value we have in the <code>main</code> function. Looking at only this
bit of code in <code>main</code> is sufficient to know what will happen with the
<code>results</code> value.</p><h3 id="data-races">Data races</h3><p>Even more powerful than the single threaded case is how immutability
affects multithreaded applications. Ignoring the insanity of multiple
threads trying to output to the console at the same time, we can
easily parallelize our code:</p><pre><code class="haskell">main :: IO ()
main = do
    results <- readResultsFromFile "results.txt"
    concurrently_
        (printFirstResult results)
        (printScoreRange results)

printFirstResult results =
    putStrLn $ "First result was by: " ++ name (head results)

printScoreRange results = do
    let results' = sortOn score results
    putStrLn $ "Lowest: " ++ show (score (head results'))
    putStrLn $ "Highest: " ++ show (score (last results'))</code></pre><p>There's no need to worry about concurrent accesses to data
structures. It's impossible for the other threads to alter our
data. If you do want other threads to affect your local data, you'll
need to be more explicit about it, which we'll get back to.</p><h3 id="mutability-when-needed">Mutability when needed</h3><p>One thing you may be worried about is how this affects
performance. For example, it's much more efficient to sort a vector
using mutable access instead of only pure operations. Haskell has two
tricks for that. The first is the ability to explicitly create mutable
data structures, and mutate them in place. This breaks all of the
guarantees I already mentioned, but if you need the performance, it's
available. And unlike mutable-by-default approaches, you now know
exactly which pieces of data you need to handle with care when coding
to avoid tripping yourself up.</p><p>The other approach is to create a mutable copy of the original data,
perform your mutable algorithm on it, and then freeze the new copy
into an immutable version. With sorting, this looks something like:</p><pre><code class="haskell">sortMutable :: MutableVector s a -> ST s ()
sortMutable = ... -- a normal in-place sorting algorithm

sortImmutable :: Vector a -> Vector a
sortImmutable orig = runST $ do
    mutable <- newMutableVector (length orig)
    copyValues orig mutable
    sortMutable mutable
    freeze mutable</code></pre><p><code>ST</code> is something we use to have temporary and local mutable
effects. Because of how it's implemented, we know that none of the
effects can be visible from outside of our function, and that for the
same input, the <code>sortImmutable</code> function will always have the same
output. While this approach requires an extra memory buffer and an
extra copy of the elements in the vector, it completely avoids the
worries of your data being changed behind your back.</p><h3 id="summary:-immutability-and-purity">Summary: immutability and purity</h3><p><b>Advantages</b></p><ul><li>Easier to reason about code</li><li>Avoid many cases of data races</li><li>Functions are more reliable, returning the same output for the same
input</li></ul><p><b>Disadvantages</b></p><ul><li>Lots of ceremony if you actually want mutation</li><li>Some runtime performance hit for mutable algorithms</li></ul><h2 id="software-transactional-memory">Software Transactional Memory</h2><p>Let's say you actually need to be able to mutate some values. And for
fun, let's say you want to do this from multiple threads. A common
example of this is a bank. Let's again play with some pseudocode:</p><pre><code>runServer (|request| => {
    from := accounts.lookup(request.from)
    to := accounts.lookup(request.to)
    accounts.set(request.from, from - request.amt)
    accounts.set(request.to, to + request.amt)
})</code></pre><p>This looks reasonable, except that if two requests come in at the same
time for the same account, we can end up with a race
condition. Consider something like this:</p><pre><code>Thread 1: receive request: Alice gives $25
Thread 2: receive request: Alice receives $25
Thread 1: lookup that Alice has $50
Thread 2: lookup that Alice has $50
Thread 1: set Alice's account to $25
Thread 2: set Alice's account to $75</code></pre><p>We know that we want Alice to end up with $50, but because of our data
race, Alice ends up with $75. Or, if the threads ran differently, it
could be $25. Neither of these is correct. In order to avoid this, we
would typically deal with some kind of locking:</p><pre><code>runServer (|request| => {
    accounts.lock(request.from)
    accounts.lock(request.to)
    // same code as before
    accounts.unlock(request.from)
    accounts.unlock(request.to)
})</code></pre><p>Unfortunately, this leads to deadlocks! Consider this scenario:</p><pre><code>Thread 1: receive request: $50 from Alice to Bob
Thread 2: receive request: $50 from Bob to Alice
Thread 1: lock Alice
Thread 2: lock Bob
Thread 1: try to lock Bob, but can't, so wait
Thread 2: try to lock Alice, but can't, so wait
...</code></pre><p>This kind of problem is the bane of many concurrent programs. Let me
show you another approach. As you may guess, here's some Haskell:</p><pre><code class="haskell">runServer $ \request -> atomically $ do
    let fromVar = lookup (from request) accounts
        toVar = lookup (to request) accounts
    origFrom <- readTVar fromVar
    writeTVar fromVar (origFrom - amt request)
    origTo <- readTVar toVar
    writeTVar toVar (origTo + amt request)</code></pre><p>There are helper functions to make this shorter, but I wanted to do
this the long way to prove a point. This looks like <i>exactly</i> the kind
of race condition I described before. However, that <code>atomically</code>
function is vital here. It ensures that only a complete transaction is
ever committed. If any of the variables we touch are mutated by
another thread before our transaction is complete, all of our changes
are rolled back, and the transaction is retried. No need for explicit
locking, and therefore far fewer worries about data races and
deadlocks.</p><p>A <code>TVar</code> is a "transactional variable." It's an alternative to the
<code>IORef</code> that I mentioned earlier. There are other kinds of mutable
variables in Haskell, including channels and <code>MVar</code>s which are like
mutexes. This is what I meant when I said you need to be explicit
about what kind of mutation you want in Haskell.</p><h3 id="purity39s-role">Purity's role</h3><p>What do you think will happen with this program:</p><pre><code class="haskell">atomically $ do
    buyBitcoins 3 -- side effects on my bank account
    modifyTVar myBitcoinCount (+ 3)</code></pre><p>Here, <code>buyBitcoins</code> goes off to some exchange and buys about
$100,000 in bitcoin (or whatever ridiculous amount they're selling for
now). I said before that, if the variables are modified while running,
the transaction will be retried. It seems like this function is very
dangerous, as it may result in me going about $10,000,000 into debt
buying bitcoins!</p><p>This is where purity steps in. Inside <code>atomically</code>, you are not
allowed to perform any side effects outside of STM itself. That means
you can modify <code>TVar</code>s, but you cannot read or write files, print to the
console, fire the missiles, or place multi million dollar currency
purchases. This may feel like a limitation, but the tradeoff is that
it's perfectly safe for the runtime system to retry your transactions
as many times as it wants.</p><h3 id="summary-of-stm">Summary of STM</h3><p><b>Advantages</b></p><ul><li>Makes concurrent data modification much easier</li><li>Bypass many race conditions and deadlocks</li></ul><p><b>Disadvantages</b></p><ul><li>Depends on purity to work at all</li><li>Not really a disadvantage, you're already stuck with purity in
Haskell</li><li>Not really any other disadvantages, so just use it!</li></ul><h2 id="laziness">Laziness</h2><p>It's a little cheeky of me to get this far into a talk about unique
features of Haskell and ignore one of its most notable features:
laziness. Laziness is much more of a double-edged sword than the other
features I've talked about, and let me prove that by revisiting one of
our previous examples.</p><pre><code class="haskell">let loop i total =
      if i > 1000000
        then total
        else loop (i + 1) (total + i)
 in loop 1 0</code></pre><p>I didn't describe it before, but this function will sum up the numbers
from 1 to 1,000,000. There are two problems with this function:</p><ol><li>There's a major performance bug in it</li><li>It's much more cumbersome than it should be</li></ol><h3 id="space-leaks">Space leaks</h3><p>The bane of laziness is space leaks, something you've probably heard
about if you've read at all about Haskell. To understand this, let's
look at how laziness is implemented. When you say something like:</p><pre><code class="haskell">let foo = 1 + 2</code></pre><p><code>foo</code> doesn't actually contain <code>3</code> right now. Instead, it contains an
instruction to apply the operator <code>+</code> to the values <code>1</code> and <code>2</code>. This
kind of instruction is called a <i>thunk</i>. And as you might guess,
storing the thunk is a lot more expensive than storing a simple
integer. We'll see why this helps in a bit, but for now we just care
about why it sucks. Let's look at what happens in our <code>loop</code> function:</p><pre><code class="haskell">let loop i total =
      if i > 1000000
        then total
        else loop (i + 1) (total + i)
 in loop 1 0</code></pre><p>Each time we step through the loop, we have to compare <code>i</code> to the
number 1,000,000. Therefore, we are forced to evaluate it, which means
turning it into a simple integer. But we never look at the value of
<code>total</code>. Instead of storing a simple integer, which would be cheap, we
end up building a huge tree that looks like "add 1 to the result of
add 2 to the result of ... to 1,000,000." This is really bad: it uses
more memory and more CPU than we'd like.</p><p>We can work around this in Haskell by being explicit about which
values should be evaluated. There are a few ways to do this, but in
our case, the easiest is:</p><pre><code class="haskell">let loop i !total =
      if i > 1000000
        then total
        else loop (i + 1) (total + i)
 in loop 1 0</code></pre><p>All I've done is add an exclamation point in front of the <code>total</code>
argument. This is known as a bang pattern, and says "make sure this is
evaluated before running the rest of this function." The need to do
this in some cases is definitely a downside to Haskell's laziness. On
the other hand, as we'll see shortly, you often don't need to bother
if you use the right kinds of functions.</p><h3 id="laziness-is-awesome">Laziness is awesome</h3><p>Let's go back to pseudocode and rewrite our summation:</p><pre><code>total := 0
for (i := 1; i <= 1000000; i++) {
    total += i
}</code></pre><p>Pretty simple. But now let's modify this to only sum up the even
numbers:</p><pre><code>total := 0
for (i := 1; i <= 1000000; i++) {
    if (isEven(i)) {
        total += i
    }
}</code></pre><p>OK, that's fine. But now, let's sum up the values modulo 13 (for
some weird reason):</p><pre><code>total := 0
for (i := 1; i <= 1000000; i++) {
    if (isEven(i)) {
        total += i % 13
    }
}</code></pre><p>Each of these modifications is fine on its own, but at this point it's
getting harder to see the forest for the trees. And fortunately each
of these transformations was relatively simple. If some of the
requirements were more complicated, fitting it into the <code>for</code> loop may
be more challenging.</p><p>Let's go back to the beginning with Haskell. We saw how we could do it
with a loop, but let's see the real way to sum the numbers from 1 to
1,000,000:</p><pre><code class="haskell">-- Bad
let loop i !total =
      if i > 1000000
        then total
        else loop (i + 1) (total + i)
 in loop 1 0

-- Awesome!
sum [1..1000000]</code></pre><p>We use list range syntax to create a list with one million numbers in
it. On its face, this looks terrible: we need to allocate about 8 MB of
data to hold these integers, when this should run in constant
space. But this is exactly where laziness kicks in: instead of
allocating all of these values immediately, we allocate a thunk. Each
time we step through the list, our thunk generates one new integer and
a new thunk for the rest of the list. We're never using more than a
few machine words.</p><p>There are also other optimizations in GHC to avoid even allocating
those thunks, but that's not something I'm going to cover today.</p><p>Anyway, let's continue. We can easily tweak this to only add up the
even numbers:</p><pre><code class="haskell">sum (filter even [1..1000000])</code></pre><p>This uses the <code>filter</code> higher order function, and likewise avoids
allocating an entire list at once. And here is the silly modulo 13
trick:</p><pre><code class="haskell">sum (map (`mod` 13) (filter even [1..1000000]))</code></pre><p>Laziness is definitely a mixed bag, but combined with the functional
style of Haskell in general, it allows you to write higher level,
declarative code, while keeping great performance.</p><h3 id="short-circuiting-for-free">Short circuiting for free</h3><p>Lots of languages define <code>&&</code> and <code>||</code> operators which stop evaluation
early, e.g.:</p><pre><code>foo() && bar()</code></pre><p><code>bar</code> is only called if <code>foo</code> returns <code>true</code>. Haskell works the same way, but these operators aren't special; they just use laziness!</p><pre><code class="haskell">False && _ = False
True && x = x
True || _ = True
False || x = x</code></pre><p>This even scales up to functions working on lists of values, such as
<code>and</code>, <code>or</code>, <code>all</code>, and <code>any</code>.</p><h3 id="other-downsides">Other downsides</h3><p>There's one other downside to laziness, plus a historical
artifact. Laziness means that exceptions can be hiding inside any
thunk. This is also known as partial values and partial functions. For
example, what does this mean?</p><pre><code class="haskell">head []</code></pre><p>Generally speaking, partiality is frowned upon, and you should use
total functions in Haskell.</p><p>The historical artifact is that many bad functions are still easily
available, and they should be avoided. <code>head</code> is arguably an example
of that. Another is the lazy left fold function, <code>foldl</code>. In virtually
all cases, you should replace it with a strict left fold <code>foldl'</code>.</p><h3 id="summary-of-laziness">Summary of laziness</h3><p><b>Advantages</b></p><ul><li>More composable code</li><li>Get efficient results from combining high level functions</li><li>Short-circuiting like <code>&&</code> and <code>||</code> is no longer a special case</li></ul><p><b>Disadvantages</b></p><ul><li>Need to worry about space leaks</li><li>Exceptions can be hiding in many places</li><li>Unfortunately some bad functions like <code>foldl</code> still hanging around</li></ul><p><b>Side note</b> There's a major overlap with Python generators or Rust
iterators, but laziness in Haskell is far more pervasive than these
other approaches.</p><h2 id="others">Others</h2><p>Due to time constraints, I'm not going to be able to go into detail on
a bunch of other examples I wanted to talk about. Let me just throw
out some quick thoughts on them.</p><h3 id="parser-and-other-dsls">Parser (and other) DSLs</h3><ul><li>Operator overloading!</li><li>Abstract type classes like <code>Applicative</code> and <code>Alternative</code> are a natural
fit, e.g.: <code>parseXMLElement <|> parseXMLText</code>.</li><li>Able to reuse a huge number of existing library functions,
e.g. <code>optional</code>, <code>many</code></li><li>General purpose <code>do</code>-notation is great</li></ul><pre><code class="haskell">data Time = Time Hour Minutes Seconds (Maybe AmPm)
data AmPm = Am | Pm

parseAmPm :: Parser Time
parseAmPm = Time
    <$> decimal
    <*> (":" *> decimal)
    <*> (":" *> decimal)
    <*> optional (("AM" $> Am) <|> ("PM" $> Pm))</code></pre><p>c/o <a href="https://twitter.com/queertypes/status/941064338848100352">@queertypes</a></p><h3 id="advanced-techniques">Advanced techniques</h3><ul><li>Free monads</li><li>Monad transformer stacks</li><li>Lens, conduit, pipes, ...</li><li>Lots of ways to do things in Haskell!</li><li>It's a plus and a minus</li><li>Recommendation: choose a useful subset of Haskell and its libraries,
and define some best practices</li></ul><h2 id="conclusion">Conclusion</h2><ul><li>Haskell combines a lot of uncommon features</li><li>Very few of those features are unique</li><li>Combining those features allows you to write code very differently
than in other languages</li><li>If you want readable, robust, easy to maintain code: I think it's a
great choice</li><li>Be aware of the sharp edges: they do exist!</li></ul><h2 id="qampa">Q&A</h2><p><i>Sun, 17 Dec 2017 08:00:00 +0000</i></p><hr /><p><b>Robert Harper: A proof by contradiction is not a proof that derives a contradiction</b></p>
<p><a href="https://existentialtype.wordpress.com/2017/03/04/a-proof-by-contradiction-is-not-a-proof-that-derives-a-contradiction/">https://existentialtype.wordpress.com/2017/03/04/a-proof-by-contradiction-is-not-a-proof-that-derives-a-contradiction/</a></p>
<p>It is well-known that constructivists renounce “proof by contradiction”, and that classicists scoff at the critique. “Those constructivists,” the criticism goes, “want to rule out proofs by contradiction. How absurd! Look, Pythagoras showed that the square root of two is irrational by deriving a contradiction from the assumption that it is rational. There is nothing wrong with this. Ignore them!”</p>
<p>On examination that sort of critique fails, because <em>a proof by contradiction is not a proof that derives a contradiction</em>. Pythagoras’s proof is valid, one of the eternal gems of mathematics. No one questions the validity of that argument, even if they question proof by contradiction.</p>
<p>Pythagoras’s Theorem expresses a negation: <em>it is not the case that</em> the square root of two can be expressed as the ratio of two integers. Assume that it can be so represented. A quick deduction shows that this is impossible. So the assumption is false. Done. This is a<em> direct proof</em> of a negative assertion; it is <em>not</em> a “proof by contradiction”.</p>
<p>What, then, <i>is</i> a proof by contradiction? It is the <em>affirmation</em> of a positive statement by refutation of its denial. It is a <em>direct proof</em> of the negation of a negated assertion that is then pressed into service as a <em>direct proof</em> of the assertion, which it is not.<em> </em>Anyone is free to ignore the distinction for the sake of convenience, as a philosophical issue, or as a sly use of “goto” in a proof, but the distinction nevertheless exists and is important. Indeed, part of the beauty of constructive mathematics is that one can draw such distinctions. Once drawn, such distinctions can be disregarded; once blurred, forever blurred, a pure loss of expressiveness.</p>
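<p>The distinction can be made concrete in a proof assistant. Here is a small illustrative sketch in Lean 4 (the proposition <code>P</code> is an abstract placeholder): a direct proof of a negation is just a function into <code>False</code> and uses no classical axioms, while a proof by contradiction in the sense described above, passing from a refutation of <code>¬P</code> to <code>P</code> itself, needs <code>Classical.byContradiction</code>.</p><pre><code class="lean">-- Direct proof of a negation: ¬P unfolds to P → False, so refuting P
-- just means exhibiting such a function. No classical reasoning involved.
example (P : Prop) (h : P → False) : ¬ P := h

-- Proof by contradiction: from a refutation of ¬P, conclude P.
-- This direction genuinely requires classical logic.
example (P : Prop) (h : ¬ ¬ P) : P := Classical.byContradiction h</code></pre>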
<p>For the sake of explanation, let me rehearse a standard example of a genuine proof by contradiction. The claim is that there exist irrationals <em>a</em> and <em>b</em> such that <em>a</em> to the <em>b</em> power is rational. Here is an indirect proof, a true proof by contradiction. Let us prove instead that it is impossible that any two irrationals <em>a</em> and <em>b</em> are such that <em>a</em> to the <em>b</em> is irrational. This is a negative statement, so of course one proves it by deriving a contradiction from assuming that which is negated. Suppose, for a contradiction, that for every two irrationals <em>a</em> and <em>b, </em>the exponentiation <em>a</em> to the <em>b</em> power is irrational. We know from Pythagoras that root two is irrational, so plug it in for both <em>a</em> and <em>b</em>, and conclude that root two to the root two power is irrational. Now use the assumption again, taking <em>a</em> to be root two to the root two, and <em>b</em> to be root two. Calculate <em>a</em> to the power of <em>b</em>: it is two, which is eminently rational. Contradiction.</p>
<p>We have now proved that it is not the case that every pair of irrationals, when exponentiated, gives an irrational. There is nothing questionable about this proof. But it does not prove that there are two irrationals whose exponentiation is rational! If you think it does, then I ask you, please name them for me. That information is not in this proof (there are other proofs that do name them, but that is not relevant for my purposes). You may, if you wish, disregard the distinction I am drawing, that is your prerogative, and neither I nor anyone has any problem with that. But you cannot claim that it is a <em>direct proof</em>; it is rather an <em>indirect proof</em>, one that proceeds by refuting the negative of the intended assertion.</p>
<p>So why am I writing this? Because I have learned, to my dismay, that in U.S. computer science departments–of all places!–students are being taught, <em>erroneously,</em> that any proof that derives a contradiction is a “proof by contradiction”. It is not. Any proof of a negative must proceed by contradiction. A proof by contradiction in the long-established sense of the term is, contrarily, an indirect proof of a positive by refutation of the negative. This distinction is important, even if you want to “mod out” by it in your work, for it is only by drawing the distinction that one can even define the equivalence with which to quotient.</p>
<p>That’s my main point. But for those who may not be familiar with the distinction between direct and indirect proof, let me take the opportunity to comment on why one might care to draw such a distinction. It is a matter of honesty, of a sort: the information content of the foregoing indirect proof does not fulfill the expectation stated in the theorem. It is a kind of boast, an overstatement, to claim otherwise. Compare the original statement with the reformulation used in the proof. The claim that it is not the case that every pair of irrationals exponentiate to an irrational is uncontroversial. The proof proves it directly, and there is nothing particularly surprising about it. One would even wonder why anyone would bother to state it. Yet the supposedly equivalent claim stated at the outset appears much more fascinating, because most people cannot easily think up an example of two irrationals that exponentiate to rationals. Nor does the proof provide one. Once, when shown the indirect proof, a student of mine blurted out “oh that’s so cheap.” Precisely.</p>
<p>Why should you care? Maybe you don’t, but there are nice benefits to keeping the distinction, because it demarcates the boundary between constructive proofs, which have direct interpretation as functional programs, and classical proofs, which have only an indirect such interpretation (using continuations, to be precise, and giving up canonicity). Speaking as a computer scientist, this distinction matters, and it’s not costly to maintain. May I ask that you adhere to it?</p>
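<p>The remark that classical proofs have only an indirect computational reading, via continuations, can be made concrete in Haskell. The sketch below is mine, not from the post (the <code>Cont</code> type also lives in the <code>mtl</code> package as <code>Control.Monad.Cont</code>; it is inlined here to stay self-contained). The "law of the excluded middle" is realized by first answering <code>Right</code> with a refutation and, if that refutation is ever applied to a witness, jumping back to answer <code>Left</code> instead. No branch is ever honestly decided, which is exactly the loss of canonicity mentioned above.</p>

```haskell
-- A minimal continuation monad.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont c) = Cont $ \k -> c (k . f)

instance Applicative (Cont r) where
  pure a = Cont ($ a)
  Cont cf <*> Cont ca = Cont $ \k -> cf (\f -> ca (k . f))

instance Monad (Cont r) where
  Cont c >>= f = Cont $ \k -> c (\a -> runCont (f a) k)

-- call-with-current-continuation: invoking the captured continuation
-- abandons the local context and jumps back to the capture point.
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
callCC f = Cont $ \k -> runCont (f (\a -> Cont (\_ -> k a))) k

-- "Excluded middle", classically: first claim Right (a refutation);
-- if anyone applies that refutation to a witness, travel back in time
-- and claim Left (that very witness) instead.
excludedMiddle :: Cont r (Either a (a -> Cont r b))
excludedMiddle = callCC $ \k -> pure (Right (\a -> k (Left a)))

-- The caller can never catch the trick: probing the Right branch with
-- a witness lands us back in the Left branch holding that witness.
demo :: Int
demo = runCont excludedMiddle handle
  where
    handle (Left n)       = n
    handle (Right refute) = runCont (refute 42 :: Cont Int ()) (const 0)
```

<p>Here <code>demo</code> evaluates to <code>42</code>: the handler first sees <code>Right</code>, applies the refutation to <code>42</code>, and is rerun on <code>Left 42</code>. The answer depends on how the value is observed, which is why such a proof names no witness.</p>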
<p><em>Edit: </em><em>rewrote final paragraph, sketchy and irrelevant, and improved prose throughout. Word-smithing, typos.</em></p>
<p>Filed under: <a href="https://existentialtype.wordpress.com/category/programming/">Programming</a>, <a href="https://existentialtype.wordpress.com/category/research/">Research</a>, <a href="https://existentialtype.wordpress.com/category/teaching-2/">Teaching</a></p>Sat, 16 Dec 2017 19:20:08 +0000Mark Jason Dominus: Wasteful 
and frugal proofs in Ramsey theory
https://blog.plover.com/math/ramsey-waste.html
<p><a href="https://math.stackexchange.com/questions/2567469/among-any-11-integers-sum-of-6-of-them-is-divisible-by-6">This math.se
question</a>
asks how to show that, among any 11 integers, one can find a subset of
exactly six that add up to a multiple of 6. Let's call this
“Ebrahimi’s theorem”.</p>
<p>This was the last thing I read before I put away my phone and closed
my eyes for the night, and it was a race to see if I would find an
answer before I fell asleep. Sleep won the race this time. But the
answer is not too hard.</p>
<ol>
<li><p>First, observe that among any five numbers there are three that sum
to a multiple of 3: Consider the remainders of the five numbers upon
division by 3. There are three possible remainders. If all three
remainders are represented, then the remainders are $\{0, 1, 2\}$
and the sum of their representatives
is a multiple of 3. Otherwise there is some remainder with three
representatives, and the sum of these three is a multiple of 3.</p></li>
<li><p>Now take the 11 given numbers. Find a group of three whose sum is
a multiple of 3 and set them aside. From the remaining 8 numbers,
do this a second time. From the remaining 5 numbers, do it a third
time. </p></li>
<li><p>We now have three groups of three numbers that each sum to a
multiple of 3. Two of these sums must have the same parity. The
six numbers in those two groups have an even sum that is a multiple
of 3, and we win.</p></li>
</ol>
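<p>The three steps above translate directly into Haskell. This sketch is mine, not from the post; <code>threeDivThree</code> implements step 1, and <code>sixDivSix</code> chains steps 2 and 3 (the witness it finds need not match the one worked through in the example below).</p>

```haskell
import Data.List ((\\))

-- Step 1: among any five integers, three sum to a multiple of 3.
-- Either some residue class mod 3 holds at least three of them, or all
-- three residues occur and one representative of each works.
threeDivThree :: [Integer] -> [Integer]
threeDivThree xs =
  case filter ((>= 3) . length) classes of
    (c : _) -> take 3 c
    []      -> map head classes      -- all three residues occur
  where
    classes = [ [ x | x <- xs, x `mod` 3 == r ] | r <- [0, 1, 2] ]

-- Steps 2 and 3: peel three such groups off the 11 inputs, then join
-- the two groups whose sums share a parity (pigeonhole guarantees a
-- pair exists, so the head is safe).
sixDivSix :: [Integer] -> [Integer]
sixDivSix xs =
  let g1 = threeDivThree (take 5 xs)
      r1 = xs \\ g1
      g2 = threeDivThree (take 5 r1)
      r2 = r1 \\ g2
      g3 = threeDivThree (take 5 r2)
      groups = [g1, g2, g3]
  in head [ a ++ b
          | (i, a) <- zip [0 :: Int ..] groups
          , (j, b) <- zip [0 ..] groups
          , i < j
          , even (sum a + sum b) ]
```

<p>Applied to any list of 11 integers, <code>sixDivSix</code> returns six of them whose sum is divisible by 6.</p>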
<p>Here is a randomly-generated example:</p>
<p>$$3\quad 17\quad 35\quad 42\quad 44\quad 58\quad 60\quad 69\quad
92\quad 97\quad 97$$</p>
<p>Looking at the first 5 numbers $3\ 17\ 35\ 42\ 44$ we see that on
division by 3 these have remainders $0\ 2\ 2\ 0\ 2$. The remainder
$2$ is there three times, so we choose those three numbers $\langle 17\ 35\ 44\rangle$, whose sum is a multiple of 3, and set them aside.</p>
<p>Now we take the leftover $3$ and $42$ and supplement them with
three more unused numbers $58\ 60\ 69$. The remainders are $0\ 0\ 1\ 0\ 0$ so we take $\langle 3\ 42\ 60\rangle$ and set them aside as a second group.</p>
<p>Then we take the five remaining unused numbers $58\ 69\ 92\ 97\ 97$.
The remainders are $1\ 0\ 2\ 1\ 1$.
The first three $\langle 58\ 69\ 92\rangle$ have all different
remainders, so let's use those as our third group.</p>
<p>The three groups are now $\langle 17\ 35\ 44\rangle$, $\langle 3\ 42\ 60\rangle$, $\langle 58\ 69\ 92\rangle$. The first one has an even sum and the
second has an odd sum. The third group has an odd sum, which matches
the second group, so we choose the second and third groups, and that is
our answer:</p>
<p>$$3\qquad 42\qquad 60\qquad 58 \qquad 69 \qquad 92$$</p>
<p>The sum of these is $324 = 6\cdot 54$.</p>
<p>This proves that 11 input numbers are sufficient to produce one output
set of 6 whose sum is a multiple of 6. Let's write $E(n, k)$ to
mean that $n$ inputs are enough to produce $k$ outputs. That is,
$E(n, k)$ means “any set of $n$ numbers contains $k$ distinct
6-element subsets whose sum is a multiple of 6.” Ebrahimi’s theorem,
which we have just proved, states that $E(11, 1)$ is true, and
obviously it also proves $E(n, 1)$ for all larger $n$.</p>
<p>I would like to consider the following questions:</p>
<ol>
<li>Does this proof suffice to show that $E(10, 1)$ is false?</li>
<li>Does this proof suffice to show that $E(11, 2)$ is false?</li>
</ol>
<p>I am specifically <em>not</em> asking whether $E(10, 1)$ or $E(11, 2)$ are <em>actually</em> false. There are easy
counterexamples that can be found without reference to the proof
above. What I want to know is if the proof, as given, contains
nontrivial information about these questions.</p>
<p>The reason I think this is interesting is that I think, upon more
careful examination, that I will find that the proof above <em>does</em>
prove at least one of these, perhaps with a very small bit of
additional reasoning. But there are many similar proofs that do not
work this way. Here is a famous example. Let $W(n, k)$ be
shorthand for the following claim:</p>
<blockquote>
<p>Let the integers from 1 to $n$ be partitioned into two sets. Then
one of the two sets contains $k$ distinct subsets of three
elements of the form $\{a, a+d, a+2d\}$ for integers $a, d$.</p>
</blockquote>
<p>Then: </p>
<blockquote>
<p><a href="https://en.wikipedia.org/wiki/Van_der_Waerden%27s_theorem">Van der Waerden's theorem</a>:
$W(325, 1)$ is true.</p>
</blockquote>
<p>$W()$, like $E()$, is monotonic: van der Waerden's theorem
trivially implies $W(n, 1)$ for all $n$ larger than 325. Does it
also imply that $W(n, 1)$ is false for smaller $n$? No, not at
all; this is actually untrue. Does it also imply that $W(325, k)$
is false for $k>1$? No, this is false also.</p>
<p>Van der Waerden's theorem takes 325 inputs (the integers) and among
them finds one output (the desired set of three). But this is
extravagantly wasteful. A better argument shows that only 9 inputs
were required for the same output, and once we know this it is trivial
that 325 inputs will always produce at least 36 outputs, and probably
a great many more.</p>
<p>Proofs of theorems in Ramsey theory are noted for being extravagant in
exactly this way. But the proof of Ebrahimi's theorem is different.
It is not only frugal, it is optimally so. It uses no more inputs
than are absolutely necessary.</p>
<p>What is different about these cases? What is the source of the frugality
of the proof of Ebrahimi’s theorem? Is there a way that we can see
from examination of the proof that it will be optimally frugal?</p>
<p>Ebrahimi’s theorem shows $E(11, 1)$. Suppose instead we want to
show $E(n, 2)$ for some $n$. From Ebrahimi’s theorem itself we
immediately get $E(22, 2)$ and indeed $E(17, 2)$. Is this the
best we can do? (That is, is $E(16, 2)$ false?) I bet it isn't.
If it isn't, what went wrong? Or rather, what went <em>right</em> in the
$k=1$ case that stopped working when $k>1$?</p>
<p>I don't know.</p>Fri, 15 Dec 2017 17:09:00 +0000mjd@plover.com (Mark Dominus)Ken T Takusagawa: [agobrown] Longest games of chomp
http://kenta.blogspot.com/2017/12/agobrown-longest-games-of-chomp.html
<p>What Chomp starting positions offer the longest games, perhaps the most possibilities for interesting games? Among rectangular starting positions, good choices are 13x12, 12x11, 10x9, 9x8, 11x6, 7x6, 8x5, 6x5, and 5x4. Missing from the pattern of (N)x(N-1) are 11x10 and 8x7. (Chomp is weird in how there aren't simple patterns. It might be a good candidate for machine learning.)</p><p>We assumed 3 types of positions in Chomp are instantly known to be lost (P-positions):</p><ol style=""><li>L-shaped positions with both arms of the L having unit width and the same length</li>
<li>2-row positions of the form [a,a-1]</li>
<li>3-row positions of the form [a,a-2,2]</li>
</ol><p>The 3-row [a,a-2,2] class of positions is noted in Proposition 2 of "Three-Rowed Chomp" by <a href="http://sites.math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/byrnes.html">Doron Zeilberger</a>. The winning strategy from such a position is as follows:</p><p>The base case is [4,2,2] (which looks kind of like a pistol). If the opponent moves to [3,2,2], then respond moving to [3,2] and follow the 2-row strategy (or move to [3,1,1] and L-shaped strategy). If [2,2,2] then 2-row strategy vertically. If [4,1,1] then [3,1,1] and L-shaped strategy. If [4,2,1] then [2,2,1] and 2-row strategy vertically. If [4,2] then 2-row strategy.</p><p>For larger 3-row positions [a,a-2,2], if the opponent moves in the first 2 rows, leaving at least 4 in the first row and at least 2 in the second row, then restore the position to the shape [b,b-2,2]. If [3,3,2] then [3,1,1] and L-shaped strategy. If [a,1,1] then [3,1,1] and L-shaped strategy. If the opponent moves on the third row to [a,a-2,1] then [2,2,1] and follow the 2-row strategy vertically. If [a,a-2], then 2-row strategy.</p><p><a href="http://web.mit.edu/kenta/www/three/chomp/agobrown/">Here is complete output of all positions within board size 13x13 and Haskell source code.</a> A selection of some positions and their game values are also given below. Computing board size 12 required 8.5 GB of RAM on a machine with 16 GB of RAM. (Haskell programs tend to use a lot of memory unless one puts effort into conserving memory, which we did not do.)</p><p>For computing board size 13, we allowed swapping to virtual memory on SSD on a machine with 8 GB of physical RAM. The output of /usr/bin/time was:</p><p>5751.60user 86808.57system 39:48:33elapsed 64%CPU (0avgtext+0avgdata 7192640maxresident)k<br />
10410518744inputs+8outputs (184956202major+316491058minor)pagefaults 0swaps</p><p>This suggests a slowdown factor of about 25 for using virtual memory on SSD compared to RAM for this program which made heavy use of Data.Map. Polling "ps xu" saw a maximum virtual memory usage of 39 GB. For the output of the board size 13 at the link above, we omitted saving the "Win_in 1" positions to save disk space.</p><p>There are only 3 "Lose in 2" positions: [6,3,3]; [5,5,3]; and [5,2,1,1]. Memorize them to get an edge against opponents. One could also memorize the 7 "Lose in 4" positions, 14 "Lose in 6", 26 "Lose in 8"...</p><p>There seem to be some more patterns that lose: [odd,2,1,1,1,...]; [even,3,1,1,1,...]; [even,2,2,2,1,1,1,...]; [even,2,2,1,1,1,...]; [odd,4,1,1,1,...]. These deserve future investigation. <a href="https://www.win.tue.nl/~aeb/games/chomp.html">Andries Brouwer's web site</a> suggests that losing families of positions exist in 3-row chomp for [a+11,a+7,5]; [?,?,7]; [?,?,9]; [?,?,11]; [?,?,14] (not 13, once again breaking what seemed to be a simple pattern of odd third rows). It still needs to be explicitly articulated how to win after giving your opponent these losing positions. Work by Steven Byrnes suggests the game values of all 3-row Chomp positions can be rapidly computed, though probably not by a human in his or her head. Future versions of the code should bound not by board size but number of pieces, to investigate thin positions and roughly L-shaped positions.</p><p>(Position [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 12, 5], Win_in 103)<br />
(Position [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 5], Win_in 103)<br />
(Position [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13], Win_in 101)<br />
(Position [12, 12, 12, 12, 12, 12, 12, 12, 12, 10, 7], Lose_in 86)<br />
(Position [12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12], Win_in 79)<br />
(Position [11, 11, 11, 10, 10, 10, 10, 10, 2], Win_in 57)<br />
(Position [11, 11, 11, 10, 10, 10, 10, 9, 2], Win_in 57)<br />
(Position [11, 11, 11, 11, 11, 9, 9, 7, 1, 1], Win_in 57)<br />
(Position [11, 11, 11, 11, 11, 9, 9, 9, 1, 1], Win_in 57)<br />
(Position [11, 11, 11, 11, 11, 11], Win_in 43)<br />
(Position [11, 11, 11, 11, 11, 11, 11], Win_in 41)<br />
(Position [11, 11, 11, 11, 11, 11, 11, 11], Win_in 39)<br />
(Position [11, 11, 11, 11, 11, 11, 11, 11, 11], Win_in 37)<br />
(Position [11, 11, 11, 11, 11], Win_in 35)<br />
(Position [11, 11, 11, 11, 11, 11, 11, 11, 11, 11], Win_in 21)<br />
(Position [10, 10, 10, 10, 10, 10, 10, 10, 4], Lose_in 56)<br />
(Position [10, 10, 10, 10, 10, 10, 10, 10, 10], Win_in 55)<br />
(Position [9, 9, 9, 9, 9, 9, 9, 9], Win_in 41)<br />
(Position [8, 8, 8, 8, 8], Win_in 23)<br />
(Position [8, 8, 8, 8, 8, 8], Win_in 15)<br />
(Position [8, 8, 8, 8, 8, 8, 8], Win_in 13)<br />
(Position [7, 7, 7, 7, 7, 7], Win_in 21)<br />
(Position [6, 6, 6, 6, 2], Win_in 13)<br />
(Position [6, 6, 6, 6, 6], Win_in 9)<br />
(Position [5, 5, 5, 5], Win_in 5)<br />
(Position [4, 4, 4, 4], Win_in 1)<br />
(Position [4, 4, 4], Win_in 1)<br />
(Position [4, 4], Win_in 1)<br />
(Position [4], Win_in 1)</p><p>(Position [5, 2, 1, 1], Lose_in 2)<br />
(Position [5, 5, 3], Lose_in 2)<br />
(Position [6, 3, 3], Lose_in 2)</p><p>(Position [5, 3, 3, 2], Lose_in 4)<br />
(Position [5, 5, 2, 2], Lose_in 4)<br />
(Position [6, 2, 2, 1, 1], Lose_in 4)<br />
(Position [6, 2, 2, 2], Lose_in 4)<br />
(Position [6, 3, 1, 1, 1], Lose_in 4)<br />
(Position [7, 2, 1, 1, 1, 1], Lose_in 4)<br />
(Position [7, 4, 3], Lose_in 4)</p><p>(Position [6, 4, 3, 3, 2], Lose_in 6)<br />
(Position [7, 2, 2, 2, 2], Lose_in 6)<br />
(Position [7, 3, 2, 1, 1, 1], Lose_in 6)<br />
(Position [7, 3, 2, 2], Lose_in 6)<br />
(Position [7, 3, 3, 1, 1], Lose_in 6)<br />
(Position [7, 3, 3, 2, 1, 1], Lose_in 6)<br />
(Position [7, 4, 1, 1, 1], Lose_in 6)<br />
(Position [7, 5, 3, 2], Lose_in 6)<br />
(Position [7, 7, 4], Lose_in 6)<br />
(Position [8, 2, 2, 1, 1, 1, 1], Lose_in 6)<br />
(Position [8, 2, 2, 2, 1, 1], Lose_in 6)<br />
(Position [8, 3, 1, 1, 1, 1, 1], Lose_in 6)<br />
(Position [8, 4, 4], Lose_in 6)<br />
(Position [9, 2, 1, 1, 1, 1, 1, 1], Lose_in 6)</p><p>(Position [6, 4, 4, 3, 3], Lose_in 8)<br />
(Position [6, 6, 3, 3, 3], Lose_in 8)<br />
(Position [6, 6, 4, 3, 2], Lose_in 8)<br />
(Position [7, 3, 3, 3, 2, 2], Lose_in 8)<br />
(Position [7, 4, 2, 2, 2, 2], Lose_in 8)<br />
(Position [7, 4, 4, 2], Lose_in 8)<br />
(Position [7, 5, 3, 3, 1, 1], Lose_in 8)<br />
(Position [7, 7, 3, 3], Lose_in 8)<br />
(Position [8, 3, 2, 2, 2], Lose_in 8)<br />
(Position [8, 3, 3, 3], Lose_in 8)<br />
(Position [8, 4, 2, 1, 1, 1], Lose_in 8)<br />
(Position [8, 4, 2, 2], Lose_in 8)<br />
(Position [8, 5, 1, 1, 1], Lose_in 8)<br />
(Position [8, 5, 4, 2], Lose_in 8)<br />
(Position [9, 2, 2, 2, 2, 1, 1], Lose_in 8)<br />
(Position [9, 2, 2, 2, 2, 2], Lose_in 8)<br />
(Position [9, 3, 2, 1, 1, 1, 1, 1], Lose_in 8)<br />
(Position [9, 3, 2, 2, 1, 1, 1], Lose_in 8)<br />
(Position [9, 4, 1, 1, 1, 1, 1], Lose_in 8)<br />
(Position [9, 4, 4, 1, 1], Lose_in 8)<br />
(Position [9, 5, 3, 1, 1, 1, 1], Lose_in 8)<br />
(Position [9, 5, 4], Lose_in 8)<br />
(Position [10, 2, 2, 1, 1, 1, 1, 1, 1], Lose_in 8)<br />
(Position [10, 2, 2, 2, 1, 1, 1, 1], Lose_in 8)<br />
(Position [10, 3, 1, 1, 1, 1, 1, 1, 1], Lose_in 8)<br />
(Position [11, 2, 1, 1, 1, 1, 1, 1, 1, 1], Lose_in 8)</p>Fri, 15 Dec 2017 08:05:03 +0000noreply@blogger.com (Ken)Mike Izbicki: how to cheat at settlers by loading the dicehttp://izbicki.me/blog/how-to-cheat-at-settlers-of-catan-by-loading-the-dice-and-prove-it-with-p-values.html
http://izbicki.me/blog/how-to-cheat-at-settlers-of-catan-by-loading-the-dice-and-prove-it-with-p-values.html
<h1>how to cheat at settlers by loading the dice
<br />(and prove it with p-values)
</h1>
<div class="info">
posted on 2017-12-14
</div>
<hr />
<b>tl;dr</b> This post shows how to create loaded dice, and how to use these dice to gain between 5 and 15 additional resource cards per game of Settlers of Catan. Surprisingly, we’ll prove that standard scientific tests are not powerful enough to determine that the dice are unfair while playing a game. This essentially means that it’s impossible for your opponents to scientifically prove that you’re cheating. This impossibility is due to methodological defects in the current state of scientific practice, and we’ll highlight some ongoing work to fix these defects.
<hr />
<center>
<img src="https://izbicki.me/blog/category/img/settlers/settlers-of-catan-board.jpg" />
</center>
<hr />
<h3 id="loading-the-dice">Loading the dice</h3>
<p>My copy of Settlers of Catan came with two normal wooden dice. To load these dice, I placed them in a small plate of water overnight, leaving the 6 side exposed.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/wooden-dice-in-water.jpg" />
</center>
<p>The submerged area absorbed water, becoming heavier. My hope was that when rolled, the heavier wet sides would be more likely to land face down, and the lighter dry side would be more likely to land face up. So by leaving the 6 exposed, I was hoping to create dice that roll 6’s more often.</p>
<p>This effect is called the <em>bias</em> of the dice. To measure this bias, my wife and I spent the next 7 days rolling dice while eating dinner. (She must love me a <em>lot</em>!)</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/wife-rolling-dice-dinner.jpg" />
</center>
<p>In total, we rolled the dice 4310 times. The raw results are shown below.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/counts.png" />
</center>
<table>
<tbody>
<tr><td></td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td></tr>
<tr><td>number of rolls</td><td>622</td><td>698</td><td>650</td><td>684</td><td>666</td><td>812</td></tr>
<tr><td>probability</td><td>0.151</td><td>0.169</td><td>0.157</td><td>0.165</td><td>0.161</td><td>0.196</td></tr>
</tbody></table>
<p>Looking at the data, it’s “obvious” that our dice are biased: The 6 gets rolled more times than any of the other numbers. Before we prove this bias formally, however, let’s design a strategy to exploit this bias while playing Settlers of Catan.</p>
<h3 id="a-strategy-for-loaded-dice">A strategy for loaded dice</h3>
<p>The <a href="http://www.settlers-strategy.com/settlers_of_catan_strategy_placement_1.html">key to winning at Settlers of Catan</a> is to get a lot of resources. We want to figure out how many extra resources we can get using our biased dice.</p>
<p>First, let’s quickly review the rules. Each settlement is placed on the corner of three tiles, and each tile has a number token. Whenever the dice are rolled, if they add up to one of the numbers on the tokens, you collect the corresponding resource card. For example:</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/settlers-of-catan-rules-dice-resource.jpg" />
</center>
<p>A good settlement will be placed next to numbers that will be rolled often.</p>
<p>To make strategizing easier, the game designers put helpful dots on each token below the number. These dots count the ways to roll that token’s number using two dice.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/settlers-numbers-pips.jpg" />
</center>
<p>We can use these dots to calculate the probability of rolling each number. For example, a <span class="math inline">\(4\)</span> can be rolled in three ways. If we name our two dice <span class="math inline">\(A\)</span> and <span class="math inline">\(B\)</span>, then the possible combinations are <span class="math inline">\((A=1,B=3)\)</span>, <span class="math inline">\((A=2,B=2)\)</span>, <span class="math inline">\((A=3,B=1)\)</span>. To calculate the probability of rolling a 4, we calculate the probability of each of these rolls and add them together. For fair dice, the probability of every roll is the same <span class="math inline">\((1/6)\)</span>, so the calculation is:</p>
<span class="math display">\[\begin{align}
Pr(A+B = 4)
&= Pr(A = 1)Pr(B=3) + Pr(A=2)Pr(B=2) + Pr(A=3)Pr(B=1) \\
&= (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) \\
&= 1/12 \\
&\approx 0.08333
\end{align}\]</span>
<p>For our biased dice, the probability of each roll is different. Using the numbers from the table above, we get:</p>
<span class="math display">\[\begin{align}
Pr(A+B = 4)
&= Pr(A = 1)Pr(B=3) + Pr(A=2)Pr(B=2) + Pr(A=3)Pr(B=1) \\
&= (0.151)(0.157) + (0.169)(0.169) + (0.157)(0.151) \\
&= 0.07597
\end{align}\]</span>
<p>So rolling a <span class="math inline">\(4\)</span> is now less likely with our biased dice. Performing this calculation for each possible number gives us the following chart.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/twodice.png" />
</center>
<p>All the numbers below <span class="math inline">\(7\)</span> are now less likely, and the numbers above 7 are now more likely. The shift is small, but it has important strategic implications.</p>
<p>Consider the two initial settlement placements below.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/red-v-blue-settlers2.png" />
</center>
<p>The <font color="red">naughty</font> player knows that the dice are biased and puts her settlements on locations with high numbers, but the <font color="blue">nice</font> player doesn’t know the dice are biased and puts her settlements on locations with low numbers. Notice that if the dice were fair, both settlement locations would be equally good because they have the same number of dots.</p>
<p>The following formula calculates the average number of cards a player receives on each dice roll:</p>
<p><span class="math display">\[
\text{expected cards per roll} = \sum_{\text{adjacent tokens}} Pr(A+B=\text{token value})
\]</span></p>
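<p>Both of these calculations are easy to script. The following sketch is mine, not from the post: it builds the distribution of the sum of two dice from the measured per-face probabilities in the table above, then evaluates the expected-cards formula for any list of adjacent tokens.</p>

```haskell
-- Per-face probabilities: measured values for the biased die (from
-- the table above), uniform for a fair die.
biasedDie, fairDie :: [(Int, Double)]
biasedDie = zip [1 .. 6] [0.151, 0.169, 0.157, 0.165, 0.161, 0.196]
fairDie   = [ (f, 1 / 6) | f <- [1 .. 6] ]

-- Distribution of the sum of two independent rolls of the same die.
sumDist :: [(Int, Double)] -> [(Int, Double)]
sumDist die =
  [ (t, sum [ pa * pb | (a, pa) <- die, (b, pb) <- die, a + b == t ])
  | t <- [2 .. 12] ]

-- Expected cards per roll for settlements adjacent to the given
-- tokens (a token may appear twice if two settlements share it).
expectedCards :: [(Int, Double)] -> [Int] -> Double
expectedCards die tokens =
  sum [ p | tok <- tokens, (t, p) <- sumDist die, t == tok ]
```

<p>For example, <code>expectedCards biasedDie [4]</code> reproduces the 0.07597 computed above, and <code>expectedCards fairDie [4]</code> gives 1/12.</p>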
<p>Substituting the appropriate values gives us the following results.</p>
<table>
<tbody><tr>
<td colspan="3" align="right">
<center>
expected cards per roll
</center>
</td>
</tr>
<tr>
<td>
</td>
<td>
<font color="red">naughty</font>
</td>
<td>
<font color="blue">nice</font>
</td>
</tr>
<tr>
<td>
fair dice
</td>
<td>
0.500
</td>
<td>
0.500
</td>
</tr>
<tr>
<td>
biased dice
</td>
<td>
0.543
</td>
<td>
0.457
</td>
</tr>
</tbody></table>
<p>So the difference between the <font color="red">naughty</font> and <font color="blue">nice</font> player is <span class="math inline">\(0.086\)</span> cards per roll of the biased dice. A typical game of Settlers contains about 60 dice rolls (about 15 turns per player in a 4 player game), so this results in <span class="math inline">\(0.086*60=5.16\)</span> more cards for the <font color="red">naughty</font> player. </p>
<p>And this is only considering the two starting settlements. As the game progresses, more settlements will be built, and some settlements will be upgraded to cities (which receive two cards per roll instead of one). Calculating the exact effect of these additional sources of cards is difficult because these improvements will be built at random points throughout the game. We’ll have to make some additional assumptions.</p>
<p>If we assume that the <font color="red">naughty</font> player gets 0.043 more cards per roll per settlement/city than the <font color="blue">nice</font> player (this exact number will vary depending on the quality of the settlement), and that both players build settlement/cities at turns 10,20,25,30,35,40,45, and 50, then the <font color="red">naughty</font> player will on average receive 15.050 more cards than the <font color="blue">nice</font> player.</p>
<p>To summarize, the <font color="red">naughty</font> player will receive somewhere between 5 and 15 more resource cards depending on how their future settlements and cities are built. This advantage can’t guarantee a victory, but it’ll definitely help.</p>
<h3 id="a-scientific-analysis">A scientific analysis</h3>
Now we’re going to do some simple statistics to prove two things:
<ol>
<li>
The dice really are biased. So the fact that the 6 was rolled more times than the other numbers wasn’t just due to random chance.
</li><li>
There are not enough dice rolls in a game of Settlers for our opponents to scientifically prove that the dice are biased. So it’s scientifically impossible for our opponents to know that we’re cheating.
</li></ol>
<p>To show that the dice are biased, we will use a standard scientific technique called <a href="https://en.wikipedia.org/wiki/Statistical_hypothesis_testing#Null_hypothesis_statistical_significance_testing">null hypothesis significance testing</a>. We begin by assuming a hypothesis that we want to <em>disprove</em>. In our case, we assume that the dice are <em>not</em> biased. In other words, we assume that each number on the dice has a <span class="math inline">\(1/6\approx 0.166\)</span> chance of being rolled. Our goal is to show that under this assumption, the number of 6’s rolled above is very unlikely. We therefore conclude that our hypothesis is also unlikely, and that the dice probably are in fact biased.</p>
More formally, we let <span class="math inline">\(X\)</span> be a random variable that represents the total number of 6’s we would roll if we were to repeat our initial experiment with fair dice. Then <span class="math inline">\(X\)</span> follows a <a href="https://en.wikipedia.org/wiki/Binomial_distribution">binomial distribution</a> whose density is plotted below.
<center>
<img src="https://izbicki.me/blog/category/img/settlers/binomial.png" />
</center>
The <a href="https://en.wikipedia.org/wiki/P-value"><span class="math inline">\(p\)</span>-value</a> for our experiment is defined informally to be the probability of getting results at least as extreme as the ones we observed if the dice are <em>not</em> biased. The formal definition and formula are
<span class="math display">\[\begin{equation}
p\text{-value}=
Pr(X\ge k)
=
1-\sum_{i=0}^{k-1} {n\choose i} q^i(1-q)^{n-i}
,
\end{equation}\]</span>
<p>where <span class="math inline">\(n\)</span> is the total number of dice rolls (4310), <span class="math inline">\(k\)</span> is the number of 6’s actually rolled (812), and <span class="math inline">\(q\)</span> is the assumed probability of rolling a 6 (1/6). Substituting these numbers gives us <span class="math display">\[
p\text{-value}=
Pr(X\ge k)
\approx
0.0000884
.
\]</span> In other words, if we repeated this experiment one million times with fair dice, we would expect to get results at least as extreme as the ones we actually got only 88 times. Since this is so unlikely, we conclude that our original assumption (that the dice are not biased) is probably false. Most science classes teach that <span class="math inline">\(p\)</span>-values less than 0.05 are “significant.” We are very far below that threshold, so our result is “very significant.”</p>
<p>Our <span class="math inline">\(p\)</span>-value is so low because the number of trials we conducted was very large <span class="math inline">\((n=4310)\)</span>. In a typical game of Settlers, however, there will be many fewer trials. This makes it hard for our opponents to prove that we’re cheating.</p>
<p>We said before that there are 60 dice rolls in a typical game. Since we have two dice, that means <span class="math inline">\(n=120\)</span>. To keep the math simple, we’ll assume that we roll an average number of 6’s. That is, the number of 6’s rolled during the game is <span class="math display">\[
k=812\cdot \frac{120}{4310}\approx23.
\]</span> Substituting into our formula for the <span class="math inline">\(p\)</span>-value, we get <span class="math display">\[
p\text{-value}=Pr(X\ge k) \approx 0.265
.
\]</span> In words, this means that if the dice were actually fair, then we would still roll at least this many 6’s <span class="math inline">\(26.5\%\)</span> of the time. Since this probability is so high, the standard scientific protocol tells us to conclude that we have no “significant” evidence that the dice are biased. (Notice that this is subtly different from having evidence that the dice are not biased! Confusing these two statements is a common mistake, <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/">even for trained PhD scientists</a>, and <a href="http://slatestarcodex.com/2013/12/17/statistical-literacy-among-doctors-now-lower-than-chance/">especially for medical doctors</a>.)</p>
<p>So how many games can we play without getting caught? It turns out that if we play 6 games (so <span class="math inline">\(n=6*120=720\)</span>, and <span class="math inline">\(k=812\cdot(720/4310)\approx136\)</span>), then the resulting <span class="math inline">\(p\)</span>-value is 0.05. In other words, as long as we play fewer than 6 games, then our opponents won’t have enough data to conclude that their measurements of the biased dice are “significant.” The standard scientific method won’t prove we’re cheating.</p>
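<p>All three of these tail probabilities can be checked exactly rather than via tables or approximations. The sketch below (the function name and structure are mine, not from the original analysis) sums the binomial tail <span class="math inline">\(Pr(X\ge k)\)</span> with exact integer arithmetic before converting to a float:</p>

```python
from math import comb

def p_value(n, k, a=1, b=6):
    """Exact binomial tail Pr(X >= k) for n rolls, where each roll
    shows a 6 with probability q = a/b (here 1/6)."""
    # Each term C(n,i) q^i (1-q)^(n-i) has the common denominator b^n,
    # so the whole tail can be summed exactly over integer numerators.
    num = sum(comb(n, i) * a**i * (b - a)**(n - i) for i in range(k, n + 1))
    return num / b**n  # big-int division; Python rounds this correctly

# The three scenarios considered in the text:
print(p_value(4310, 812))  # the full 4310-roll experiment (text: ~0.0000884)
print(p_value(120, 23))    # one game, n = 120 rolls (text: ~0.265)
print(p_value(720, 136))   # six games, n = 720 rolls (text: ~0.05)
```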
<h3 id="some-flaws-with-the-p-value-and-significance">Some flaws with the <span class="math inline">\(p\)</span>-value and “significance”</h3>
<p>The <span class="math inline">\(p\)</span>-value argument above is how most scientists currently test their hypotheses. But there are some major flaws in this approach. For example:</p>
<ol>
<li>
<p>The <span class="math inline">\(p\)</span>-value test doesn’t use all the available information. In particular, our opponents may have other reasons to believe that the dice are loaded. If you look closely at the dice, you’ll notice some slight discoloration where it was submerged in water.</p>
<center>
<img src="https://izbicki.me/blog/category/img/settlers/dice-discolored2.jpg" width="650" />
</center>
<p>This discoloration was caused because the water spread the ink on the die’s face. If you see similar discoloration on the dice in your game, it makes sense to be extra suspicious about the dice’s bias.</p>
<p>Unfortunately, there’s no way to incorporate this suspicion into the <span class="math inline">\(p\)</span>-value analysis we conducted above. An alternative to the <span class="math inline">\(p\)</span>-value called the <a href="https://en.wikipedia.org/wiki/Bayes_factor">Bayes factor</a> can incorporate this prior evidence. So if our opponent uses a Bayes factor analysis, they may be able to determine that we’re cheating. The Bayes factor is more complicated than the <span class="math inline">\(p\)</span>-value, however, and so it is not widely taught to undergraduate science majors. It is rarely even used in PhD-level scientific publications, and many statisticians are calling for <a href="https://www.nature.com/articles/d41586-017-07522-z">increased use of these more sophisticated analysis techniques</a>.</p>
</li><li>
<p>Another weakness of the <span class="math inline">\(p\)</span>-value test is that <a href="https://en.wikipedia.org/wiki/False_positives_and_false_negatives">false positives</a> are very common. Using the standard significance threshold of <span class="math inline">\(p\le0.05\)</span> means that about 5 of every 100 games played with fair dice will nonetheless show “significant” evidence that the dice are biased to roll 6’s. Common sense, however, tells us that cheating at Settlers of Catan is almost certainly not this common because most people just don’t want to cheat. But when you run many experiments, some of them will give “significant” results just by random chance. This is one of the many reasons why some scientists have concluded that <a href="http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124">most published research is false</a>. This effect is thought to be one of the reasons that <a href="https://understandinguncertainty.org/node/1286">evidence of extrasensory perception (ESP) continues to be published in scientific journals</a>. Some less scrupulous scientists exploit this deficiency in a process called <a href="http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106">p-hacking</a> to make their research seem more important.</p>
<p>To alleviate the problem of false positives, <a href="https://www.nature.com/news/big-names-in-statistics-want-to-shake-up-much-maligned-p-value-1.22375">a group of statisticians is proposing a new significance threshold of <span class="math inline">\(p\le0.005\)</span></a> for a result to qualify as “significant”. While this reduces the risk of false positives, it also makes detecting true effects harder. Under this new criterion, we’d have to play 16 games (for <span class="math inline">\(n=1920\)</span> dice rolls) to get statistically significant evidence that the dice are biased.</p>
</li></ol>
<p>At this point, you might be feeling overwhelmed at the complexity of statistical analysis. And this is just for the toy problem of detecting loaded dice in a game. Real world problems like evaluating the effectiveness of chemotherapy drugs are much more complicated, and so require much more complicated statistical analyses. Doing science is hard!</p>
<hr />
<p><b>Edit after peer review:</b> <a href="http://vijaylulla.com/">Vijay Lulla</a> sent me the following message:</p>
<blockquote>
<i> The blog mentions that you rolled the dice 4310 times and all your calculations are based on it, but the frequency table adds up to 4312. </i>
</blockquote>
<p>Whoops! It looks like I messed up my addition. Fortunately, this mistake is small enough that it won’t affect any of the numbers in the article by much.</p>
<p>A lot of people mistakenly think that peer review is where other scientists repeat an experiment to test the conclusion. But that’s not the case. The purpose of peer review is for scientists like Vijay to sanity-check the whole procedure so that obvious mistakes like this get caught. Sadly, another common mistake in science is that <a href="http://andrewgelman.com/2015/06/17/born-open-data/">researchers don’t publish their data</a>, so there’s no way for checks like this to be performed.</p>
<p>If this were a real publication in a scientific journal, I would redo all the calculations. But since it’s not, I’ll leave the mistake for posterity.</p>
<p><b>Edit 2:</b> <a href="https://www.reddit.com/r/statistics/comments/7k0ufr/i_wrote_a_blog_post_about_using_settlers_of_catan/">There’s a good discussion on reddit’s /r/statistics</a>. This discussion provides a much more nuanced view about significance testing than my discussion above, and a few users point out ways that I might be overstating some conclusions.</p>Thu, 14 Dec 2017 00:00:00 +0000FP Complete: Software Release Management Best Practiceshttps://www.fpcomplete.com/blog/software-release-management-best-practices
https://www.fpcomplete.com/blog/software-release-management-best-practices
<div class="hs-featured-image-wrapper">
<a href="https://www.fpcomplete.com/blog/software-release-management-best-practices" class="hs-featured-image-link" title=""> <img src="https://www.fpcomplete.com/hubfs/Blog/Software%20Release%20Management.jpg?t=1513366076380" alt="Software Release Management Best Practices" style="width: auto !important; float: left; margin: 0 15px 15px 0;" class="hs-featured-image" /> </a>
</div>
<h2>What is software release management?</h2>
<p>At its most general, “releasing software” is the process by which software is delivered from the engineers creating it to its users. This can take such forms as: </p>
Wed, 13 Dec 2017 21:11:00 +0000michael@fpcomplete.com (Michael Snoyman)Neil Mitchell: Benchmarking strchr vs memchrtag:blogger.com,1999:blog-7094652.post-1655750833035966289
http://neilmitchell.blogspot.com/2017/12/benchmarking-strchr-vs-memchr.html
<p><em>Summary: memchr is faster, but the obvious implementation seems to beat the builtin versions.</em></p><p>There are two related C functions for finding the next character in a string - <code>strchr</code> which assumes the string has a NUL character at the end, and <code>memchr</code> which takes the string length as an argument. For strings where you have the size <em>and</em> a NUL terminator, which is fastest? Using <code>gcc</code> 6.2.0 64-bit MSYS2 on Windows 10, searching for a single byte 10M bytes along a string, the times were (fastest to slowest):</p><ul><li>11.05ms <code>memchr</code> implemented <a href="http://clc-wiki.net/wiki/memchr">the obvious way</a>.</li><li>14.82ms <code>strchr</code> implemented <a href="http://clc-wiki.net/wiki/strchr">the obvious way</a>.</li><li>14.96ms <code>memchr</code> provided by GCC.</li><li>19.63ms <code>strchr</code> provided by GCC.</li></ul><p>Trying on 3 different Windows computers, the results are all similar (but scaled).</p><p>Given the choice, you should prefer <code>memchr</code> over <code>strchr</code>.</p><p><strong>Surprise result</strong></p><p>The optimised implementations shipped with GCC are <em>slower</em> than the obvious C implementations taken from a wiki. I have absolutely no idea why. From what I can tell, the builtin versions are coded in assembly, operating on multiple bytes at a time, using SSE instructions. In contrast, the C variants operate on a single byte at a time, and aren't vectorised by the optimiser according to <a href="https://godbolt.org/">Godbolt</a>. If anyone has an explanation I'd be keen to hear it.</p><p><strong>Benchmark Code</strong></p><p>To benchmark the variants I wrote a Haskell program using <a href="https://hackage.haskell.org/package/criterion"><code>criterion</code></a>. The full code and build instructions are available in <a href="https://gist.github.com/ndmitchell/32fc47874eb888998557f445f98ff44f">this gist</a>. 
I compiled the C code with <code>-O3</code>, using the <tt>gcc</tt> shipped with GHC 8.2.1. I've reproduced the Haskell code below, with some comments:</p><pre><code>-- Import all the necessary pieces<br />import qualified Data.ByteString as BS<br />import qualified Data.ByteString.Unsafe as BS<br />import Criterion.Main<br />import Foreign<br />import Foreign.C.Types<br />import Data.Monoid<br /><br />-- Make all the C imports<br />foreign import ccall unsafe "string.h memchr" memchr_std :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)<br />foreign import ccall unsafe "string.h strchr" strchr_std :: Ptr Word8 -> CInt -> IO (Ptr Word8)<br />foreign import ccall unsafe memchr_c :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)<br />foreign import ccall unsafe strchr_c :: Ptr Word8 -> CInt -> IO (Ptr Word8)<br /><br />-- Method for ignoring the size when using strchr<br />ignoreSize f a b _ = f a b<br /><br />-- Build a suitable string with an interesting character i bytes along<br />cstr i = BS.replicate i 32 <> BS.singleton 64 <> BS.replicate i 32 <> BS.singleton 0<br /><br />-- The functions to benchmark<br />funs =<br /> [("memchr_std", memchr_std)<br /> ,("strchr_std", ignoreSize strchr_std)<br /> ,("memchr_c", memchr_c)<br /> ,("strchr_c", ignoreSize strchr_c)]<br /><br />-- The main function, using Criterion<br />main = defaultMain<br /> [ seq bs $ bench (show i ++ " " ++ name) $ whnfIO $ test fun bs<br /> | i <- [1,10,100,1000,10000,100000,1000000,10000000]<br /> , let bs = cstr i<br /> , (name, fun) <- funs]<br /><br />-- The function under test and input string<br />{-# NOINLINE test #-}<br />test fun bs =<br /> BS.unsafeUseAsCStringLen bs $ \(ptr,len) -><br /> fun (castPtr ptr) 64 (fromIntegral len)<br /></code></pre>Tue, 12 Dec 2017 16:56:00 +0000noreply@blogger.com (Neil Mitchell)Jeremy Gibbons: Streaming Arithmetic Codinghttp://patternsinfp.wordpress.com/?p=336
https://patternsinfp.wordpress.com/2017/12/11/streaming-arithmetic-coding/
<p>
In the <a href="https://patternsinfp.wordpress.com/2017/12/05/arithmetic-coding/">previous post</a> we saw the basic definitions of arithmetic encoding and decoding, and a proof that decoding does indeed successfully retrieve the input. In this post we go on to show how both encoding and decoding can be turned into streaming processes.</p>
<p></p><h2> Producing bits </h2>
<p>
Recall that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bencode%7D_0+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5Crightarrow+%5Cmathit%7BRational%7D+%5C%5C+%5Cmathit%7Bencode%7D_0%5C%3Bm+%3D+%5Cmathit%7Bpick%7D+%5Ccdot+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D+%5Ccdot+%5Cmathit%7BencodeSyms%7D%5C%3Bm+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bdecode%7D_0+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5C%5C+%5Cmathit%7Bdecode%7D_0%5C%3Bm%5C%3Bx+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D%5C%3B%28m%2Cx%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{encode}_0 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_0\;m = \mathit{pick} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \vrule width0pt depth2ex \\ \mathit{decode}_0 :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_0\;m\;x = \mathit{unfoldr}\;\mathit{step}\;(m,x) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{encode}_0 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_0\;m = \mathit{pick} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \vrule width0pt depth2ex \\ \mathit{decode}_0 :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_0\;m\;x = \mathit{unfoldr}\;\mathit{step}\;(m,x) \end{array} " />
</p></blockquote>
<p> Encoding and decoding work together. But they work only in batch mode: encoding computes a fraction, and yields nothing until the last step, and so decoding cannot start until encoding has finished. We really want encoding to yield as the encoded text a list of bits representing the fraction, rather than the fraction itself, so that we can stream the encoded text and the decoding process. To this end, we replace <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BRational%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick} :: \mathit{Interval} \rightarrow \mathit{Rational}}" class="latex" title="{\mathit{pick} :: \mathit{Interval} \rightarrow \mathit{Rational}}" /> by <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D_2+%3D+%5Cmathit%7BfromBits%7D+%5Ccdot+%5Cmathit%7BtoBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}_2 = \mathit{fromBits} \cdot \mathit{toBits}}" class="latex" title="{\mathit{pick}_2 = \mathit{fromBits} \cdot \mathit{toBits}}" />, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathbf%7Btype%7D%5C%3B%5Cmathit%7BBit%7D+%3D+%5Cmathit%7BInteger%7D+-+%5Cmbox%7B%5Cquad+0+or+1+only%7D+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7BtoBits%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5B%5Cmathit%7BBit%7D%5D+%5C%5C+%5Cmathit%7BfromBits%7D+%3A%3A+%5B%5Cmathit%7BBit%7D%5D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathbf{type}\;\mathit{Bit} = \mathit{Integer} - \mbox{\quad 0 or 1 only} \vrule width0pt depth2ex \\ \mathit{toBits} :: \mathit{Interval} \rightarrow [\mathit{Bit}] \\ \mathit{fromBits} :: [\mathit{Bit}] \rightarrow \mathit{Rational} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathbf{type}\;\mathit{Bit} = \mathit{Integer} - \mbox{\quad 0 or 1 only} \vrule width0pt depth2ex \\ \mathit{toBits} :: \mathit{Interval} \rightarrow [\mathit{Bit}] \\ \mathit{fromBits} :: [\mathit{Bit}] \rightarrow \mathit{Rational} \end{array} " />
</p></blockquote>
<p> The obvious definitions have <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BtoBits%7D%5C%3Bi%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{toBits}\;i}" class="latex" title="{\mathit{toBits}\;i}" /> yield the shortest binary expansion of any fraction within <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" />, and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BfromBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fromBits}}" class="latex" title="{\mathit{fromBits}}" /> evaluate this binary expansion. However, we don’t do quite this—it turns out to prevent the streaming condition from holding—and instead arrange for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BtoBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{toBits}}" class="latex" title="{\mathit{toBits}}" /> to yield the bit sequence that when extended with a 1 yields the shortest expansion of any fraction within <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> (and indeed, the shortest binary expansion necessarily ends with a 1), and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BfromBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fromBits}}" class="latex" title="{\mathit{fromBits}}" /> compute the value with this 1 appended. </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7BfromBits%7D+%3D+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bpack%7D%5C%3B%28%5Cfrac+1+2%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bpack%7D+%3A%3A+%5Cmathit%7BBit%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5C%5C+%5Cmathit%7Bpack%7D%5C%3Bb%5C%3Bx+%3D+%28b+%2B+x%29+%2F+2+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7BtoBits%7D+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7BnextBit%7D+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7BnextBit%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cmathit%7BBit%7D%2C+%5Cmathit%7BInterval%7D%29+%5C%5C+%5Cmathit%7BnextBit%7D%5C%3B%28l%2Cr%29+%5C%5C+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%5Cquad%7Dclcl%7D+%7C+%26+r+%5Cle+%5Cfrac+1+2+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%280%2C+%280%2C+%5Cfrac+1+2%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28l%2Cr%29%29+%5C%5C+%7C+%26+%5Cfrac+1+2+%5Cle+l+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%281%2C+%28%5Cfrac+1+2%2C1%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28l%2Cr%29%29+%5C%5C+%7C+%26+%5Cmathbf%7Botherwise%7D+%26%3D%26+%5Cmathit%7BNothing%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{fromBits} = \mathit{foldr}\;\mathit{pack}\;(\frac 1 2) \vrule width0pt depth2ex \\ \mathit{pack} :: \mathit{Bit} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{pack}\;b\;x = (b + x) / 2 \vrule width0pt depth2ex \\ \mathit{toBits} = \mathit{unfoldr}\;\mathit{nextBit} \vrule width0pt depth2ex \\ \mathit{nextBit} :: \mathit{Interval} \rightarrow \mathsf{Maybe}\;(\mathit{Bit}, \mathit{Interval}) \\ \mathit{nextBit}\;(l,r) \\ \begin{array}[t]{@{\quad}clcl} | & r \le \frac 1 2 &=& \mathit{Just}\;(0, (0, \frac 1 2) \mathbin{\triangleleft} (l,r)) \\ | & \frac 1 2 \le l &=& \mathit{Just}\;(1, (\frac 1 2,1) \mathbin{\triangleleft} (l,r)) \\ | & \mathbf{otherwise} &=& \mathit{Nothing} 
\end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{fromBits} = \mathit{foldr}\;\mathit{pack}\;(\frac 1 2) \vrule width0pt depth2ex \\ \mathit{pack} :: \mathit{Bit} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{pack}\;b\;x = (b + x) / 2 \vrule width0pt depth2ex \\ \mathit{toBits} = \mathit{unfoldr}\;\mathit{nextBit} \vrule width0pt depth2ex \\ \mathit{nextBit} :: \mathit{Interval} \rightarrow \mathsf{Maybe}\;(\mathit{Bit}, \mathit{Interval}) \\ \mathit{nextBit}\;(l,r) \\ \begin{array}[t]{@{\quad}clcl} | & r \le \frac 1 2 &=& \mathit{Just}\;(0, (0, \frac 1 2) \mathbin{\triangleleft} (l,r)) \\ | & \frac 1 2 \le l &=& \mathit{Just}\;(1, (\frac 1 2,1) \mathbin{\triangleleft} (l,r)) \\ | & \mathbf{otherwise} &=& \mathit{Nothing} \end{array} \end{array} " />
</p></blockquote>
<p> Thus, if <img src="https://s0.wp.com/latex.php?latex=%7Br+%5Cle+%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{r \le \frac 1 2}" class="latex" title="{r \le \frac 1 2}" /> then the binary expansion of any fraction within <img src="https://s0.wp.com/latex.php?latex=%7B%5Bl%2Cr%29%7D&bg=ffffff&fg=000000&s=0" alt="{[l,r)}" class="latex" title="{[l,r)}" /> starts with 0; and similarly, if <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2+%5Cle+l%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2 \le l}" class="latex" title="{\frac 1 2 \le l}" />, the binary expansion starts with 1. Otherwise, the interval <img src="https://s0.wp.com/latex.php?latex=%7B%5Bl%2Cr%29%7D&bg=ffffff&fg=000000&s=0" alt="{[l,r)}" class="latex" title="{[l,r)}" /> straddles <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2}" class="latex" title="{\frac 1 2}" />; the shortest binary expansion within it is the expansion of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2}" class="latex" title="{\frac 1 2}" />, so we yield the empty bit sequence.</p>
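<p>These definitions transliterate almost directly. Below is a minimal Python sketch using exact rationals; the names mirror the Haskell above, and the widening operator <span class="math inline">\(\mathbin{\triangleleft}\)</span> is spelled <code>widen</code> (the Python structure is mine, not from the post).</p>

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def widen(outer, inner):
    # The ◁ operator: express `inner` in the coordinates of `outer`,
    # i.e. the inverse of narrowing by `outer`.
    l, r = outer
    p, q = inner
    return ((p - l) / (r - l), (q - l) / (r - l))

def next_bit(interval):
    # Emit a 0 or 1 while the interval lies entirely in one half of the unit.
    l, r = interval
    if r <= HALF:
        return 0, widen((Fraction(0), HALF), interval)
    if HALF <= l:
        return 1, widen((HALF, Fraction(1)), interval)
    return None  # interval straddles 1/2: stop

def to_bits(interval):
    # unfoldr nextBit
    bits = []
    while (step := next_bit(interval)) is not None:
        bit, interval = step
        bits.append(bit)
    return bits

def from_bits(bits):
    # foldr pack (1/2), with pack b x = (b + x) / 2
    x = HALF
    for b in reversed(bits):
        x = (b + x) / 2
    return x
```

For example, <code>to_bits</code> on the interval <span class="math inline">\((1/5, 1/4)\)</span> yields the bits <code>[0, 0, 1, 1]</code>, and <code>from_bits</code> of that is <span class="math inline">\(7/32\)</span>, which indeed lies within the interval, illustrating <span class="math inline">\(i \ni \mathit{pick}_2\,i\)</span>.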
<p>
Note that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D_2+%3D+%5Cmathit%7BfromBits%7D+%5Ccdot+%5Cmathit%7BtoBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}_2 = \mathit{fromBits} \cdot \mathit{toBits}}" class="latex" title="{\mathit{pick}_2 = \mathit{fromBits} \cdot \mathit{toBits}}" /> is a hylomorphism, so we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bpick%7D_2%5C%3B%28l%2Cr%29+%5C%5C+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%5Cquad%7Dclcl%7D+%7C+%26+r+%5Cle+%5Cfrac+1+2+%26%3D%26+%5Cmathit%7Bpick%7D_2%5C%3B%28%280%2C%5Cfrac+1+2%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28l%2Cr%29%29+%2F+2+%5C%5C+%7C+%26+%5Cfrac+1+2+%5Cle+l+%26%3D%26+%281+%2B+%5Cmathit%7Bpick%7D_2%5C%3B%28%28%5Cfrac+1+2%2C1%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28l%2Cr%29%29%29+%2F+2+%5C%5C+%7C+%26+%5Cmathbf%7Botherwise%7D+%26%3D%26+%5Cfrac+1+2+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{pick}_2\;(l,r) \\ \begin{array}[t]{@{\quad}clcl} | & r \le \frac 1 2 &=& \mathit{pick}_2\;((0,\frac 1 2) \mathbin{\triangleleft} (l,r)) / 2 \\ | & \frac 1 2 \le l &=& (1 + \mathit{pick}_2\;((\frac 1 2,1) \mathbin{\triangleleft} (l,r))) / 2 \\ | & \mathbf{otherwise} &=& \frac 1 2 \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{pick}_2\;(l,r) \\ \begin{array}[t]{@{\quad}clcl} | & r \le \frac 1 2 &=& \mathit{pick}_2\;((0,\frac 1 2) \mathbin{\triangleleft} (l,r)) / 2 \\ | & \frac 1 2 \le l &=& (1 + \mathit{pick}_2\;((\frac 1 2,1) \mathbin{\triangleleft} (l,r))) / 2 \\ | & \mathbf{otherwise} &=& \frac 1 2 \end{array} \end{array} " />
</p></blockquote>
<p> Moreover, it is clear that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BtoBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{toBits}}" class="latex" title="{\mathit{toBits}}" /> yields a finite bit sequence for any non-empty interval (since the interval doubles in width at each step, and the process stops when it includes <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2}" class="latex" title="{\frac 1 2}" />); so this equation serves to uniquely define <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}_2}" class="latex" title="{\mathit{pick}_2}" />. In other words, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BnextBit%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{nextBit}}" class="latex" title="{\mathit{nextBit}}" /> is a <em>recursive coalgebra</em>. Then it is a straightforward exercise to prove that <img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Cni+%5Cmathit%7Bpick%7D_2%5C%3Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i \ni \mathit{pick}_2\;i}" class="latex" title="{i \ni \mathit{pick}_2\;i}" />; so although <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}}" class="latex" title="{\mathit{pick}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}_2}" class="latex" title="{\mathit{pick}_2}" /> differ, they are sufficiently similar for our purposes.</p>
<p>
Now we redefine encoding to yield a bit sequence rather than a fraction, and decoding correspondingly to consume that bit sequence: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bencode%7D_1+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5Crightarrow+%5B%5Cmathit%7BBit%7D%5D+%5C%5C+%5Cmathit%7Bencode%7D_1%5C%3Bm+%3D+%5Cmathit%7BtoBits%7D+%5Ccdot+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D+%5Ccdot+%5Cmathit%7BencodeSyms%7D%5C%3Bm+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bdecode%7D_1+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BBit%7D%5D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5C%5C+%5Cmathit%7Bdecode%7D_1%5C%3Bm+%3D+%5Cmathit%7Bdecode%7D_0%5C%3Bm+%5Ccdot+%5Cmathit%7BfromBits%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{encode}_1 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Bit}] \\ \mathit{encode}_1\;m = \mathit{toBits} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \vrule width0pt depth2ex \\ \mathit{decode}_1 :: \mathit{Model} \rightarrow [\mathit{Bit}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_1\;m = \mathit{decode}_0\;m \cdot \mathit{fromBits} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{encode}_1 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Bit}] \\ \mathit{encode}_1\;m = \mathit{toBits} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \vrule width0pt depth2ex \\ \mathit{decode}_1 :: \mathit{Model} \rightarrow [\mathit{Bit}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_1\;m = \mathit{decode}_0\;m \cdot \mathit{fromBits} \end{array} " />
</p></blockquote>
<p> That is, we move the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BfromBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fromBits}}" class="latex" title="{\mathit{fromBits}}" /> part of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}_2}" class="latex" title="{\mathit{pick}_2}" /> from the encoding stage to the decoding stage.</p>
<p></p><h2> Streaming encoding </h2>
<p>
Just like <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_0%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_0}" class="latex" title="{\mathit{encode}_0}" />, the new version <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_1}" class="latex" title="{\mathit{encode}_1}" /> of encoding consumes all of its input before producing any output, so does not work for encoding infinite inputs, nor for streaming execution even on finite inputs. However, it is nearly in the right form to be a <em>metamorphism</em>—a change of representation from lists of symbols to lists of bits. In particular, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnarrow%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{narrow}}" class="latex" title="{\mathit{narrow}}" /> is associative, and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunit%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unit}}" class="latex" title="{\mathit{unit}}" /> is its unit, so we can replace the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldr}}" class="latex" title="{\mathit{foldr}}" /> with a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bencode%7D_1%5C%3Bm+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7BnextBit%7D+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D+%5Ccdot+%5Cmathit%7BencodeSyms%7D%5C%3Bm+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{encode}_1\;m = \mathit{unfoldr}\;\mathit{nextBit} \cdot \mathit{foldl}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m " class="latex" title="\displaystyle \mathit{encode}_1\;m = \mathit{unfoldr}\;\mathit{nextBit} \cdot \mathit{foldl}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m " />
</p></blockquote>
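<p>In Haskell, with intervals represented as pairs of rationals (the names unit, weight and narrow are the post's; the concrete definitions below are my reconstruction from earlier in the series), the monoid laws justifying the foldr-to-foldl switch can be checked directly:</p>

```haskell
-- Sketch: intervals as pairs of rationals; the names unit, weight and
-- narrow are the post's, but these concrete definitions are my
-- reconstruction from earlier in the series.
type Interval = (Rational, Rational)

unit :: Interval
unit = (0, 1)

-- weight i x maps x affinely from the unit interval into i
weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

-- narrow i j (written i ▷ j in the post) focusses i onto the
-- sub-interval of i determined by j
narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)
```

<p>Since narrow is associative with unit as its neutral element, foldr narrow unit and foldl narrow unit agree on finite lists, which is exactly what licenses the switch above.</p>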
<p> Now that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_1}" class="latex" title="{\mathit{encode}_1}" /> is in the right form, we must check the streaming condition for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnarrow%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{narrow}}" class="latex" title="{\mathit{narrow}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BnextBit%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{nextBit}}" class="latex" title="{\mathit{nextBit}}" />. We consider one of the two cases in which <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BnextBit%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{nextBit}}" class="latex" title="{\mathit{nextBit}}" /> is productive, and leave the other as an exercise. When <img src="https://s0.wp.com/latex.php?latex=%7Br+%5Cle+%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{r \le \frac 1 2}" class="latex" title="{r \le \frac 1 2}" />, and assuming <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunit%7D+%5Csupseteq+%28p%2Cq%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unit} \supseteq (p,q)}" class="latex" title="{\mathit{unit} \supseteq (p,q)}" />, we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dcl%7D+%26+%5Cmathit%7BnextBit%7D%5C%3B%28%28l%2Cr%29+%5Cmathbin%7B%5Ctriangleright%7D+%28p%2Cq%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bnarrow%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7BnextBit%7D%5C%3B%28%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bp%2C+%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bq%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%28l%2Cr%29+%5Cni+%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bq+%5Cmbox%7B%2C+so+in+particular+%7D+%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bq+%3C+r+%5Cle+%5Cfrac+1+2+%5C%7D+%5C%5C+%26+%5Cmathit%7BJust%7D%5C%3B%280%2C+%280%2C+%5Cfrac+1+2%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28%28l%2Cr%29+%5Cmathbin%7B%5Ctriangleright%7D+%28p%2Cq%29%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bwiden%7D+%5Cmbox%7B+associates+with+%7D+%5Cmathit%7Bnarrow%7D+%5Cmbox%7B+%28see+below%29%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7BJust%7D%5C%3B%280%2C+%28%280%2C+%5Cfrac+1+2%29+%5Cmathbin%7B%5Ctriangleleft%7D+%28l%2Cr%29%29+%5Cmathbin%7B%5Ctriangleright%7D+%28p%2Cq%29%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}cl} & \mathit{nextBit}\;((l,r) \mathbin{\triangleright} (p,q)) \\ = & \qquad \{ \mathit{narrow} \} \\ & \mathit{nextBit}\;(\mathit{weight}\;(l,r)\;p, \mathit{weight}\;(l,r)\;q) \\ = & \qquad \{ (l,r) \ni \mathit{weight}\;(l,r)\;q \mbox{, so in particular } \mathit{weight}\;(l,r)\;q < r \le \frac 1 2 \} \\ & \mathit{Just}\;(0, (0, \frac 1 2) \mathbin{\triangleleft} ((l,r) \mathbin{\triangleright} (p,q))) \\ = & \qquad \{ \mathit{widen} \mbox{ associates with } \mathit{narrow} \mbox{ (see below)} \} \\ & \mathit{Just}\;(0, ((0, \frac 1 2) \mathbin{\triangleleft} (l,r)) \mathbin{\triangleright} (p,q)) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}cl} & \mathit{nextBit}\;((l,r) \mathbin{\triangleright} (p,q)) \\ = & \qquad \{ \mathit{narrow} \} \\ & \mathit{nextBit}\;(\mathit{weight}\;(l,r)\;p, 
\mathit{weight}\;(l,r)\;q) \\ = & \qquad \{ (l,r) \ni \mathit{weight}\;(l,r)\;q \mbox{, so in particular } \mathit{weight}\;(l,r)\;q < r \le \frac 1 2 \} \\ & \mathit{Just}\;(0, (0, \frac 1 2) \mathbin{\triangleleft} ((l,r) \mathbin{\triangleright} (p,q))) \\ = & \qquad \{ \mathit{widen} \mbox{ associates with } \mathit{narrow} \mbox{ (see below)} \} \\ & \mathit{Just}\;(0, ((0, \frac 1 2) \mathbin{\triangleleft} (l,r)) \mathbin{\triangleright} (p,q)) \end{array} " />
</p></blockquote>
<p> as required. The last step is a kind of associativity property: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++i+%5Cmathbin%7B%5Ctriangleleft%7D+%28j+%5Cmathbin%7B%5Ctriangleright%7D+k%29+%3D+%28i+%5Cmathbin%7B%5Ctriangleleft%7D+j%29+%5Cmathbin%7B%5Ctriangleright%7D+k+&bg=ffffff&fg=000000&s=0" alt="\displaystyle i \mathbin{\triangleleft} (j \mathbin{\triangleright} k) = (i \mathbin{\triangleleft} j) \mathbin{\triangleright} k " class="latex" title="\displaystyle i \mathbin{\triangleleft} (j \mathbin{\triangleright} k) = (i \mathbin{\triangleleft} j) \mathbin{\triangleright} k " />
</p></blockquote>
<p> whose proof is left as another exercise. Therefore the streaming condition holds, and we may fuse the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}}" class="latex" title="{\mathit{unfoldr}}" /> with the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />, defining </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bencode%7D_2+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5Crightarrow+%5B%5Cmathit%7BBit%7D%5D+%5C%5C+%5Cmathit%7Bencode%7D_2%5C%3Bm+%3D+%5Cmathit%7Bstream%7D%5C%3B%5Cmathit%7BnextBit%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D+%5Ccdot+%5Cmathit%7BencodeSyms%7D%5C%3Bm+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{encode}_2 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Bit}] \\ \mathit{encode}_2\;m = \mathit{stream}\;\mathit{nextBit}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{encode}_2 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Bit}] \\ \mathit{encode}_2\;m = \mathit{stream}\;\mathit{nextBit}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \end{array} " />
</p></blockquote>
<p> which streams the encoding process: the initial bits are output as soon as they are fully determined, even before all the input has been read. Note that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_1}" class="latex" title="{\mathit{encode}_1}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_2}" class="latex" title="{\mathit{encode}_2}" /> differ, in particular on infinite inputs (the former diverges, whereas the latter does not); but they coincide on finite symbol sequences.</p>
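<p>For reference, stream is the streaming combinator from earlier in this series: take a production step whenever possible, and otherwise consume one input. The definitions of widen and nextBit below are my reconstruction from the case analysis above (bit 0 when the interval fits in the lower half, bit 1 when it fits in the upper half):</p>

```haskell
type Interval = (Rational, Rational)
type Bit = Int

unit :: Interval
unit = (0, 1)

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

-- widen (written ◁ in the post) is the inverse of narrow:
-- narrow i (widen i j) == j
widen :: Interval -> Interval -> Interval
widen (l, r) (p, q) = ((p - l) / (r - l), (q - l) / (r - l))

-- Emit a bit as soon as the interval fits in one half of the unit
-- interval, then rescale (my reconstruction of nextBit)
nextBit :: Interval -> Maybe (Bit, Interval)
nextBit (l, r)
  | r <= 1/2  = Just (0, widen (0, 1/2) (l, r))
  | 1/2 <= l  = Just (1, widen (1/2, 1) (l, r))
  | otherwise = Nothing

-- The streaming metamorphism combinator: produce while the producer
-- is willing, otherwise consume one more input
stream :: (b -> Maybe (c, b)) -> (b -> a -> b) -> b -> [a] -> [c]
stream f g b xs = case f b of
  Just (c, b') -> c : stream f g b' xs
  Nothing      -> case xs of
    x : xs' -> stream f g (g b x) xs'
    []      -> []
```

<p>Here stream nextBit narrow unit consumes per-symbol intervals and emits each determined bit immediately, so it is productive even on infinite input.</p>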
<h2> Streaming decoding </h2>
<p>
Similarly, we want to be able to stream decoding, so that we don’t have to wait for the entire encoded text to arrive before starting decoding. Recall that we have so far </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bdecode%7D_1%5C%3Bm+%3D+%5Cmathit%7Bdecode%7D_0%5C%3Bm+%5Ccdot+%5Cmathit%7BfromBits%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{decode}_1\;m = \mathit{decode}_0\;m \cdot \mathit{fromBits} " class="latex" title="\displaystyle \mathit{decode}_1\;m = \mathit{decode}_0\;m \cdot \mathit{fromBits} " />
</p></blockquote>
<p> where <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bdecode%7D_0%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{decode}_0}" class="latex" title="{\mathit{decode}_0}" /> is an <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}}" class="latex" title="{\mathit{unfoldr}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BfromBits%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fromBits}}" class="latex" title="{\mathit{fromBits}}" /> a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldr}}" class="latex" title="{\mathit{foldr}}" />. The first obstacle to streaming is the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldr}}" class="latex" title="{\mathit{foldr}}" />, which we need to be a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" /> instead. We have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Ctextstyle+%5Cmathit%7BfromBits%7D+%3D+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bpack%7D%5C%3B%28%5Cfrac+1+2%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \textstyle \mathit{fromBits} = \mathit{foldr}\;\mathit{pack}\;(\frac 1 2) " class="latex" title="\displaystyle \textstyle \mathit{fromBits} = \mathit{foldr}\;\mathit{pack}\;(\frac 1 2) " />
</p></blockquote>
<p> Of course, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpack%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pack}}" class="latex" title="{\mathit{pack}}" /> is not associative—it doesn’t even have the right type for that. But we can view each bit in the input as a function on the unit interval: bit 0 is represented by the function <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3B%280%2C%5Cfrac+1+2%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;(0,\frac 1 2)}" class="latex" title="{\mathit{weight}\;(0,\frac 1 2)}" /> that focusses into the lower half of the unit interval, and bit 1 by the function <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3B%28%5Cfrac+1+2%2C+1%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;(\frac 1 2, 1)}" class="latex" title="{\mathit{weight}\;(\frac 1 2, 1)}" /> that focusses into the upper half. The fold itself composes a sequence of such functions; and since function composition is associative, this can be written equally well as a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldr}}" class="latex" title="{\mathit{foldr}}" /> or a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />. Having assembled the individual focussers into one composite function, we finally apply it to <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2}" class="latex" title="{\frac 1 2}" />. 
(This is in fact an instance of a general trick for turning a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldr}}" class="latex" title="{\mathit{foldr}}" /> into a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />, or vice versa.) Thus, we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Ctextstyle+%5Cmathit%7BfromBits%7D%5C%3Bbs+%3D+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocusf%7D%5C%3B%5Cmathit%7Bid%7D%5C%3Bbs%5C%3B%28%5Cfrac+1+2%29+%5Cquad%5Cmathbf%7Bwhere%7D%5C%3B+%5Cmathit%7Bfocusf%7D%5C%3Bh%5C%3Bb+%3D+h+%5Ccdot+%5Cmathit%7Bweight%7D%5C%3B%28%5Cmathit%7Bhalf%7D%5C%3Bb%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \textstyle \mathit{fromBits}\;bs = \mathit{foldl}\;\mathit{focusf}\;\mathit{id}\;bs\;(\frac 1 2) \quad\mathbf{where}\; \mathit{focusf}\;h\;b = h \cdot \mathit{weight}\;(\mathit{half}\;b) " class="latex" title="\displaystyle \textstyle \mathit{fromBits}\;bs = \mathit{foldl}\;\mathit{focusf}\;\mathit{id}\;bs\;(\frac 1 2) \quad\mathbf{where}\; \mathit{focusf}\;h\;b = h \cdot \mathit{weight}\;(\mathit{half}\;b) " />
</p></blockquote>
<p> where <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bhalf%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{half}}" class="latex" title="{\mathit{half}}" /> yields either the lower or the upper half of the unit interval: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%7D%7B%5Cmathit%7Bhalf%7D+%3A%3A+%5Cmathit%7BBit%7D+%5Crightarrow+%5Cmathit%7BInterval%7D%7D+%5C%5C+%5Cmathit%7Bhalf%7D%5C%3B0+%26%3D%26+%280%2C+%5Cfrac+1+2%29+%5C%5C+%5Cmathit%7Bhalf%7D%5C%3B1+%26%3D%26+%28%5Cfrac+1+2%2C+1%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} \multicolumn{3}{@{}l}{\mathit{half} :: \mathit{Bit} \rightarrow \mathit{Interval}} \\ \mathit{half}\;0 &=& (0, \frac 1 2) \\ \mathit{half}\;1 &=& (\frac 1 2, 1) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} \multicolumn{3}{@{}l}{\mathit{half} :: \mathit{Bit} \rightarrow \mathit{Interval}} \\ \mathit{half}\;0 &=& (0, \frac 1 2) \\ \mathit{half}\;1 &=& (\frac 1 2, 1) \end{array} " />
</p></blockquote>
<p> In fact, not only may the individual bits be seen as focussing functions <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3B%280%2C+%5Cfrac+1+2%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;(0, \frac 1 2)}" class="latex" title="{\mathit{weight}\;(0, \frac 1 2)}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3B%28%5Cfrac+1+2%2C+1%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;(\frac 1 2, 1)}" class="latex" title="{\mathit{weight}\;(\frac 1 2, 1)}" /> on the unit interval, so too may compositions of such functions: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+%5Cmathit%7Bid%7D+%26%3D%26+%5Cmathit%7Bweight%7D%5C%3B%5Cmathit%7Bunit%7D+%5C%5C+%5Cmathit%7Bweight%7D%5C%3Bi+%5Ccdot+%5Cmathit%7Bweight%7D%5C%3Bj+%26%3D%26+%5Cmathit%7Bweight%7D%5C%3B%28i+%5Cmathbin%7B%5Ctriangleright%7D+j%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} \mathit{id} &=& \mathit{weight}\;\mathit{unit} \\ \mathit{weight}\;i \cdot \mathit{weight}\;j &=& \mathit{weight}\;(i \mathbin{\triangleright} j) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} \mathit{id} &=& \mathit{weight}\;\mathit{unit} \\ \mathit{weight}\;i \cdot \mathit{weight}\;j &=& \mathit{weight}\;(i \mathbin{\triangleright} j) \end{array} " />
</p></blockquote>
<p> So any such composition is of the form <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3Bi%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;i}" class="latex" title="{\mathit{weight}\;i}" /> for some interval <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" />, and we may represent it concretely by <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> itself, and retrieve the function via <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}}" class="latex" title="{\mathit{weight}}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Ctextstyle+%5Cmathit%7BfromBits%7D%5C%3Bbs+%3D+%5Cmathit%7Bweight%7D%5C%3B%28%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocus%7D%5C%3B%5Cmathit%7Bunit%7D%5C%3Bbs%29%5C%3B%28%5Cfrac+1+2%29+%5Cquad%5Cmathbf%7Bwhere%7D%5C%3B+%5Cmathit%7Bfocus%7D%5C%3Bi%5C%3Bb+%3D+i+%5Cmathbin%7B%5Ctriangleright%7D+%5Cmathit%7Bhalf%7D%5C%3Bb+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \textstyle \mathit{fromBits}\;bs = \mathit{weight}\;(\mathit{foldl}\;\mathit{focus}\;\mathit{unit}\;bs)\;(\frac 1 2) \quad\mathbf{where}\; \mathit{focus}\;i\;b = i \mathbin{\triangleright} \mathit{half}\;b " class="latex" title="\displaystyle \textstyle \mathit{fromBits}\;bs = \mathit{weight}\;(\mathit{foldl}\;\mathit{focus}\;\mathit{unit}\;bs)\;(\frac 1 2) \quad\mathbf{where}\; \mathit{focus}\;i\;b = i \mathbin{\triangleright} \mathit{half}\;b " />
</p></blockquote>
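<p>Both versions of fromBits can be transcribed and compared, again assuming the pairs-of-rationals representation of intervals:</p>

```haskell
type Interval = (Rational, Rational)
type Bit = Int

unit :: Interval
unit = (0, 1)

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

half :: Bit -> Interval
half 0 = (0, 1/2)
half _ = (1/2, 1)

-- Original right fold: pack b x focusses x into the half chosen by b
fromBitsR :: [Bit] -> Rational
fromBitsR = foldr pack (1/2)
  where pack b x = weight (half b) x

-- Left fold: accumulate the composite focussing interval, then apply
-- weight once at the end, as in the post's final form of fromBits
fromBitsL :: [Bit] -> Rational
fromBitsL bs = weight (foldl focus unit bs) (1/2)
  where focus i b = narrow i (half b)
```

<p>The two agree on all finite bit sequences, but only the left fold exposes an associative accumulation that can later be fused into a metamorphism.</p>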
<p> So we now have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Ctextstyle+%5Cmathit%7Bdecode%7D_1%5C%3Bm+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D%5C%3Bm+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocus%7D%5C%3B%5Cmathit%7Bunit%7D+%5Cquad%5Cmathbf%7Bwhere%7D%5C%3B+%5Cmathit%7Bprepare%7D%5C%3Bm%5C%3Bi+%3D+%28m%2C+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cfrac+1+2%29%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \textstyle \mathit{decode}_1\;m = \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \quad\mathbf{where}\; \mathit{prepare}\;m\;i = (m, \mathit{weight}\;i\;(\frac 1 2)) " class="latex" title="\displaystyle \textstyle \mathit{decode}_1\;m = \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \quad\mathbf{where}\; \mathit{prepare}\;m\;i = (m, \mathit{weight}\;i\;(\frac 1 2)) " />
</p></blockquote>
<p> This is almost in the form of a metamorphism, except for the occurrence of the adapter <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bprepare%7D%5C%3Bm%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{prepare}\;m}" class="latex" title="{\mathit{prepare}\;m}" /> in between the unfold and the fold. It is not straightforward to fuse that adapter with either the fold or the unfold; fortunately, however, we can split it into the composition </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bprepare%7D%5C%3Bm+%3D+%5Cmathit%7Bprepare%7D_2+%5Ccdot+%5Cmathit%7Bprepare%7D_1%5C%3Bm+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{prepare}\;m = \mathit{prepare}_2 \cdot \mathit{prepare}_1\;m " class="latex" title="\displaystyle \mathit{prepare}\;m = \mathit{prepare}_2 \cdot \mathit{prepare}_1\;m " />
</p></blockquote>
<p> of two parts, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%7D%7B%5Cmathit%7Bprepare%7D_1+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5Crightarrow+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29%7D+%5C%5C+%5Cmathit%7Bprepare%7D_1%5C%3Bm%5C%3Bi+%26%3D%26+%28m%2Ci%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%7D%7B%5Cmathit%7Bprepare%7D_2+%3A%3A+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29+%5Crightarrow+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BRational%7D%29%7D+%5C%5C+%5Cmathit%7Bprepare%7D_2%5C%3B%28m%2Ci%29+%26%3D%26+%28m%2C+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cfrac+1+2%29%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} \multicolumn{3}{@{}l}{\mathit{prepare}_1 :: \mathit{Model} \rightarrow \mathit{Interval} \rightarrow (\mathit{Model}, \mathit{Interval})} \\ \mathit{prepare}_1\;m\;i &=& (m,i) \vrule width0pt depth2ex \\ \multicolumn{3}{@{}l}{\mathit{prepare}_2 :: (\mathit{Model}, \mathit{Interval}) \rightarrow (\mathit{Model}, \mathit{Rational})} \\ \mathit{prepare}_2\;(m,i) &=& (m, \mathit{weight}\;i\;(\frac 1 2)) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} \multicolumn{3}{@{}l}{\mathit{prepare}_1 :: \mathit{Model} \rightarrow \mathit{Interval} \rightarrow (\mathit{Model}, \mathit{Interval})} \\ \mathit{prepare}_1\;m\;i &=& (m,i) \vrule width0pt depth2ex \\ \multicolumn{3}{@{}l}{\mathit{prepare}_2 :: (\mathit{Model}, \mathit{Interval}) \rightarrow (\mathit{Model}, \mathit{Rational})} \\ \mathit{prepare}_2\;(m,i) &=& (m, \mathit{weight}\;i\;(\frac 1 2)) \end{array} " />
</p></blockquote>
<p> in such a way that the first part <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bprepare%7D_1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{prepare}_1}" class="latex" title="{\mathit{prepare}_1}" /> fuses with the fold and the second part <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bprepare%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{prepare}_2}" class="latex" title="{\mathit{prepare}_2}" /> fuses with the unfold. For fusing the first half of the adapter with the fold, we just need to carry around the additional value <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" /> with the interval being focussed: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bprepare%7D_1%5C%3Bm+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocus%7D%5C%3Bi+%3D+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bmfocus%7D%5C%3B%28m%2Ci%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{prepare}_1\;m \cdot \mathit{foldl}\;\mathit{focus}\;i = \mathit{foldl}\;\mathit{mfocus}\;(m,i) " class="latex" title="\displaystyle \mathit{prepare}_1\;m \cdot \mathit{foldl}\;\mathit{focus}\;i = \mathit{foldl}\;\mathit{mfocus}\;(m,i) " />
</p></blockquote>
<p> where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bmfocus%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29+%5Crightarrow+%5Cmathit%7BBit%7D+%5Crightarrow+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29+%5C%5C+%5Cmathit%7Bmfocus%7D%5C%3B%28m%2Ci%29%5C%3Bb+%3D+%28m%2C+%5Cmathit%7Bfocus%7D%5C%3Bi%5C%3Bb%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{mfocus} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathit{Bit} \rightarrow (\mathit{Model}, \mathit{Interval}) \\ \mathit{mfocus}\;(m,i)\;b = (m, \mathit{focus}\;i\;b) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{mfocus} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathit{Bit} \rightarrow (\mathit{Model}, \mathit{Interval}) \\ \mathit{mfocus}\;(m,i)\;b = (m, \mathit{focus}\;i\;b) \end{array} " />
</p></blockquote>
<p> For fusing the second half of the adapter with the unfold, let us check the fusion condition. We have (exercise!): </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bstep%7D%5C%3B%28%5Cmathit%7Bprepare%7D_2%5C%3B%28m%2C+i%29%29+%3D+%5Cmathit%7Bfmap%7D%5C%3B%5Cmathit%7Bprepare%7D_2%5C%3B%28%5Cmathit%7BJust%7D%5C%3B%28s%2C+%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%2C+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Cmathbin%7B%5Ctriangleleft%7D+i%29%29%29+%5C%5C+%5Cqquad%5Cmathbf%7Bwhere%7D%5C%3Bs+%3D+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cfrac+1+2%29%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{step}\;(\mathit{prepare}_2\;(m, i)) = \mathit{fmap}\;\mathit{prepare}_2\;(\mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i))) \\ \qquad\mathbf{where}\;s = \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\frac 1 2)) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{step}\;(\mathit{prepare}_2\;(m, i)) = \mathit{fmap}\;\mathit{prepare}_2\;(\mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i))) \\ \qquad\mathbf{where}\;s = \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\frac 1 2)) \end{array} " />
</p></blockquote>
<p> where the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfmap%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fmap}}" class="latex" title="{\mathit{fmap}}" /> is the functorial action for the base functor of the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathsf%7BList%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathsf{List}}" class="latex" title="{\mathsf{List}}" /> datatype, applying just to the second component of the optional pair. We therefore define </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bstepi%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cmathit%7BSymbol%7D%2C+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29%29+%5C%5C+%5Cmathit%7Bstepi%7D%5C%3B%28m%2Ci%29+%3D+%5Cmathit%7BJust%7D%5C%3B%28s%2C+%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%2C+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Cmathbin%7B%5Ctriangleleft%7D+i%29%29+%5C%5C+%5Cqquad%5Cmathbf%7Bwhere%7D%5C%3Bs+%3D+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cfrac+1+2%29%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{stepi} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model}, \mathit{Interval})) \\ \mathit{stepi}\;(m,i) = \mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i)) \\ \qquad\mathbf{where}\;s = \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\frac 1 2)) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{stepi} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model}, \mathit{Interval})) \\ \mathit{stepi}\;(m,i) = \mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i)) \\ \qquad\mathbf{where}\;s = \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\frac 1 2)) \end{array} " />
</p></blockquote>
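<p>The post keeps the Model type abstract. To make stepi concrete enough to run, here is a sketch against a hypothetical <em>static</em> model — an association list of symbols to intervals, with a newModel that ignores the symbol (an assumption; the post's models are adaptive):</p>

```haskell
import Data.List (unfoldr)

type Interval = (Rational, Rational)
type Symbol = Char

-- Hypothetical static model: fixed symbol intervals
type Model = [(Symbol, Interval)]

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

-- widen (◁ in the post) is the inverse of narrowing
widen :: Interval -> Interval -> Interval
widen (l, r) (p, q) = ((p - l) / (r - l), (q - l) / (r - l))

encodeSym :: Model -> Symbol -> Interval
encodeSym m s = maybe (error "no such symbol") id (lookup s m)

-- Decode a value to the symbol whose (half-open) interval contains it
decodeSym :: Model -> Rational -> Symbol
decodeSym m x = head [s | (s, (l, r)) <- m, l <= x && x < r]

-- Static model: adaptation is a no-op here
newModel :: Model -> Symbol -> Model
newModel m _ = m

-- stepi as in the post: decode the symbol at the midpoint image,
-- then widen the interval back out by that symbol's encoding
stepi :: (Model, Interval) -> Maybe (Symbol, (Model, Interval))
stepi (m, i) = Just (s, (newModel m s, widen (encodeSym m s) i))
  where s = decodeSym m (weight i (1/2))
```

<p>Note that unfoldr stepi never terminates of its own accord, since stepi always returns a Just; with the toy model, decoding the interval for "ab" recovers those two symbols before the output degenerates.</p>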
<p> and have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bstep%7D%5C%3B%28%5Cmathit%7Bprepare%7D_2%5C%3B%28m%2C+i%29%29+%3D+%5Cmathit%7Bfmap%7D%5C%3B%5Cmathit%7Bprepare%7D_2%5C%3B%28%5Cmathit%7Bstepi%7D%5C%3B%28m%2Ci%29%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{step}\;(\mathit{prepare}_2\;(m, i)) = \mathit{fmap}\;\mathit{prepare}_2\;(\mathit{stepi}\;(m,i)) " class="latex" title="\displaystyle \mathit{step}\;(\mathit{prepare}_2\;(m, i)) = \mathit{fmap}\;\mathit{prepare}_2\;(\mathit{stepi}\;(m,i)) " />
</p></blockquote>
<p> and therefore </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D_2+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2 = \mathit{unfoldr}\;\mathit{stepi} " class="latex" title="\displaystyle \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2 = \mathit{unfoldr}\;\mathit{stepi} " />
</p></blockquote>
<p> Note that the right-hand side will eventually lead to intervals that exceed the unit interval. When <img src="https://s0.wp.com/latex.php?latex=%7Bj+%5Csupseteq+i%7D&bg=ffffff&fg=000000&s=0" alt="{j \supseteq i}" class="latex" title="{j \supseteq i}" />, it follows that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunit%7D+%5Csupseteq+j+%5Cmathbin%7B%5Ctriangleleft%7D+i%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unit} \supseteq j \mathbin{\triangleleft} i}" class="latex" title="{\mathit{unit} \supseteq j \mathbin{\triangleleft} i}" />; but the unfolding process keeps widening the interval without bound, so it will necessarily eventually exceed the unit bounds. We return to this point shortly.</p>
<p>
We have therefore concluded that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+%5Cmathit%7Bdecode%7D_1%5C%3Bm+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D%5C%3Bm+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocus%7D%5C%3B%5Cmathit%7Bunit%7D+%5C%5C+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D_2+%5Ccdot+%5Cmathit%7Bprepare%7D_1%5C%3Bm+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bfocus%7D%5C%3B%5Cmathit%7Bunit%7D+%5C%5C+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bmfocus%7D%5C%3B%28m%2C%5Cmathit%7Bunit%7D%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} \mathit{decode}_1\;m &=& \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \\ &=& \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2 \cdot \mathit{prepare}_1\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \\ &=& \mathit{unfoldr}\;\mathit{stepi} \cdot \mathit{foldl}\;\mathit{mfocus}\;(m,\mathit{unit}) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} \mathit{decode}_1\;m &=& \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \\ &=& \mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2 \cdot \mathit{prepare}_1\;m \cdot \mathit{foldl}\;\mathit{focus}\;\mathit{unit} \\ &=& \mathit{unfoldr}\;\mathit{stepi} \cdot \mathit{foldl}\;\mathit{mfocus}\;(m,\mathit{unit}) \end{array} " />
</p></blockquote>
<p>
Now we need to check the streaming condition for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmfocus%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{mfocus}}" class="latex" title="{\mathit{mfocus}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stepi}}" class="latex" title="{\mathit{stepi}}" />. Unfortunately, this is never going to hold: <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stepi}}" class="latex" title="{\mathit{stepi}}" /> is always productive, so <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%5C%3B%5Cmathit%7Bstepi%7D%5C%3B%5Cmathit%7Bmfocus%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}\;\mathit{stepi}\;\mathit{mfocus}}" class="latex" title="{\mathit{stream}\;\mathit{stepi}\;\mathit{mfocus}}" /> will only take production steps and never consume any input. The problem is that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;\mathit{stepi}}" class="latex" title="{\mathit{unfoldr}\;\mathit{stepi}}" /> is too aggressive, and we need to use the more cautious flushing version of streaming instead. 
Informally, the streaming process should be productive from a given state <img src="https://s0.wp.com/latex.php?latex=%7B%28m%2Ci%29%7D&bg=ffffff&fg=000000&s=0" alt="{(m,i)}" class="latex" title="{(m,i)}" /> only when the whole of interval <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> maps to the same symbol in model <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" />, so that however <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> is focussed by subsequent inputs, that symbol cannot be invalidated.</p>
<p>
More formally, note that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D+%3D+%5Cmathit%7Bapo%7D%5C%3B%5Cmathit%7Bsafestepi%7D%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{unfoldr}\;\mathit{stepi} = \mathit{apo}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{stepi}) " class="latex" title="\displaystyle \mathit{unfoldr}\;\mathit{stepi} = \mathit{apo}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{stepi}) " />
</p></blockquote>
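<p>Here apo is the standard apomorphism on lists: unfold with the cautious coalgebra while it is productive, and when it gives up, finish with the given continuation. A minimal sketch, with a toy producer standing in for stepi:</p>

```haskell
import Data.List (unfoldr)

-- Apomorphism: unfold with f while productive, then hand the final
-- state to h for the (possibly infinite) remainder
apo :: (b -> Maybe (c, b)) -> (b -> [c]) -> b -> [c]
apo f h b = case f b of
  Just (c, b') -> c : apo f h b'
  Nothing      -> h b

-- Toy producer standing in for stepi
countUp :: Int -> Maybe (Int, Int)
countUp n = if n < 5 then Just (n, n + 1) else Nothing

-- A "safe" variant: agrees with countUp where productive, Nothing
-- elsewhere, mirroring safestepi's relation to stepi
cautious :: Int -> Maybe (Int, Int)
cautious n = if n < 3 then countUp n else Nothing
```

<p>Whenever the cautious coalgebra either agrees with the full one or returns Nothing, the identity unfoldr f = apo safef (unfoldr f) holds, which is the shape of the equation above.</p>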
<p> where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bsafestepi%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cmathit%7BSymbol%7D%2C+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BInterval%7D%29%29+%5C%5C+%5Cmathit%7Bsafestepi%7D%5C%3B%28m%2Ci%29+%5C%5C+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%5Cquad%7Dclcl%7D+%7C+%26+%5Cmathit%7Bsafe%7D%5C%3B%28m%2Ci%29+%26%3D%26+%5Cmathit%7Bstepi%7D%5C%3B%28m%2Ci%29+%5C%5C+%7C+%26+%5Cmathbf%7Botherwise%7D+%26%3D%26+%5Cmathit%7BNothing%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{safestepi} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model}, \mathit{Interval})) \\ \mathit{safestepi}\;(m,i) \\ \begin{array}[t]{@{\quad}clcl} | & \mathit{safe}\;(m,i) &=& \mathit{stepi}\;(m,i) \\ | & \mathbf{otherwise} &=& \mathit{Nothing} \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{safestepi} :: (\mathit{Model}, \mathit{Interval}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model}, \mathit{Interval})) \\ \mathit{safestepi}\;(m,i) \\ \begin{array}[t]{@{\quad}clcl} | & \mathit{safe}\;(m,i) &=& \mathit{stepi}\;(m,i) \\ | & \mathbf{otherwise} &=& \mathit{Nothing} \end{array} \end{array} " />
</p></blockquote>
<p> and </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bsafe%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C%5Cmathit%7BInterval%7D%29+%5Crightarrow+%5Cmathit%7BBool%7D+%5C%5C+%5Cmathit%7Bsafe%7D%5C%3B%28m%2C+i%29+%3D+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Csupseteq+i+%5Cquad%5Cmathbf%7Bwhere%7D%5C%3B+s+%3D+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bmidpoint%7D%5C%3Bi%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{safe} :: (\mathit{Model},\mathit{Interval}) \rightarrow \mathit{Bool} \\ \mathit{safe}\;(m, i) = \mathit{encodeSym}\;m\;s \supseteq i \quad\mathbf{where}\; s = \mathit{decodeSym}\;m\;(\mathit{midpoint}\;i) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{safe} :: (\mathit{Model},\mathit{Interval}) \rightarrow \mathit{Bool} \\ \mathit{safe}\;(m, i) = \mathit{encodeSym}\;m\;s \supseteq i \quad\mathbf{where}\; s = \mathit{decodeSym}\;m\;(\mathit{midpoint}\;i) \end{array} " />
</p></blockquote>
<p> That is, the interval <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> is “safe” for model <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" /> if it is fully included in the encoding of some symbol <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" />; then all elements of <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> decode to <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" />. Then, and only then, we may commit to outputting <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" />, because no further input bits could lead to a different first output symbol. </p>
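In Haskell, the apomorphism used here takes a step function and a finisher: it produces elements while the step succeeds, and hands the remaining state to the finisher otherwise. The following is a sketch consistent with the equation above (the toy `countdown` producer is purely illustrative):

```haskell
-- An apomorphism for lists: produce elements while f succeeds; when f
-- returns Nothing, finish by running the fallback producer g on the
-- remaining state.  With f = safestepi and g = unfoldr stepi this is
-- the right-hand side of the equation above.
apo :: (b -> Maybe (a, b)) -> (b -> [a]) -> b -> [a]
apo f g b = case f b of
  Just (a, b') -> a : apo f g b'
  Nothing      -> g b

-- A toy instance: count down while positive, then "flush" two zeroes.
countdown :: Int -> [Int]
countdown = apo (\n -> if n > 0 then Just (n, n - 1) else Nothing)
                (const [0, 0])
```

For example, `countdown 3` yields `[3,2,1,0,0]`: the step fires three times, and the finisher supplies the tail.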
<p>
Note now that the interval remains bounded by the unit interval during the streaming phase, because of the safety check in <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bsafestepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{safestepi}}" class="latex" title="{\mathit{safestepi}}" />, although it will still exceed the unit interval during the flushing phase. However, at this point we can undo the fusion we performed earlier, “<a href="http://www.cs.ox.ac.uk/publications/publication1470-abstract.html">fissioning</a>” <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;\mathit{stepi}}" class="latex" title="{\mathit{unfoldr}\;\mathit{stepi}}" /> into <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D_2%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2}" class="latex" title="{\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2}" /> again: this manipulates rationals rather than intervals, so there is no problem with intervals getting too wide. We therefore have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bdecode%7D_1%5C%3Bm+%3D+%5Cmathit%7Bapo%7D%5C%3B%5Cmathit%7Bsafestepi%7D%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D_2%29+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bmfocus%7D%5C%3B%28m%2C%5Cmathit%7Bunit%7D%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{decode}_1\;m = \mathit{apo}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2) \cdot \mathit{foldl}\;\mathit{mfocus}\;(m,\mathit{unit}) " class="latex" title="\displaystyle \mathit{decode}_1\;m = \mathit{apo}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2) \cdot \mathit{foldl}\;\mathit{mfocus}\;(m,\mathit{unit}) " />
</p></blockquote>
<p>
Now let us check the streaming condition for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmfocus%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{mfocus}}" class="latex" title="{\mathit{mfocus}}" /> and the more cautious <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bsafestepi%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{safestepi}}" class="latex" title="{\mathit{safestepi}}" />. Suppose that <img src="https://s0.wp.com/latex.php?latex=%7B%28m%2Ci%29%7D&bg=ffffff&fg=000000&s=0" alt="{(m,i)}" class="latex" title="{(m,i)}" /> is a productive state, so that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bsafe%7D%5C%3B%28m%2Ci%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{safe}\;(m,i)}" class="latex" title="{\mathit{safe}\;(m,i)}" /> holds, that is, all of interval <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> is mapped to the same symbol in <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" />, and let </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+s+%26%3D%26+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bmidpoint%7D%5C%3Bi%29+%5C%5C+m%27+%26%3D%26+%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs+%5C%5C+i%27+%26%3D%26+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Cmathbin%7B%5Ctriangleleft%7D+i+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} s &=& \mathit{decodeSym}\;m\;(\mathit{midpoint}\;i) \\ m' &=& \mathit{newModel}\;m\;s \\ i' &=& \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} s &=& \mathit{decodeSym}\;m\;(\mathit{midpoint}\;i) \\ m' &=& \mathit{newModel}\;m\;s \\ i' &=& \mathit{encodeSym}\;m\;s \mathbin{\triangleleft} i \end{array} " />
</p></blockquote>
<p> so that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bsafestepi%7D%5C%3B%28m%2Ci%29+%3D+%5Cmathit%7BJust%7D%5C%3B%28s%2C+%28m%27%2Ci%27%29%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{safestepi}\;(m,i) = \mathit{Just}\;(s, (m',i'))}" class="latex" title="{\mathit{safestepi}\;(m,i) = \mathit{Just}\;(s, (m',i'))}" />. Consuming the next input <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" /> leads to state <img src="https://s0.wp.com/latex.php?latex=%7B%28m%2C+%5Cmathit%7Bfocus%7D%5C%3Bi%5C%3Bb%29%7D&bg=ffffff&fg=000000&s=0" alt="{(m, \mathit{focus}\;i\;b)}" class="latex" title="{(m, \mathit{focus}\;i\;b)}" />. This too is a productive state, because <img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Csupseteq+%5Cmathit%7Bfocus%7D%5C%3Bi%5C%3Bb%7D&bg=ffffff&fg=000000&s=0" alt="{i \supseteq \mathit{focus}\;i\;b}" class="latex" title="{i \supseteq \mathit{focus}\;i\;b}" /> for any <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" />, and so the whole of the focussed interval is also mapped to the same symbol <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" /> in the model. In particular, the midpoint of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfocus%7D%5C%3Bi%5C%3Bb%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{focus}\;i\;b}" class="latex" title="{\mathit{focus}\;i\;b}" /> is within interval <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" />, and so the first symbol produced from the state after consumption coincides with the symbol <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" /> produced from the state before consumption. That is, </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bsafestepi%7D%5C%3B%28%5Cmathit%7Bmfocus%7D%5C%3B%28m%2Ci%29%5C%3Bb%29+%3D+%5Cmathit%7BJust%7D%5C%3B%28s%2C+%5Cmathit%7Bmfocus%7D%5C%3B%28m%27%2C+i%27%29%5C%3Bb%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{safestepi}\;(\mathit{mfocus}\;(m,i)\;b) = \mathit{Just}\;(s, \mathit{mfocus}\;(m', i')\;b) " class="latex" title="\displaystyle \mathit{safestepi}\;(\mathit{mfocus}\;(m,i)\;b) = \mathit{Just}\;(s, \mathit{mfocus}\;(m', i')\;b) " />
</p></blockquote>
<p> as required. We can therefore rewrite decoding as a flushing stream computation: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bdecode%7D_2+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BBit%7D%5D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5C%5C+%5Cmathit%7Bdecode%7D_2%5C%3Bm+%3D+%5Cmathit%7Bfstream%7D%5C%3B%5Cmathit%7Bsafestepi%7D%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5Ccdot+%5Cmathit%7Bprepare%7D_2%29%5C%3B%5Cmathit%7Bmfocus%7D%5C%3B%28m%2C%5Cmathit%7Bunit%7D%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{decode}_2 :: \mathit{Model} \rightarrow [\mathit{Bit}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_2\;m = \mathit{fstream}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2)\;\mathit{mfocus}\;(m,\mathit{unit}) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{decode}_2 :: \mathit{Model} \rightarrow [\mathit{Bit}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_2\;m = \mathit{fstream}\;\mathit{safestepi}\;(\mathit{unfoldr}\;\mathit{step} \cdot \mathit{prepare}_2)\;\mathit{mfocus}\;(m,\mathit{unit}) \end{array} " />
</p></blockquote>
<p> That is, initial symbols are output as soon as they are completely determined, even before all the input bits have been read. This agrees with <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bdecode%7D_1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{decode}_1}" class="latex" title="{\mathit{decode}_1}" /> on finite bit sequences.</p>
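The shape of the flushing stream operator used in the definition of decode_2 can be sketched in Haskell as follows (an illustrative reconstruction, not a verbatim library definition: produce via the cautious step whenever it succeeds, otherwise consume the next input, and flush the final state when input is exhausted):

```haskell
-- fstream f g h: emit outputs from the state via f whenever possible;
-- otherwise consume one input via h; when input is exhausted, flush
-- the remaining state with g.
fstream :: (s -> Maybe (a, s)) -> (s -> [a]) -> (s -> b -> s) -> s -> [b] -> [a]
fstream f g h s bs = case f s of
  Just (a, s') -> a : fstream f g h s' bs
  Nothing      -> case bs of
    []      -> g s
    b : bs' -> fstream f g h (h s b) bs'

-- A toy instance: buffer inputs, emitting an element only once its
-- successor has arrived (one element of lookahead), flushing the rest.
buffered :: [Int] -> [Int]
buffered = fstream pop id (\s b -> s ++ [b]) []
  where pop (x : y : zs) = Just (x, y : zs)
        pop _            = Nothing
```

Here `buffered [1,2,3]` returns `[1,2,3]`, but each element is emitted as soon as its successor arrives, mirroring how decode_2 emits symbols before all the input bits have been read.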
<p></p><h2> Fixed-precision arithmetic </h2>
<p>
We will leave arithmetic coding at this point. There is actually still quite a bit more arithmetic required—in particular, for competitive performance it is important to use only <em>fixed-precision</em> arithmetic, restricting attention to rationals within the unit interval with denominator <img src="https://s0.wp.com/latex.php?latex=%7B2%5Ek%7D&bg=ffffff&fg=000000&s=0" alt="{2^k}" class="latex" title="{2^k}" /> for some fixed <img src="https://s0.wp.com/latex.php?latex=%7Bk%7D&bg=ffffff&fg=000000&s=0" alt="{k}" class="latex" title="{k}" />. In order to be able to multiply two numerators using 32-bit integer arithmetic without the risk of overflow, we can have at most <img src="https://s0.wp.com/latex.php?latex=%7Bk%3D15%7D&bg=ffffff&fg=000000&s=0" alt="{k=15}" class="latex" title="{k=15}" />. Interval narrowing now needs to be <em>approximate</em>, rounding down both endpoints to integer multiples of <img src="https://s0.wp.com/latex.php?latex=%7B2%5E%7B-k%7D%7D&bg=ffffff&fg=000000&s=0" alt="{2^{-k}}" class="latex" title="{2^{-k}}" />. Care needs to be taken so that this rounding never makes the two endpoints of an interval coincide. Still, encoding can be written as an instance of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}}" class="latex" title="{\mathit{stream}}" />. Decoding appears to be more difficult: the approximate arithmetic means that we no longer have interval widening as an exact inverse of narrowing, so the approach above no longer works. Instead, our <a href="http://www.cs.ox.ac.uk/publications/publication2333-abstract.html">2002 lecture notes</a> introduce a “destreaming” operator that simulates and inverts streaming: the decoder works in sympathy with the encoder, performing essentially the same interval arithmetic but doing the opposite conversions. Perhaps I will return to complete that story some time…</p>
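For concreteness, the approximate narrowing step might look like the following hypothetical sketch (the name `narrow` and the guard against coinciding endpoints are mine, not the notes' definitions): endpoints are stored as integers scaled by 2^k, both are rounded down, and the guard keeps the interval non-empty.

```haskell
k :: Int
k = 15

-- Endpoints scaled by 2^k: an interval (l, r) denotes [l/2^k, r/2^k).
type FInterval = (Int, Int)

-- Approximate narrowing: focus the subinterval (p, q) of the unit
-- interval (also scaled by 2^k) inside (l, r), rounding both new
-- endpoints down, while ensuring the result never becomes empty.
narrow :: FInterval -> FInterval -> FInterval
narrow (l, r) (p, q) = (l', max (l' + 1) r')
  where w  = r - l
        l' = l + (w * p) `div` 2 ^ k
        r' = l + (w * q) `div` 2 ^ k
```

With k = 15 the products fit comfortably in 32-bit arithmetic, as noted above; the `max (l' + 1)` clause is one naive way to keep the two endpoints from coinciding.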
Mon, 11 Dec 2017 13:32:35 +0000Philip Wadler: Simplicity and Michelsontag:blogger.com,1999:blog-9757377.post-6652402548002460330
http://wadler.blogspot.com/2017/12/simplicity-and-michelson.html
<div style="clear: both; text-align: center;" class="separator"><a style="margin-left: 1em; margin-right: 1em;" href="https://1.bp.blogspot.com/-fjz4u3UagOU/WiqFgPsdjwI/AAAAAAAAmSk/f-BlLX9swoAMvtDsKDedJp05mMvnYEHCACLcBGAs/s1600/simplicity.jpg"><img src="https://1.bp.blogspot.com/-fjz4u3UagOU/WiqFgPsdjwI/AAAAAAAAmSk/f-BlLX9swoAMvtDsKDedJp05mMvnYEHCACLcBGAs/s640/simplicity.jpg" height="426" border="0" width="640" /></a></div><h2>Simplicity</h2>Only once in my life have I encountered a programming language that was <i>too simple</i> to use. That was Lispkit Lisp, developed by Peter Henderson, Geraint Jones, and Simon Jones, which I saw while serving as a postdoc at Oxford, 1983–87, and which despite its simplicity was used to implement an entire operating system. It is an indictment of the field of programming languages that I have not since encountered another system that I consider <i>too simple</i>. Until today. I can now add a second system to the list of those that are <i>too simple</i>, the appropriately-titled Simplicity, developed by Russell O'Connor of Blockstream. It is described by a paper <a href="https://blockstream.com/simplicity.pdf">here</a> and a website <a href="https://blockstream.com/2017/10/30/simplicity.html">here</a>. <br />The core of Simplicity consists of just nine combinators: three for products (<code>pair</code>, <code>take</code>, and <code>drop</code>), three for sums (<code>injl</code>, <code>injr</code>, and <code>case</code>), one for unit (<code>unit</code>), and two for plumbing (<code>iden</code> and <code>comp</code>). It is thoroughly grounded in ideas from the functional programming, programming language, and formal methods communities. <br />When I call Simplicity <i>too simple</i> it is intended as a compliment. It is delightful to see full adders and cryptographic hash functions cobbled together using just products, sums, and units. 
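To make the flavour concrete, here is a hedged Haskell sketch of denotations for those nine combinators (Simplicity terms denote functions between types built from unit, products, and sums; the primed names `take'` and `drop'`, the name `match` for <code>case</code>, and the `notB` example are mine, chosen to avoid clashes with Haskell keywords):

```haskell
-- Plumbing: identity and (diagrammatic) composition.
iden :: a -> a
iden = id

comp :: (a -> b) -> (b -> c) -> (a -> c)
comp f g = g . f

-- The unique map into the unit type.
unit :: a -> ()
unit _ = ()

-- Sums: left and right injection, and case analysis on a tagged pair.
injl :: (a -> b) -> (a -> Either b c)
injl f = Left . f

injr :: (a -> c) -> (a -> Either b c)
injr f = Right . f

match :: ((a, c) -> d) -> ((b, c) -> d) -> ((Either a b, c) -> d)
match s _ (Left  a, c) = s (a, c)
match _ t (Right b, c) = t (b, c)

-- Products: pairing, and projection-then-continue.
pair :: (a -> b) -> (a -> c) -> (a -> (b, c))
pair f g a = (f a, g a)

take' :: (a -> c) -> ((a, b) -> c)
take' f = f . fst

drop' :: (b -> c) -> ((a, b) -> c)
drop' f = f . snd

-- Bits as a sum of units, and negation cobbled together from the above.
type Bit = Either () ()

notB :: (Bit, ()) -> Bit
notB = match (drop' (injr iden)) (drop' (injl iden))
```

Even this tiny example shows the style: no variables, no recursion, just wiring between products, sums, and units.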
It is eye-opening to see how far one can get without recursion or iteration, and how this enables simple analyses of the time and space required to execute a program. It is a confirmation to see a system with foundations in category theory and sequent calculus. Now I know what to say when developers respond to my talk "Categories for the Working Hacker" by asking "But how can we use this in practice?" <br />The system is accompanied by a proof of its correctness in Coq, which sets a high bar for competing systems. O'Connor even claims to have a proof in Coq that the Simplicity implementation of SHA-256 matches the reference specification provided by Andrew Appel's Verified Software Toolchain project (VST), which VST proved corresponds to the OpenSSL implementation of SHA-256 in C. <br />At IOHK, I have been involved in the design of Plutus Core, our own smart contract scripting language, working with Darryl McAdams, Duncan Coutts, Simon Thompson, Pablo Lamela Seijas, and Grigore Rosu and his semantics team. We have a formal specification which we are preparing for release. O'Connor's work on Simplicity has caused us to rethink our own work: what can we do to make it simpler? Thank you, Russell! <br />That said, Simplicity is still <i>too simple</i>, and despite its emphasis on rigour there are some gaps in its description. <br /><h3>Jets</h3>A 256-bit full adder is expressed with 27,348 combinators, meaning addition in Simplicity requires several orders of magnitude more work than the four 64-bit addition instructions one would normally use. Simplicity proposes a solution: any commonly used sequence of instructions may be abbreviated as a "<i>jet</i>", and implemented in any equivalent manner. Hence, the 27,348 combinators for the 256-bit full adder can be ignored, and replaced by the equivalent four 64-bit additions. <br />All well and good, but this is where it gets <i>too simple</i>. No one can afford to be inefficient by several orders of magnitude. 
Hence, any programmer will need to know what jets exist and to exploit them whenever possible. In this sense, Simplicity is misleadingly simple. It would be clearer and cleaner to define each jet as an opcode. Each opcode could still be specified by its equivalent in the other combinators of Simplicity, but programs would be more compact, faster to execute, and—most important—easier to read, understand, and analyse accurately. If one ignores jets, the analyses of time and space required to execute a program, given toward the end of the paper, will be useless—off by orders of magnitude. The list of defined jets is given nowhere in the paper. Nor could I spot additional information on Simplicity linked to from its web page or findable by a web search. More needs to be done before Simplicity can be used in practice. <br /><h3>Gaps</h3>It's not just the definition of jets which is absent from the paper, and cannot be found elsewhere on the web. Lots more remains to be supplied. <br /><ul><li>Sections 2.4, 2.5, 3.2 claim proofs in Coq, but apart from defining the semantics of the nine combinators in Appendix A, no Coq code is available for scrutiny. </li><li>Section 2.5 claims a representation of Simplicity terms as a dag, but it is not specified. Lacking this, there is no standard way to exchange code written in Simplicity. </li><li>Section 4.4 defines an extended semantics for Simplicity that can read the signature of the current transaction, support Merklised abstract syntax trees, and fail when a transaction does not validate. It also lifts meanings of core (unextended) Simplicity programs to the extended semantics. However, it says nothing about how the seven combinators that combine smaller Simplicity programs into bigger ones act in the extended semantics! It's not hard to guess the intended definitions, but worrying that they were omitted from a paper that aims for rigour. 
</li><li>Section 3 provides a Bit Machine to model the space and time required to execute Simplicity. The model is of limited use, since it ignores the several orders of magnitude improvement offered by jets. Further, the Bit Machine has ten instructions, enumerated on pages 10–12, but the list omits the vital "case" instruction which appears in Figure 2. Again, it's not hard to guess, but worrying it was omitted. </li></ul><h2>Michelson</h2>A second language for scripting blockchains is Michelson. It is described by a paper <a href="https://tezos.com/static/papers/language.pdf">here</a> and a website <a href="https://www.michelson-lang.com/">here</a>. (Oddly, the website fails to link to the paper.) <br />I will offer just one word on Michelson. The word is: "Why?" <br />Michelson takes many ideas from the functional programming community, including higher-order functions, data structures such as lists and maps, and static type safety. Currently, it is also much more thoroughly described and documented than Simplicity. All of this is to be commended. <br />But Michelson is an inexplicably low-level language, requiring the programmer to explicitly manipulate a stack. Perhaps this was done so that there is an obvious machine model, but Simplicity offers a far superior solution: a high-level model for programming, which compiles to a low-level model (the Bit Machine) to explicate time and space costs. <br />Or perhaps Michelson is low-level to improve efficiency. Most of the cost of evaluating a smart contract is in cryptographic primitives. The rest is cheap, whether compiled or interpreted. Saving a few pennies of electricity by adopting an error-prone language—where there is a risk of losing millions of dollars in an exploit—is a false economy indeed. Premature optimisation is the root of all evil. <br />The language looks a bit like all the bad parts of Forth and Lisp, without the unity that makes each of those languages a classic. 
Lisp idioms such as <code>CAAR</code> and <code>CDADAR</code> are retained, with new ones like <code>DUUP</code>, <code>DIIIIP</code>, and <code>PAAIAIAAIR</code> thrown in. <br />There is a fair set of built-in datatypes, including strings, signed and unsigned integers, unit, product, sum, options, lists, sets, maps, and higher-order functions. But there is no way for users to define their own data types. There is no way to name a variable or a routine; everything must be accessed by navigating a data structure on the stack. <br />Some operations are specified formally, but others are left informal. For lists, we are given formal rewriting rules for the first three operators (CONS, NIL, IF_CONS) but not the last two (MAP, REDUCE). Type rules are given in detail, but the process of type inference is not described, leaving me with some questions about which programs are well typed and which are not. It reminds me of a standard problem one sees in early work by students—the easy parts are thoroughly described, but the hard parts are glossed over. <br />If I have understood correctly, the inference rules assign types that are monomorphic, meaning each term has exactly one type. This omits one of the most successful ideas in functional programming, polymorphic routines that act on many types. It means back to the bad old days of Pascal, where one has to write one routine to sort a list of integers and a different routine to sort a list of strings. <br />Several of these shortcomings are also shared by Simplicity. But whereas Simplicity is intended as a compilation target, not to be read by humans, the Michelson documentation includes a large collection of examples suggesting it is intended for humans to write and read. <br />Here is one of the simpler examples from the paper. 
<br /><pre> { DUP ; CDAAR ; # T<br /> NOW ;<br /> COMPARE ; LE ;<br /> IF { DUP ; CDADR ; # N<br /> BALANCE ;<br /> COMPARE ; LE ;<br /> IF { CDR ; UNIT ; PAIR }<br /> { DUP ; CDDDR ; # B<br /> BALANCE ; UNIT ;<br /> DIIIP { CDR } ;<br /> TRANSFER_TOKENS ;<br /> PAIR } }<br /> { DUP ; CDDAR ; # A<br /> BALANCE ;<br /> UNIT ;<br /> DIIIP { CDR } ;<br /> TRANSFER_TOKENS ;<br /> PAIR } }<br /></pre>The comment <code># T</code> is inserted as a reminder that <code>CDAAR</code> extracts variable <code>T</code>, and similarly for the other variables <code>N</code>, <code>B</code>, and <code>A</code>. This isn't the 1950s. Why don't we write <code>T</code> when we mean <code>T</code>, instead of <code>CDAAR</code>? WHY ARE WE WRITING IN ALL CAPS? <br />In short, Michelson is a bizarre mix of some of the best and worst of computing. <br /><h2>Conclusion</h2>It is exciting to see ideas from the functional programming, programming languages, and formal methods communities gaining traction among cryptocurrencies and blockchains. While there are shortcomings, it is fantastic to see an appreciation of how these techniques can be applied to increase reliability—something which the multi-million dollar exploits against Ethereum show is badly needed. I look forward to participating in the conversations that ensue!<br /><h3>Postscript</h3>The conversation has begun! Tezos have put up a page to explain <a href="http://www.michelson-lang.com/why-michelson.html">Why Michelson</a>. I've also learned there is a higher-level language intended to compile into Michelson, called <a href="http://www.liquidity-lang.org/">Liquidity</a>.Mon, 11 Dec 2017 09:37:51 +0000noreply@blogger.com (Philip Wadler)Mark Jason Dominus: Legal Nerdsnipingtag:,2017:/law/volokh-decision-roundup
https://blog.plover.com/law/volokh-decision-roundup.html
<p><a href="https://www.washingtonpost.com/news/volokh-conspiracy/">The Volokh
Conspiracy</a> is
a frequently-updated blog about legal issues. It reports on
interesting upcoming court cases and recent court decisions and
sometimes carries thoughtful and complex essays on legal theory. It
is hosted by, but not otherwise affiliated with, the <em>Washington
Post</em>.</p>
<p>Volokh periodically carries a “roundup of recent federal court
decisions”, each with an intriguing one-paragraph summary and a link
to the relevant documents, usually to the opinion itself. I love
reading federal circuit court opinions. They are almost always
carefully thought out and clearly-written. Even when I disagree with
the decision, I almost always concede that the judges have a point.
It often happens that I read the decision and say “of course that is
how it must be decided, nobody could disagree with that”, and then I
read the dissenting opinion and I say exactly the same thing. Then I
rub my forehead and feel relieved that I'm not a federal circuit court
judge.</p>
<p>This is true of U.S. Supreme Court decisions also. Back when I had
more free time I would sometimes visit <a href="https://www.supremecourt.gov/opinions/slipopinion/16">the listing of all recent
decisions</a> and
pick out some at random to read. They were almost always really
interesting. When you read the newspaper about these decisions, the
newspaper always wants to make the issue simple and usually tribal.
(“Our readers are on the (Red / Blue) Team, and the (Red / Blue) Team
loves mangel-wurzels. Justice Furter voted against mangel-wurzels,
that is because he is a very bad man who hates liberty! Rah rah
team!”) The <em>actual</em> Supreme Court is almost always better than this.</p>
<p>For example we have <a href="https://supreme.justia.com/cases/federal/us/545/1/dissent2.html">Clarence Thomas's wonderful
dissent</a>
in the case of <a href="https://en.wikipedia.org/wiki/Gonzales_v._Raich">Gonzales
v. Raich</a>. Raich was
using marijuana for her personal medical use in California, where
medical marijuana had been legal for years. The DEA confiscated and
destroyed her supplier's plants. But the Constitution only gives
Congress the right to regulate <em>interstate</em> commerce. This marijuana
had been grown in California by a Californian, for use in California
by a Californian, in accordance with California law, and had never
crossed any state line. In a 6–3 decision, the court found that the
relevant laws were nevertheless a permitted exercise of Congress's
power to regulate commerce. You might have expected Justice Thomas to
vote against marijuana. But he did not:</p>
<blockquote>
<p>If the majority is to be taken seriously, the Federal Government may
now regulate quilting bees, clothes drives, and potluck suppers
throughout the 50 States. This makes a mockery of Madison’s
assurance to the people of New York that the “powers delegated” to
the Federal Government are “few and defined,” while those of the
States are “numerous and indefinite.” </p>
</blockquote>
<p>Thomas may not be a fan of marijuana, but he is even less a fan of
federal overreach and abuse of the Commerce Clause. These nine people
are much more complex than the newspapers would have you believe.</p>
<p>But I am digressing. Back to Volokh's federal court roundups. I have
to be careful not to look at these roundups when I have anything else
that must be done, because I inevitably get nerdsniped and read
several of them. If you enjoy this kind of thing, this is the kind of
thing you will enjoy.</p>
<p>I want to give some examples, but can't decide which sound most
interesting, so here are three chosen at random from <a href="https://www.washingtonpost.com/news/volokh-conspiracy/wp/2017/11/20/short-circuit-a-roundup-of-recent-federal-court-decisions-81/">the most recent
issue</a>:</p>
<blockquote>
<ul>
<li><p>Warden at Brooklyn, N.Y., prison declines prisoner’s request to keep
stuffed animals. A substantial burden on the prisoner’s sincere
religious beliefs?</p></li>
<li><p>Online reviewer pillories Newport Beach accountant. Must Yelp reveal
the reviewer’s identity?</p></li>
<li><p>With no crosswalks nearby, man jaywalks across five-lane avenue, is
struck by vehicle. Is the church he was trying to reach negligent
for putting its auxiliary parking lot there?</p></li>
</ul>
</blockquote>
<p><a href="https://www.washingtonpost.com/news/volokh-conspiracy/wp/2017/11/20/short-circuit-a-roundup-of-recent-federal-court-decisions-81/">Check it out</a>.</p>
<p>[ Addendum 20171213: <a href="http://reason.com/volokh">Volokh</a> has just left
the Washington Post, and moved to Reason, <a href="http://reason.com/volokh/2017/12/13/weve-moved-to-reason">citing changes in the
Post's paywall
policies</a>. ]</p>Sat, 09 Dec 2017 20:05:00 +0000mjd@plover.com (Mark Dominus)Edward Z. Yang: Systems ML workshop panelhttp://blog.ezyang.com/?p=10030
http://feedproxy.google.com/~r/ezyang/~3/gZlXw86gXC0/
<div class="document">
<ul class="simple">
<li>JG: Joseph Gonzalez</li>
<li>GG: Garth Gibson (CMU)</li>
<li>DS: Dawn Song (UC Berkeley)</li>
<li>JL: John Langford (Microsoft NY)</li>
<li>YJ: Yangqing Jia (Facebook)</li>
<li>SB: Sarah Bird</li>
<li>M: Moderator</li>
<li>A: Audience</li>
</ul>
<p>M: This workshop is bringing together ML and systems. Can you put your place on that spectrum? Who is your home community?</p>
<p>YJ: Right in the middle. I'd like to move more towards systems side, but Berkeley Parallel Labs kicked me out. ML is my home base.</p>
<p>JL: ML is where I come from, and where I will be, but I'm interested in systems. My home is NIPS and ICML</p>
<p>DS: My area is AI and security, did computer security in the past, now moving into AI.</p>
<p>GG: Systems.</p>
<p>JG: I started out in ML, working on probabilistic methods. I basically, in middle of PhD, looked at systems. Now I'm moving to being a systems person that does ML.</p>
<p>M: We've seen a proliferation of deep learning / ML frameworks that require a lot of dev effort, money, time to put in. Q, what is the role of academia of doing research in this area. What kind of large scale ML learning can you do.</p>
<p>GG: I liked YJ's answer last time.</p>
<p>YJ: The thing that is astonishing is that academia is the source of so many innovations. With all due respect, we did very good work in Google, but then Alex came out with 2 GPUs and nuked the field. Academia is the amazing place where we find all of the new ideas, and industry scale it out.</p>
<p>JL: Some examples. If you're coming from academia, maybe you don't have research at a big company, but it's an advantage as you will spend time thinking about the right algorithm for solving it efficiently. And that's what will win in the long run. Short term, they'll brute force with AutoML. Long run, the learning algorithms are going to be designed where they won't have parameters. A common ML paper is "we eliminate this hyperparameter". When they're more automatic, more efficient, great things will happen. There's an advantage in being resource constrained, as you will solve things in the right way.</p>
<p>Another example is, the study of machine learning tells us that in the future we will regard any model that you just learned and deployed as inherently broken and buggy, as data collection is not part of the process of training and deploying. It will decay and become irrelevant. The overall paradigm of ML where you're interacting with the world, and learning, that can be studied easily in academia, and that has huge implications about how you're going to design systems.</p>
<p>DS: People often talk about in a startup, the best thing is to not raise a ton of money; if you're resource constrained you're more focused and creative. ML is really broad, there's lots of problems. Right now we learn from lots of data, but lots of talks at NIPS, humans have amazing ability to learn from very few example. These are problems for academia to tackle, given unique resource constraints.</p>
<p>GG: I'll say, it's difficult to concentrate on top accuracy if you don't have enough data, and the data available to students is stuff like DAWNbench which tends to lag. In academia, we build relationships with industry, send students for internships, they get the ability to do big data, while exploring first principles in university. It's a challenge, but open publishing and open sharing of code make the world more bearable.</p>
<p>JG: The one thing I've struggled with is focusing on human resources. I have grad students; good students focused on a key problem can make a lot of progress. We struggle with a lot of data. The struggle with RL really is here; we can build simulators to operate at this scale. Being able to use simulation to get data; be creative, find new and interesting problems.</p>
<p>M: A follow-up on process. I think a lot of you have tried to publish ML in your communities. Are they equipped to appreciate the work properly? What is a common reason they don't appreciate it?</p>
<p>JG: Publishing ML in systems, or vice versa, is hard. It goes both ways. These communities are not equipped to evaluate work in the other field. ML in systems work, like what you saw here, would be surprising there. Or vice versa: it wouldn't have done well in a systems venue as systems work. The failure mode I see is that the systems community doesn't appreciate extreme complexity: in ML, I have this very sophisticated thing, and the value is in reducing it to its essential components. ML tries to overextend its complexity as an innovation. More broadly, each of these communities has its own biases in how it looks at research. One thing I've noticed is that it's gotten better: systems is better at evaluating, and at this workshop, people are pushing research in an advanced way.</p>
<p>GG: I'm old, so I've seen the creation of conferences before. You start off with an overlap of areas. In my prior life, it was the notion of storage as a research area, rather than the application of devices. You start off, you send a submission in. The PC has two people who know anything about it, and they aren't assigned to it, and the reviews are sloppy; you get one conference that does a little better, but the other conferences don't read its stuff. I faced this with the fault tolerance, database, and OS communities; they don't read each other's stuff. You get enough mass, and you get a conference that focuses on the middle, with reviewing and a PC that have seen most of the good work in the area. That's hard, but we're on the edge of doing it in SysML. We're doing the right things to be competitive, on top of the state of the art.</p>
<p>M: Is that the only solution, or can we mix up PCs?</p>
<p>GG: I've seen a lot of experiments to try it. You can end up with permanently fractured communities.</p>
<p>JL: Joey and Dawn are area chairs at ICML. I have found the ML community to be friendly to systems-type things. There's an area chair for systems. Hopefully papers get assigned appropriately.</p>
<p>M: We're not good about that at systems.</p>
<p>DS: With ML and security, we have this problem. In security, we also have a very small percentage of ML, and on the committee, if you submit ML, it's very hard to find people who can review the paper; as a consequence, the review quality varies highly. It's similar in terms of security in ML: similar problems. It's interesting to think about why this happens and how to solve the problem. In general, sometimes the most interesting work is in the interdisciplinary areas: ML and systems, security, and the examples I see, including machine learning in systems. One thing I actually can understand is, within each community, even though the review quality varies, I can see from the committee's perspective that what they really want is papers that are more meaningful to the community, that help people get exposed to the new area, fostering new exploration. That's part of a natural progression. As time goes on, there's more cross-pollination.</p>
<p>JG: We are launching a SysML conference. I had a little bit of reservation: ML is getting better at evaluating systems work, but now I have to decide where I'm going to send a paper. A lot of the papers we see in ML are going to have systems components.</p>
<p>GG: When you have a new conference area, not all work is sent there. With overlapping areas, you have a favorite conference, your heroes, and you'll send your most exciting work to that root conference. No problem.</p>
<p>YJ: SysML is great, and this is how it comes out. New fields, it warrants new conferences.</p>
<p>M: Do you think an ML expert needs to also be a systems expert? Does a person who lies at that intersection have a different way of looking at things? Or do you come up with a nice algorithm, and...</p>
<p>JL: It's not OK to have a wall.</p>
<p>There are many ways learning algorithms can be changed. The problem with having a wall is that if you don't understand the other side, you just throw things over to an engineer. But if you can bridge it and understand, they're not opaque artifacts; you can break them open and modify them. That can let you achieve much better solutions.</p>
<p>GG: Agreed, but what happens initially is that you reach over to the other side, you put it into a system, and it's "my innovation that redundancy gives fault tolerance," even though it's fairly pedestrian from the other side. If it is a substantial improvement, it is worth doing. We all grow up.</p>
<p>JG: We need a wall, but we're going to constantly tear it down. Matlab in grad school: we made jokes about it, and the MKL community would make it fast. Then they said, we are going to build ML for distributed computing, and ML people would write their algorithms on top of the system. That waned with the development of PyTorch, TF, etc., which leveled up the abstraction. The stack is building up again, with the systems community making it more efficient. Well, floating point could change, and that could affect the algorithm. So we're tearing it down again. But systems is about designing the wall.</p>
<p>YJ: It's more like a bar stool. It's a barrier, but you don't have to be both to do anything; you need it to make things efficient. A story: in a training system we looked at, with SGD, a person found a very nicely rounded number: 100. But people frown; you should round to 128. Understanding and improving the common core of CS and engineering helps a lot for people to have a good sense of how to design ML algorithms.</p>
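<p>The 100-versus-128 story is a hardware-alignment point: vector lanes, warps and memory transactions move data in power-of-two chunks, so dimensions that are multiples of the lane width keep every lane busy. A minimal sketch of the usual fix (the helper name is mine, not from the panel):</p>

```python
def round_up(n, base=128):
    """Round n up to the next multiple of base (e.g., a vector lane width)."""
    return ((n + base - 1) // base) * base

# A dimension of 100 leaves part of a 128-wide unit idle on every pass;
# padding to 128 wastes a little memory but fills all the lanes.
assert round_up(100) == 128
assert round_up(128) == 128
```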
<p>M: There's a lot of talk about democratizing AI, and all of you have helped that process. What does a truly democratic AI landscape look like, and how far are we from that world?</p>
<p>YJ: I plead guilty to participating in the framework wars. When reading CS history, one thing is pretty natural: when a field is starting, there are all sorts of standards and protocols. FTP, Gopher; and in the end HTTP took over, and everything runs on HTTP. Right now, there are all kinds of different abstractions; boiling it down, everyone is doing computation graphs and optimization. I look forward to when we have one really nice graph representation and protocol for optimizing graphs. It's not just a rosy dream, because in compilers we have that solution: LLVM. I don't know if we'll reach that state, but I think one day we'll get there.</p>
<p>JL: You have AI/ML democratized when anyone can use it. What does that mean? A programmer has a library, or language constructs, which they use routinely and easily, with no issues of data getting mismatched or confused or biased. All the bugs people worry about in data science are removed from the system because the system is designed right and is easy to use. The level beyond that is when somebody is using a system, that system is learning to adapt to them. There's huge room for improvement in how people interact. I don't know how often there's a rewrite rule driving me crazy; why can't it rewrite the way I want? People can signal information to a learning algorithm, and when those signals can be used effectively to assist people, you have democratized AI.</p>
<p>DS: I have a very different view of democratizing AI. I think it's interesting to think about what democratization here really means. For systems people, it's about making it easier for people to do learning, to use these libraries and platforms. But that's really just providing them with tools. When I give talks on democratizing AI, we are looking at it from a completely different perspective. Whoever controls AI will control the world. So who controls AI? Even if you give everyone the tools, push a button, they don't have the data to do the training. So who controls AI today, and tomorrow? It's Facebook, Microsoft, Google... so for me, democratization means something totally different. Today, they collect data, train models, and they control who has access to the models; users can get recommendations, but not direct access to the models. We have a project to actually democratize AI, where users can control their data. It combines blockchain and AI: users can donate their data to a smart contract, where the smart contract specifies the terms; e.g., if you train a model, the user can use the model, and if the model produces profits, the user can get part of the profits. The smart contract can specify various incentive terms; e.g., if the data is better than others', they can get more profits, and other mechanisms. A developer will supply the ML training algorithm, and get benefits when it is trained well. We are decentralizing the power of AI; users will be able to get direct access to models and use them. In this case, I hope for an alternate future, where big companies can continue with their business, but users, by pooling their data in a decentralized fashion, will see actual true democratization of AI; they will access the power of AI, not just use tools.</p>
<p>(applause)</p>
<p>GG: I think that a lot of what's meant by democratizing AI is how you move from a small number of people innovating to a large number. Tool development and standards. We're close to being there. There was an example of this in the past: VLSI paint boxes. Up until a certain point, only an EE could really develop hardware at all. It took a lot of effort and time to make sure a design could make it through every part without very much crosstalk. A group came together and thought, well, there are some design rules. This lets you build hardware pretty easily. I could paint green/red boxes, get hardware months later, and it worked. It never worked as fast as that EE guy's, so there would always be a place for him, but it let us build a RISC computer and ship it. We were in the game, we could innovate, and do it. The tools we're trying to build right now can do the same on a statistical foundation.</p>
<p>JG: When I started my PhD, we did integrals and derivatives by hand. Automatic differentiation was a huge step forward. I blame it for the explosion of papers. A first-year can build something far more complex than what I could do. That's moving AI forward, on the algorithms side.</p>
<p>The data side is interesting, and that is one I think about in systems. There are a lot of opportunities to think about how security interacts, leveraging hardware to protect data, markets to buy and sell data from sources, and protecting the data across a lot of places. I would argue we're making a substantial amount of progress in how we think about algorithms.</p>
<p>M: When I think about democratizing pervasive AI, there are questions that have recently been consuming our minds: interpretability, fairness, etc. Can you share any experience where something like interpretability came up and became a problem or issue we have to worry about a lot more in ML, or systems-ML?</p>
<p>JG: My grad students come to me and say the models stopped working. I don't know how to fix that; the process is very experimental. Tracking experiments is a big part of the process. We used to care a lot about interpretable models, and that meant something very particular. Now it's explainable: we don't need to know exactly what it did, but there needs to be some connection to what it did. Interpretable means explaining the computation, which could be related or unrelated to the decision. Those are two answers, about explainability and about how we debug these systems.</p>
<p>GG: SOSP just happened, and they have ten years of good copies of everything that was submitted. At the end of the conference, Peter Chen took all the PDF files, built a naive Bayes classifier, and saw how well he could predict whether a paper would be accepted. And half the things it predicted to be accepted were accepted.</p>
<p>So what did it do? It made a detector for popular authors. And so what it did was follow behind those who had already succeeded. I recognize this problem. You might think that you found a good way, but it's actually that it's Nickolai Zeldovich's paper.</p>
<p>DS: There's a big debate. Some think it's really important; others say that as long as the model works, it's fine. With our brains, we can't really explain how we arrive at certain decisions, but they work fine. And it depends on the application. Some applications have stronger requirements for explainability, e.g., law and healthcare, whereas in others it's less required. Also, as a whole community, there's a lot we don't understand. We can talk about causality and transparency; it's all related. As a whole community, we don't really understand what explainability means. There's not a good definition. All these concepts are related, and we're trying to figure out what the real core is. That's a really good open question.</p>
<p>JL: There are two different interpretations. Can you explain it to a person? That's limited; there are no explainable vision models. The other definition is debuggability. If you want to create complex systems, they need to be debuggable. This is nontrivial with a distributed system, and it's nontrivial with ML. If you want to create nontrivial ML systems, you have to figure out why they're not behaving the way you want them to.</p>
<p>DS: Do we debug our brains?</p>
<p>JL: Evolution has done this the hard way for a very long time... a lot of people have bugs in their brains. I know I have bugs. I get an ocular migraine sometimes... very annoying. No, we don't debug our brains, and it's a problem.</p>
<p>YJ: I'm sure there are bugs in my brain; I chased chickens at my grandma's house, and a chicken has one spot on its back that, if you press it, it just ducks and sits there. It shuts off because of fear. We humans don't do that, but bugs like these are in our brains as well. Chasing interpretability helps us understand how things work. In the old days, Deep Dream: this line of work started with figuring out what the gradients do, and we propagated back, and we found that the direct gradient doesn't work; then we added L1 priors, and then we got pictures. This curiosity led to the realization that convnets with random weights already codify local correlation; we are hardcoding structural information into CNNs, which we didn't know before. So maybe we will not achieve full interpretability, but some amount of interpretability and creativity will help.</p>
<p>(audience questions)</p>
<p>A: I'd really like to hear more about what Jeff said about ML for systems. As a systems person, I'm interested in it, but people have said you can get far with heuristics.</p>
<p>JL: I think it's exciting.</p>
<p>GG: The learned database indexes: when I read that paper for reviewing, I went, "Wow! Is that possible?" I think things like that will change the way we do systems. The novelty of the application opens a lot of people's minds. Right now we think of machine learning tools as expensive things that replicate what humans do easily but computers don't do well. But that's not what a DB index is. We can execute it; we're just not better. But to get it to half the size and twice the speed by throwing in another way of thinking about compression, through a predictor, is a fabulous insight.</p>
<p>JG: I tried to publish in this area for a while. For a while, systems people didn't like the idea of complex algorithms in the middle of their systems. These days, systems is like, "ML is cool." But where it's easier to have success is where a good prediction improves the system, but a bad prediction doesn't break the system. So scheduling: that's good. Places where models can boost performance but not hurt it. That's where the work using ML to solve systems problems is successful.</p>
<p>DS: ML for systems is super exciting. I'm personally very excited about this domain, esp. for people who have done systems work and are interested in AI. ML for systems is an amazing domain for ML. I wouldn't be surprised, I would hope to see, in five years, that our systems are more ML driven. A lot of systems have a lot of knobs to tune, set by trial and error, which is exactly where ML can help. With these amazing techniques, RL, bandits: instead of using bandits to serve ads, we can try to autotune systems. Just as we are seeing AI transforming a lot of application domains, old systems, the ones we built, should be more intelligent. It's a prediction: I think we are going to see a lot of work in this domain. I think it will transform systems.</p>
<p>M: I work in this quite a bit. We have some successes with bandits in some settings, but there are settings that are really tough: stateful ones, where choices and decisions influence the future, which makes it hard to apply RL, or the RL techniques take a lot of data. There are challenges, but there are successes. There are a lot of papers that apply RL to caching and resource allocation. The real question is why it's not used in production. I don't know if we have an answer to that; papers do it, and it seems to be really good, but it's not mainstream, esp. having RL all over the place. Why isn't it pervasive? That I don't see.</p>
<p>A: Isn't it because it's not verifiable? You want some kind of verification analysis.</p>
<p>GG: It's called a regression sweep. If you deploy on a lot of systems, there's a lot of money; it has to work. If it falls over, that's a lawsuit. I hired a VP of software: OK, now that I'm in charge, things are going to slow down. Every line of code is bugs; if I want low bugs, I stop programmers from writing code by making the bar very high. This is the thing Joey was talking about: they need a really compelling reason with no downsides, and then it has to pass tests before it ships. So anything stochastic has a high bar.</p>
<p>SB: Another thing that is happening: there aren't that many people who have an understanding of both areas. It's really hard to do ML in systems without deep expertise in systems. You really need to understand the system to explain it.</p>
<p>GG: It wasn't that long ago that we didn't have hosted services.</p>
<p>M: Guardrails: you constrain the ML system so it can't suggest something bad. We have a scenario in MS where machines are unresponsive. How long do you wait? You can do it with ML. The choices are all reasonable; they're never more than the max you'd want to wait.</p>
<p>A: On democratization. There's been a lot of talk about optimizing models so people can bear the cost. Another is decentralizing data... but there are two very big constraints for systems and models: they cost a lot of money, and there's big variance. Because of the cost, if some guy gets into programming and does research, he won't have the resources to do it. So they won't go into engineering; they'll intern at Amazon instead. So if there is some community working on lowering the barrier, democratizing, what solution is there to bring people in much more easily? Because there are huge economic costs. People are trying to make huge amounts of money, startups, but there's no... systems have faults with decentralization... there's just a big problem colliding with ML.</p>
<p>JG: I teach data science at Berkeley. The summary is: what about the costs of getting into DL? There are costs to train models, GPUs, data; how does a freshman in college who is excited about this, with a Chromebook, do research and explore opportunities? At Berkeley we have exactly this problem. I teach 200 students, and a lot of them are freshmen with a Chromebook or iPad as their primary computer. We've built tools using Azure: we run a cloud in Azure, and on these devices they can experiment with models. They get to use pretrained models and appreciate how to... Someone built a Russian Twitter-bot detector and saw value and opportunity in these tools. And then they got involved in research projects where they had more funds and tools.</p>
<p>JL: The right interfaces make a huge difference, because they prevent you from having bugs that keep you from doing things. Also, DL is all the rage, but framing the problem is more important than the representation you use. If you have the right problem and a dumb representation, you'll still do something interesting; otherwise, it's just not going to work very well at all.</p>
<p>YJ: As for industry, don't be afraid of industry; try it out. Back at Berkeley, when Berkeley AI was using GPUs, the requirement was that you have one project per GPU. We students framed ten different projects, and we just asked for ten GPUs. NVIDIA came to us and asked, what are you doing? We'll just give you 40 GPUs; do research on those. Nowadays, FAIR has a residency, and Google AI has a residency; all of these things are creating very nice collaborations between industry and academia, and I want to encourage people to try them out. Industry has funds, academia has talent; marrying those together is an everlasting theme.</p>
<p>A: Going back to where we go from here in terms of conferences, the future of this workshop: has any decision been made about where we go?</p>
<p>SB: This is work in progress. We're interested in feedback and what you think. We've had this workshop evolving for 10 years, with NIPS and ICML. Then we did one with SOSP, which was exciting. We are now doing a separate conference at Stanford in February. We think there's really an important role to play for workshops colocated with NIPS and ICML, so we're still planning to continue this series of workshops. There's also a growing amount of systems work at ICML and NIPS, so it's a natural expansion to accept that work. The field is growing, and we're going to try several venues and form a community. If people have ideas...</p>
<p>JG: More people should get involved.</p>
<p>M: We plan to continue this; audience is great, participation is great.</p>
<p>It's a panel, so I have to ask you to predict the future. Tell me something you're really excited about... 50-100 years from now. If you're alive then, I will find you and see if your prediction panned out. Or say what you hope will happen...</p>
<p>YJ: Today we write in Python. Hopefully, we'll write every ML model in one line. Classifier, get a cat.</p>
<p>JL: Right now, people are in a phase where they're getting more and more knobs in learning. ML is all about having fewer knobs. I believe in the ML vision of fewer knobs. I also believe in democratizing AI. You are constantly turning ... around you, and devs can incorporate learning algorithms into systems. It will be part of tech. It's part of a hype cycle. NIPS went through a phase transition. At some point it's got to go down. When it becomes routine, we're democratizing things.</p>
<p>DS: It's hard to give predictions... Right now, in ML, we see the waves. Not so long ago, there was a wave of NNs, then graphical models, and now we're back to NNs. I hope there's a plateauing. Even this year, I have been talking to a lot of great ML researchers, and even though one can say more papers have been written this year, when you hear what people talk about in terms of milestones, many people mention milestones from past years: AlexNet, ResNet... I do hope that we will see new innovation beyond deep learning. I do teach a DL class, but I hope that we see something beyond DL that can bring us... we need something more, to bring us to the next level.</p>
<p>GG: I'm tempted to point out that DL is five years old, and the dotcom era was not more than five years... I'm looking forward to a change in the way CS, and science in general, does business, having learned from statistical AI. My favorite example is overfitting. I understood overfitting poorly, in vague stories, until ML hammered home what it meant. I look forward to the time when students tell me they stopped writing code because they were just adding parameters... and they added a decent random, i.i.d. process for testing code. We're nowhere near there, but I think it's coming.</p>
<p>JG: I'm looking forward to the return of graphical models... actually, not. We're democratizing AI, but what ultimately happens is we're democratizing technology. I can walk up to Alexa and teach it. Or I can teach my Tesla how to park more appropriately. Technology that can adapt to us because it can learn; when I can explain to a computer what I want. (Star Trek, but without the transporter.)</p>
</div>
Sat, 09 Dec 2017 02:17:08 +0000Edward Z. Yang: Accelerating Persistent Neural Networks at Datacenter Scale (Daniel Lo)http://blog.ezyang.com/?p=10027
http://feedproxy.google.com/~r/ezyang/~3/5P6XztH89rQ/
<div class="document">
<p>The below is a transcript of a talk by <a href="https://www.microsoft.com/en-us/research/people/dlo/" class="reference external">Daniel Lo</a> on <a href="https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/" class="reference external">BrainWave</a>, at the <a href="https://nips.cc/Conferences/2017/Schedule?showEvent=8774" class="reference external">ML Systems Workshop</a> at NIPS'17.</p>
<hr class="docutils" />
<p>Deploying and serving accelerated DNNs at cloud scale. As we've seen, DNNs have enabled amazing applications. Architectures achieve state-of-the-art results on computer vision, language translation and speech recognition. But they are challenging to serve in large-scale interactive services, because there are latency, cost and power constraints. Also, DNNs are growing larger in size and complexity.</p>
<p>We've seen a Cambrian explosion of startups to solve this problem. Research groups have produced DNN processing units, DPUs: custom hardware solutions to provide high-throughput, efficient serving of DNNs. We categorize them into two classes: hard DPUs, where the algorithms and applications have to be fixed at design time, because they're fabbing an ASIC, and soft DPUs, on FPGAs. But we haven't seen soft DPUs deployed at scale.</p>
<p>To address this, we've been working on Project BrainWave: a solution to deploy large-scale DNNs with FPGA acceleration. We've designed it to be fast, flexible and friendly. High-throughput, low-latency acceleration using FPGAs. Flexibility with adaptive numerical precision, and updates to the latest AI algorithms with reconfigurable FPGAs. And it's user friendly, because we have a full-stack solution: take CNTK/Caffe/TF models and compile them down. This is deployed on our configurable cloud: an outer layer of CPUs, a datacenter network that puts everything together, and a layer of reconfigurable FPGAs.</p>
<p>We've deployed DNN models: an LSTM model that takes tens to hundreds of milliseconds on CPU. What we show is the 99th-percentile latency; even at the 99th percentile we are able to achieve sub-millisecond latencies. When you get to these levels of acceleration, the acceleration time is negligible in the end-to-end pipeline.</p>
<p>Next I'll dive into details. It's a full-stack solution, starting with a compiler and runtime that take models in high-level frameworks and compile them down to our architecture; a flexible ISA for serving DNNs; high-throughput, low-latency serving. We do this all with persistency at scale, keeping models pinned in FPGA memories. It's deployed on our wide deployment of Intel FPGAs using hardware microservices.</p>
<p>To begin with, let's talk about hardware microservices. This is something we presented at MICRO. In the architecture of the reconfigurable cloud, FPGAs sit between the CPU and the network. A CPU can use its FPGA locally for acceleration, but because FPGAs are connected over the network, work can be distributed between them. We have a proprietary network protocol for low-latency compute.</p>
<p>We've disaggregated the FPGA compute plane from the CPUs. So we can aggregate FPGAs together to form larger accelerators, and you don't have to match the ratio of FPGAs to CPUs: you can serve a large number of CPUs with a small cluster of FPGAs, or vice versa.</p>
<p>Next I'll talk about the compiler and runtime. The goal is to make this very easy for ML specialists; the typical ML specialist doesn't know how to program an FPGA. Models are developed in high-level frameworks, and we compile them down to our architecture. We compile them first into an intermediate graph-based representation, then split them into portions that run on FPGAs and portions that run on the CPU. When we execute, we also have a runtime that handles orchestration and scheduling between the parts.</p>
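<p>The BrainWave compiler itself isn't public, so purely as a hypothetical illustration, the device split over the intermediate graph might amount to classifying each node by whether its op is supported on the accelerator (batchnorm falling back to the CPU comes up in the Q&amp;A at the end of the talk):</p>

```python
# Ops assumed FPGA-accelerated in this sketch; the real supported set
# is not public. Everything else (e.g., batchnorm) stays on the CPU.
FPGA_OPS = {"matmul", "conv", "relu", "sigmoid", "tanh"}

def partition(graph):
    """graph: list of (node_name, op_type) pairs in topological order."""
    fpga, cpu = [], []
    for node, op in graph:
        (fpga if op in FPGA_OPS else cpu).append(node)
    return fpga, cpu

fpga, cpu = partition([("fc1", "matmul"), ("bn1", "batchnorm"), ("act1", "relu")])
# fpga == ["fc1", "act1"]; cpu == ["bn1"]
```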
<p>There are two main categories of DNNs we have to optimize for. DNNs with a very high compute-to-data ratio, convnets, are well studied. I'm going to focus on the other class of DNNs, those with a lower compute-to-data ratio, e.g., dense layers and RNNs.</p>
<p>In the conventional approach to accelerating DNNs on FPGAs, you keep all model parameters in DRAM. When a request comes in, you stream the model parameters out of DRAM and return a response. The issue is that when you have DNN layers that are memory-bandwidth bound, how fast you can run is limited by memory bandwidth; you're not getting the full compute capability of the FPGA. Typically the way to solve this is with batching: you collect a number of requests and reuse the model parameters across all of them. While you may achieve good throughput, latency will increase. For realtime services, this violates your SLA. What we want is to provide high performance at low or no batching.</p>
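<p>A back-of-envelope model of that tradeoff (all numbers here are illustrative assumptions, not figures from the talk): for a memory-bandwidth-bound layer, one streaming pass over the weights serves the whole batch, so throughput grows with batch size, but each request also waits for the batch to fill.</p>

```python
WEIGHT_BYTES = 200e6   # assumed parameter size: 200 MB
DRAM_BW = 20e9         # assumed effective DRAM bandwidth: 20 GB/s
ARRIVALS = 1000.0      # assumed request arrival rate: 1000 req/s

def latency_s(batch):
    # Wait to collect the batch, then one streaming pass over the weights.
    return (batch - 1) / ARRIVALS + WEIGHT_BYTES / DRAM_BW

def throughput_rps(batch):
    # Requests served per weight pass; compute time is ignored.
    return batch / (WEIGHT_BYTES / DRAM_BW)

# Batching raises throughput, but every request pays the queueing delay:
# in this toy model, latency_s(1) is 10 ms while latency_s(32) is 41 ms.
```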
<p>The way we do this is with persisted DNNs. FPGAs have lots of memory on chip: 10MB. Since it's on chip, it's high bandwidth. So we keep the model parameters on the chip, and when a request comes in, we distribute it across the entire FPGA chip.</p>
<p>The obvious question is, what happens if your model doesn't fit on chip? We take advantage of the hardware microservices: we distribute a single model over multiple FPGAs in the datacenter.</p>
<p>Let's look at the architecture and microarchitecture of the processing unit we developed. The BrainWave DPU is a software-programmable processor, programmed in single-threaded C, but we've added a number of instructions for serving DNNs, e.g., matrix multiply, convolution, nonlinear activations, embeddings. The processor is designed to use narrow precision formats (float16) and is easily extensible to newer algorithms.</p>
<p>In the microarchitecture of the processor, the main portion is dedicated to the matrix-vector unit: matrix-vector multiply, consisting of a number of kernels operating on tiles of a larger matrix. Tiling gives us flexibility while maintaining performance. The other compute units are multifunction units: vector-vector operations, such as element-wise multiply, add and activation functions. Tying it all together is an on-chip network that lets us keep all the compute units busy at the same time.</p>
<p>Most of the chip is dedicated to the matrix-vector unit. It's composed of hundreds of multilane dot-product units. Each of these dot-product units consists of tens of adders and multipliers. To keep them fed with data, each dot-product unit is fed by a set of dedicated block RAMs.</p>
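<p>The tiling idea above can be sketched in software (a NumPy analogy of the dataflow, not the hardware design): the matrix is cut into tiles, each tile-times-subvector product corresponds to what a bank of dot-product units computes, and the partial results accumulate into the matching output segment.</p>

```python
import numpy as np

def tiled_matvec(A, x, tile=4):
    """Matrix-vector multiply computed tile by tile."""
    m, n = A.shape
    y = np.zeros(m)
    for i in range(0, m, tile):          # row tiles -> output segments
        for j in range(0, n, tile):      # column tiles -> partial dot products
            y[i:i + tile] += A[i:i + tile, j:j + tile] @ x[j:j + tile]
    return y

A = np.arange(36.0).reshape(6, 6)
x = np.ones(6)
assert np.allclose(tiled_matvec(A, x), A @ x)
```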
<p>Next, I'd like to show performance results for this architecture. Two years ago, we had a deployment of Stratix V FPGAs. This shows the effective teraflops of each format. 16-bit integer... we've been playing with our own format, Microsoft Floating Point. 4.5 Tflops at MSFP5.8. These Stratix parts are pretty old.</p>
<p>(Demo for latest generation of FPGAs)</p>
<p>Looking at a throughput-oriented DPU, the latency is 65.81ms. With BrainWave, the latency is 0.98ms: under 1 millisecond.</p>
<p>This was done on initial engineering silicon. For production silicon, we're expecting to get 12 TOPS at 16-bit integer, and 90 TOPS at MSFP8. One question is how the numeric format affects accuracy. Here is the normalized accuracy for three in-house text models, using GRUs and LSTMs. The orange bar shows what happens when you go to MSFP9; but we've developed a way to fine-tune networks for this precision, and you see we recover our accuracy. We're working with MSFP8 and see similar results.</p>
<p>Project BrainWave is our project for accelerating DNNs at cloud scale. We hope it will be fast, friendly and cloud-scale, and will expand the capabilities of AI in the cloud, providing a way to run higher-dimensional RNN networks for NLP and other great applications. We're planning to release it to third parties; stay tuned.</p>
<p>Q: When you decrease batch size, what happens to hardware utilization?</p>
<p>A: We stay highly utilized even as we decrease batch size; even at high batch size, we're still sending requests one by one. (Only one step will be processed?) Right.</p>
<p>Q: Regarding the FP9 and FP8, nine and eight being the number of bits used? (Yes) Is it in any way related to Flexpoint at Intel?</p>
<p>A: We developed this independently of flexpoint, and I'm not able to talk about our numeric format.</p>
<p>Q: In MS, do you really write Verilog for your FPGA, or do you use high level synthesis tool?</p>
<p>A: For this, we are writing SystemVerilog.</p>
<p>Q: Batchnorm layers, which require batch computation; how do you put that onto the FPGA?</p>
<p>A: Part of the work of the compiler is to do splitting between CPU and FPGA. So things that are not amenable to FPGA, including batchnorm, we're still running them on CPU.</p>
</div>
<img src="http://feeds.feedburner.com/~r/ezyang/~4/5P6XztH89rQ" alt="" height="1" width="1" />Fri, 08 Dec 2017 20:08:44 +0000Edward Z. Yang: MOCHA: Federated Multi-Tasks Learning (Virginia Smith)http://blog.ezyang.com/?p=10024
http://feedproxy.google.com/~r/ezyang/~3/Ce01jTEf_80/
<div class="document">
<p>The below is a transcript of a talk by <a href="https://people.eecs.berkeley.edu/~vsmith/" class="reference external">Virginia Smith</a> on <a href="https://arxiv.org/abs/1705.10467" class="reference external">MOCHA</a>, at the <a href="https://nips.cc/Conferences/2017/Schedule?showEvent=8774" class="reference external">ML Systems Workshop</a> at NIPS'17.</p>
<hr class="docutils" />
<p>The motivation for this work comes from the way we think about solving ML problems in practice, which is changing. The typical ML workflow looks like this. You start with a dataset and a problem to solve. Say you want to build a classifier to identify high-quality news articles. The next step is to select an ML model to solve the problem. Under the hood, to fit the model to your data, you have to select an optimization algorithm. The goal is to find an optimal model that minimizes some function over your data.</p>
<p>In practice, there's a very important part of the workflow that is missing. For new and interesting datasets and systems, the properties of the system play a large role in the optimization algorithm we select to fit the model. To give an example: in the past several years we have seen data so large that it must be distributed over multiple machines, in a datacenter environment. I've been thinking about how to perform fast distributed optimization in this setting, when data is that large.</p>
<p>But more and more frequently, data is not coming nicely packaged in a datacenter. It's coming from mobile phones and devices distributed across the country and the globe. Training ML in this setting is challenging. For one, whereas in a datacenter you have hundreds to thousands of machines, here you have millions to billions of devices. Also, in a datacenter, devices have similar capabilities; here, you have phones that are old, low on battery, or not connected to wifi. This can change their ability to perform computation at any given iteration.</p>
<p>Additionally, there's heterogeneity in the data itself. For privacy and computation reasons, data can become very unbalanced in the network. And it can be non-IID, so much so that there can be interesting underlying structure to the data at hand. I'm excited because these challenges break down into both systems and statistical challenges. The one-second summary of this work: we think about both the systems and the statistical side in this federated setting, and the punchline is that the systems setting plays a role not only in the optimization algorithm but also in the model we select to fit. It plays a more important role in this overall workflow.</p>
<p>I'm going to go through how we holistically tackle systems and statistical challenges.</p>
<p>Starting with statistical. The setting: we have a bunch of devices generating data, possibly unbalanced; some devices have more data than others. One approach used in the past is to fit a single model across all of this data. All of the data is aggregated, and you find one model that best achieves accuracy across all of the data simultaneously. The other extreme is to fit a separate model for each device, and not share information. From a systems point of view this is great, but statistically, you might have devices with so little data that their local models are poor in practice. What we're proposing is something between these two extremes: find local models for each device, but share information in a structured way. This can be captured in a framework called multitask learning.</p>
<p>The goal is to fit a separate loss function for each device. These models are aggregated in a matrix W, and the role of the regularizer is to force some structure Omega on it. Omega is a task-relationship matrix, capturing interesting relationships: e.g., all the tasks are related and you want to learn similar weights; most of the tasks are related and there are a few outliers; there are clusters and groups; or there are more sophisticated relationships like asymmetric relationships. These can all be captured in multitask learning.</p>
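<p>The objective described above, per-device losses plus a structural regularizer coupling the columns of W through Omega, can be sketched as follows. The hinge loss and the trace-form regularizer here are illustrative assumptions for the sketch, not necessarily the paper's exact formulation:</p>

```python
import numpy as np

def mtl_objective(W, Omega, tasks, lam=0.1):
    """Multitask objective: sum of per-task losses plus lam * tr(W Omega W^T).

    W:     (d, m) matrix, one weight column per device/task.
    Omega: (m, m) task-relationship matrix (e.g. identity = independent tasks).
    tasks: list of (X_t, y_t) pairs, one dataset per device.
    """
    loss = 0.0
    for t, (X, y) in enumerate(tasks):
        margins = 1.0 - y * (X @ W[:, t])       # SVM-style hinge margins
        loss += np.maximum(margins, 0.0).sum()  # hinge loss for task t
    reg = lam * np.trace(W @ Omega @ W.T)       # structural coupling across tasks
    return loss + reg
```

<p>Choosing Omega as the identity decouples the tasks; denser choices of Omega penalize disagreement between related tasks, which is how information is shared in a structured way.</p>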
<p>We developed a benchmarking set of real federated data. This includes trying to predict human activity from mobile phone sensors, predict whether someone is eating or drinking, land mine detection, and vehicle sensing: distributed sensors determining whether a vehicle is passing by.</p>
<p>For these various datasets, we compared global, local, and MTL approaches. The goal is to fit an SVM model. For each dataset, we looked at the average error across tasks, where each device's model is a task. What you can see is that the average error for MTL is significantly lower than for the global and local approaches. This makes sense because MTL is much more expressive; it lets you move between the two extremes. What's interesting is that in these real datasets, it really helps: error reduced by half. This is a significant improvement in practice.</p>
<p>Given that we'd like to use multitask learning to model data in the federated environment, the next problem is to figure out how to train it in the distributed setting, at massive scale. In particular, the goal is to solve the following optimization objective. In looking at how to solve this objective, we note that it's common to solve for W and Omega in an alternating fashion. Solving for Omega happens centrally; you just need access to the models. But the W update must be distributed, because the data is stored across devices. So the key component of solving this in practice is the W update. The challenge is that communication is extremely expensive, and because of heterogeneity, you may have massive problems with stragglers and fault tolerance, e.g., someone turning their phone off.</p>
<p>The high-level idea: take a communication-efficient method that works well in the datacenter, and modify it to work in the federated setting. It will handle MTL, as well as stragglers and fault tolerance.</p>
<p>What is the method we're using? CoCoA, a state-of-the-art method for empirical risk minimization problems. The nice thing about CoCoA is that it spans the prior work of mini-batch and one-shot communication by making communication a first-class parameter of the method, keeping it as flexible as possible. It does this by solving not the primal formulation but the dual. The dual is nice because we can easily approximate it by forming a quadratic approximation to the objective, which more easily decomposes across machines.</p>
<p>To bring this to the federated setting, a key challenge is figuring out how to generalize it to the MTL framework. A second challenge: in CoCoA, the subproblems are assumed to be solved to some accuracy theta, which varies from 0 to 1, where 0 is an exact solve and 1 is completely inexact. This can be thought of as how much local computation you do versus communication. However, this is not as flexible as it should be in the federated setting: there is only one theta, set for all iterations and all nodes. And because theta cannot be set exactly to 1, it cannot handle fault tolerance, where no work is performed at some iteration. So we need to make this communication parameter much more flexible in practice.</p>
<p>How are we doing this? We developed MOCHA. The goal is to solve the multitask learning framework, for W and Omega, in an alternating fashion. In particular, we're able to form the following dual formulation, similar to CoCoA, so it decomposes. In comparison, we make a much more flexible assumption on the subproblem parameter. This is important because of stragglers: for statistical reasons (unbalance, different distributions), subproblems can differ greatly in how difficult they are to solve. Additionally, there can be stragglers due to systems issues, and issues of fault tolerance. So this looks like a simple fix: we make the accuracy parameter more flexible, allowing it to vary by node and by iteration t, and letting it be exactly 1. The hard part is showing that this still converges to the optimal solution.</p>
<p>Under this new assumption, plus the requirement that the same device cannot be down every single round, we show the following convergence guarantees: for L-Lipschitz losses, we get convergence at a rate of O(1/epsilon); for smooth losses (e.g., logistic regression) we get a linear rate.</p>
<p>How does this perform in practice? The method is quite simple. The assumption is that we have data stored at m different devices. We alternate between solving for Omega centrally and updating W, which is stored across the devices. The W update works by defining local subproblems for the machines and calling a solver that produces an approximate solution. This is flexible because the accuracy can vary by node and iteration.</p>
<p>In terms of comparing this to other methods, what we've seen is the following. We compared MOCHA to CoCoA, and to mini-batch SDCA (Mb-SDCA) and mini-batch SGD (Mb-SGD). We ran a simulation with real data to see what would happen on wifi, plotting simulated time against distance to the optimal solution. What you can see is that MOCHA converges much more quickly, because it doesn't suffer from statistical heterogeneity and isn't bogged down by stragglers. This is true for all of the different types of networks, LTE and 3G included. The blue lines, MOCHA and CoCoA, work well in high-communication settings because they are more flexible; but compared to CoCoA, MOCHA is much more robust to statistical heterogeneity.</p>
<p>What's interesting is that if we impose some systems heterogeneity, where some devices are slower than others (we looked at imposing both low and high systems heterogeneity), MOCHA achieves up to a two-orders-of-magnitude speedup in reaching the optimal solution.</p>
<p>And for MOCHA in particular, we looked at the issue of fault tolerance. What we're showing here: we increase the probability that a device will drop out at any iteration. Going up until half the devices can drop, MOCHA remains fairly robust, converging in almost the same amount of time. But what we see with the green dotted line is that if the same device drops out every iteration, it doesn't converge. This shows the assumption we made makes sense in practice.</p>
<p>The punchline: in this new setting, training ML on these massive networks of devices, the problem is both a statistical and a systems issue, and we've addressed it in a holistic manner. Code at <a href="http://cs.berkeley.edu/~vsmith" class="reference external">http://cs.berkeley.edu/~vsmith</a> I also want to reiterate the SysML conference in February.</p>
<p>Q: When you compare global and local? Why is it always better than global?</p>
<p>A: The motivation for using a local model over a global model is that if you have a lot of local data, the local model may perform better; the global model, on the other hand, boosts the overall sample size. I have some additional experiments where we took the original data and skewed it even further than it already was: with less data locally, the global approaches win. It's just a function of the data on the devices.</p>
<p>Q: I really like how your method has guarantees, but I'm wondering about an approach where you create a metalearning algorithm locally and have it work locally?</p>
<p>A: That's worth looking into empirically, since you can do fine-tuning locally. What we were trying to do first was converge to the exact optimal solution, but an approach that just works well empirically would be good to compare in this setting.</p>
</div>
<img src="http://feeds.feedburner.com/~r/ezyang/~4/Ce01jTEf_80" alt="" height="1" width="1" />Fri, 08 Dec 2017 18:15:32 +0000Edward Z. Yang: A Machine Learning Approach to Database Indexes (Alex Beutel)http://blog.ezyang.com/?p=10022
http://feedproxy.google.com/~r/ezyang/~3/urDpEZ29r7g/
<div class="document">
<p>The below is a transcript of a talk by <a href="http://alexbeutel.com/" class="reference external">Alex Beutel</a> on <a href="https://arxiv.org/abs/1712.01208" class="reference external">machine learning database indexes</a>, at the <a href="https://nips.cc/Conferences/2017/Schedule?showEvent=8774" class="reference external">ML Systems Workshop</a> at NIPS'17.</p>
<hr class="docutils" />
<p>DB researchers think about their research differently. You have a system that needs to work for all cases, whereas in ML, we have a unique circumstance and can build a model that works well for it. In DB, one size has to fit all.</p>
<p>To give an example of this, consider a B-tree. A B-tree handles range queries: we have records with keys, and we want to find all records for a range of keys, say 0-1000; you build a tree on top of a sorted array to quickly look up the starting point of the range. But what if all my keys are the integers from zero to a million? Then it becomes clear you don't need the whole tree above: you can use the key itself as an offset into the array. Your lookup is O(1), with O(1) memory and no need for an extra data structure.</p>
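<p>The dense-key case above admits a trivial index; a minimal sketch:</p>

```python
def make_dense_index(keys):
    """For keys that are exactly min_key..min_key+n-1 (dense, sorted),
    the key itself is the array offset: O(1) lookup, O(1) extra memory,
    no tree needed at all."""
    min_key = keys[0]

    def lookup(key):
        pos = key - min_key
        return pos if 0 <= pos < len(keys) and keys[pos] == key else None

    return lookup
```

<p>This is the extreme the talk describes: the "index" is a single subtraction, because the data distribution is perfectly linear.</p>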
<p>Now, we can't make a custom implementation for each app to take advantage of some pattern. A DB must scale to any application; we don't want to rebuild it every time.</p>
<p>But ML excels in this situation. It works well for a wide variety of distributions, and can learn and exploit them effectively.</p>
<p>This is the key insight we came to. Traditional data structures make no assumptions about your data: they work under any distribution, and generally scale O(n). Interestingly, learning these data distributions can offer a huge win. What we're aiming for is, instead of scaling with the size of the data, to scale with its complexity. With linear data, it's O(1). For other distributions, can we leverage this?</p>
<p>There are three data structures underlying databases. B-trees handle range queries and similarity search; this is the main index. Hash maps handle point lookups of individual records; these are common throughout CS. And Bloom filters are really common for set-inclusion queries: do I have this key? If your records are stored on disk, checking first whether there's a record with that key is worthwhile. We're going to focus entirely on B-trees.</p>
<p>B-trees are tree-like structures with a high branching factor. What makes them really effective is cache efficiency: you can store the top-level nodes in your cache where lookup is fast, others in main memory, and the actual records on disk. By caching the hierarchy appropriately, the structure is efficient. At a high level, a B-tree maps a key to a page, some given place in memory. Once it finds that page, it does some local search to find the particular position of the key. That could be a scan or a binary search; we know the range will be from the start of the page to the page size.</p>
<p>At an abstract level, the B-tree is just a model: it takes a key and estimates a position, and we then search within some error range to find the ultimate record. At first glance this means we can't use just any model: we need err_min and err_max. But we have all the data. At index construction time, you know all the data you'll be executing against, so you can calculate the model's min and max error exactly.</p>
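<p>The "model plus known error bounds" idea can be sketched directly: fit any predictor of position from key, record its worst-case under- and over-prediction over the (fully known) data at build time, then search only within that window. The crude linear predictor here is an illustrative assumption; keys are assumed sorted and distinct:</p>

```python
import bisect

def build_learned_index(keys):
    """Fit pos ~ a*key + b over sorted distinct keys, then record the
    model's min/max error so lookups only search a bounded window."""
    n = len(keys)
    a = (n - 1) / (keys[-1] - keys[0])
    b = -a * keys[0]
    # Error of the model at every known key, computable at build time.
    errors = [int(a * k + b) - i for i, k in enumerate(keys)]
    err_min, err_max = min(errors), max(errors)

    def lookup(key):
        pred = int(a * key + b)
        lo = max(0, pred - err_max)          # true position is guaranteed
        hi = min(n, pred - err_min + 1)      # to lie in [lo, hi)
        i = bisect.bisect_left(keys, key, lo, hi)
        return i if i < n and keys[i] == key else None

    return lookup
```

<p>The tighter the model fits the key distribution (the CDF), the smaller the search window, which is exactly why learning the distribution pays off.</p>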
<p>One interesting observation is that this is just a regression problem. What you're really modeling is the CDF. On this plot, the X axis is your keys, the Y axis is position. This models where your probability mass is located, i.e., where your data lies in the keyspace. CDFs are studied somewhat, but not a ton, in the literature. This is a nice new implication for research.</p>
<p>We thought, OK, let's try this out straight away: train a model, see how fast it is. We looked at 200M server log records with timestamp keys, and used a 2-layer, 32-width NN, relatively small by ML standards, trained to predict position with squared error. A B-tree executes in 300 ns. Unfortunately, the model takes 80,000 ns. By ML model standards, this is fast; if you're executing on a server, great. But it doesn't work for a database.</p>
<p>There are a bunch of problems baked into this. TensorFlow is really designed for large models; think translation or super-resolution images, hefty tasks. We need to make this fast at database-level speeds. Second, B-trees are great at overfitting: there's no risk of over-fitting in this context, since we know all the data. They're also cache-efficient, which is not something ML usually considers. The last thing is the local search at the end: is that really the most effective way of ultimately finding the key? I'm skipping that part because it's fairly detailed; I'll focus on the first three.</p>
<p>The first part is the raw speed of executing the ML model. This was built largely by Tim: the Learning Index Framework (LIF) program. It lets you create different indexes under different configurations. For one thing, it does code compilation for TensorFlow, using ideas from Tupleware, so you can take a linear model and execute it extremely quickly. We can also train simple models directly, or use TensorFlow for more complex gradient-descent-based learning, then extract the weights and codegen the inference graph. And we can do a lot of autotuning to find the best model architecture. Because we know the training set ahead of time, we can make pretty smart decisions about what works best.</p>
<p>The next problem is accuracy and speed. If I have 100M records, a B-tree narrows down quickly, from 100M to 1.5M to 24K, with each step down the tree, and each of those steps costs 50-60 cycles to look through the page and find the right branch. So we have to reach an accuracy of about 12,000 within 500 multiply-adds to beat the levels of the hierarchy that are in cache. This is a steep task. The question is, what is the right model? A really wide network? A single hidden layer? A width of 256 fits reasonably, and scales nicely. We could go deeper, but then the cost grows as width squared, which needs to be parallelized somehow. The challenge is how to scale this effectively: add capacity to the model, making it more and more accurate with increased size, without becoming too slow.</p>
<p>We took a different approach, based on mixture of experts. We take a key, and a really simple model gives an estimate; we then use that estimate to pick the model at the next stage, narrowing down the CDF range and trying to be more accurate on that subset of the space. The next-stage model still gets the key as input (given key, predict position), but over a narrower space of keys. We build this down and walk this hierarchy. This decouples model size from execution cost: we effectively have a huge, overfitting model, but we don't have to execute all of it, avoiding the sparsity you'd have to handle from a pure ML view. The nice thing is we can fall back to B-trees for subsets that are difficult to learn with a model; the LIF framework lets us substitute them in easily. In the worst case, a B-tree; in the best case, something much more efficient.</p>
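<p>The staged approach described above can be sketched as a two-stage recursive model index: a root model routes each key to one of several leaf "experts", each a model over a narrower key range with its own recorded error bounds. Using crude two-point linear fits at both stages, and no B-tree fallback, are simplifying assumptions for the sketch:</p>

```python
import bisect

def fit_linear(ks, positions):
    """Crude linear fit through the first and last (key, position) points."""
    if ks[-1] == ks[0]:
        return 0.0, float(positions[0])
    a = (positions[-1] - positions[0]) / (ks[-1] - ks[0])
    return a, positions[0] - a * ks[0]

def build_rmi(keys, n_leaves=4):
    """Two-stage recursive model index over sorted, distinct keys."""
    n = len(keys)
    root_a, root_b = fit_linear(keys, list(range(n)))
    # Stage 1: route a key to a leaf expert based on its predicted position.
    route = lambda k: min(n_leaves - 1,
                          max(0, int((root_a * k + root_b) * n_leaves / n)))
    buckets = [[] for _ in range(n_leaves)]
    for i, k in enumerate(keys):
        buckets[route(k)].append(i)
    # Stage 2: per-leaf linear model with exact error bounds over its keys.
    leaves = []
    for idxs in buckets:
        if not idxs:
            leaves.append((0.0, 0.0, 0, 0))
            continue
        a, b = fit_linear([keys[i] for i in idxs], idxs)
        errs = [int(a * keys[i] + b) - i for i in idxs]
        leaves.append((a, b, min(errs), max(errs)))

    def lookup(key):
        a, b, emin, emax = leaves[route(key)]
        pred = int(a * key + b)
        lo, hi = max(0, pred - emax), min(n, pred - emin + 1)
        i = bisect.bisect_left(keys, key, lo, hi)
        return i if i < n and keys[i] == key else None

    return lookup
```

<p>In the real system, a leaf whose error bounds come out too wide could be replaced by a B-tree over its slice, which is the worst-case fallback the talk mentions.</p>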
<p>The quick results version: we have four different data sets. Most are integer data sets; the last one is a string data set. We're trying to save memory and speed. We save memory hugely, because these are really simple models, linear with a simple layer and possibly two stages, and we get a significant speedup in these cases. The server logs one is interesting: at a high level it looks very linear, but there are actually daily patterns in how the data is accessed. Maps is more linear; it's longitudes of places. We created synthetic data that's log-normal, and we can model it effectively. Strings are an interesting challenge going forward: the data is larger and more complicated, building models that are efficient over a really long string is different, and the overall patterns are harder to have intuition about. One thing really worth noting: this is not using GPUs or TPUs; it's a purely CPU comparison, apples-to-apples.</p>
<p>This was mostly about the B-tree part: a regression model over the CDF of the data. We can use these exact same models for hash maps. With Bloom filters, you can use binary classifiers. I have a bunch of results on the poster in the back.</p>
<p>A few minutes on room for improvement. There are a bunch of directions we're excited to explore. The obvious one is GPUs/TPUs: we used CPUs because that's where B-trees are most effective, but scaling is everything in ML, so improving throughput and latency for models with GPUs is exciting going forward. Then the models themselves: there's no reason to believe a hierarchy of models is the right or best choice; it's interesting to build model structures that match your hardware, memory-efficient for the underlying architecture of GPUs, at the nanosecond scales a database needs. Multidimensional indexes: ML excels at high numbers of dimensions, and most workloads are not looking at a single integer feature; there's an interesting question about how you map to multidimensional indexes, which are difficult to scale. Also, if you have a CDF, you can approximately sort data right there. And inserts and updates: we assumed read-only databases, a large class of systems, but what happens as we get more data? How do we balance overfitting with accuracy; can we add some auxiliary data structures to balance this out?</p>
<p>Q: One thing is that when... this problem, we solved pretty well without ML. When we introduce ML, we should introduce new metrics; we shouldn't make our system more fragile because the distribution changes. What would be the worst case when the distribution changes?</p>
<p>A: As the data becomes updated... in the case of inserts and updates, there's a question about generalization. I think you could look at it from the ML point of view: statistically, test today's model on tomorrow's inserts. (It's a method. If I use this method, and then train it with data that I don't yet have... and do.) The typical extrapolation-to-the-future generalization question in ML. Guarantees are hard; there will be a worst case that is awful. But the flip side, the ML side, is generalization. There's also the point of view that we couple the model with classic data structures: local search, the Bloom filter case, so you don't actually have this problem. You catch the worst case.</p>
<p>Let me add to that. If you assume the inserts follow the same distribution as the trained model, then inserts become an O(1) operation; they're even better. Suppose they don't follow the same distribution? You can still do delta indexing, and most systems do delta indexing anyway. So inserts are not a big problem.</p>
<p>Q: (Robert) Most of the inputs were one or two real numbers, and the outputs are a single real number. How does it work if you use a low-degree polynomial, or a piecewise-linear classifier on the different digits?</p>
<p>A: In the case of strings, it's not a single input. (Treat it as an integer?) Well, it's possibly a thousand characters long; that's not the best representation, and different representations work really well. The last thing I want to say: piecewise linear could work, but when you run 10k or 100k submodels, it's slow; the hierarchy helps. Polynomials are interesting; it depends on the data source.</p>
<p>Q: Can you comment how bad your worst case is? Average numbers?</p>
<p>A: We specifically always have a spillover: the worst case is defaulting to a typical database. We haven't had a case where you do worse, because we default to the B-tree. (Deterministic execution?) Not at inference time.</p>
</div>
<img src="http://feeds.feedburner.com/~r/ezyang/~4/urDpEZ29r7g" alt="" height="1" width="1" />Fri, 08 Dec 2017 18:11:02 +0000Edward Z. Yang: Ray: A Distributed Execution Framework for Emerging AI Applications (Ion Stoica)http://blog.ezyang.com/?p=10017
http://feedproxy.google.com/~r/ezyang/~3/g_GnDdjcuF8/
<div class="document">
<p>The below is a transcript of a talk by <a href="https://people.eecs.berkeley.edu/~istoica/" class="reference external">Ion Stoica</a> on <a href="https://github.com/ray-project/ray" class="reference external">Ray</a>, at the <a href="https://nips.cc/Conferences/2017/Schedule?showEvent=8774" class="reference external">ML Systems Workshop</a> at NIPS'17.</p>
<hr class="docutils" />
<p>We've been working on Ray at Berkeley for more than one year. Over the past years, there's been tremendous progress in AI: ad targeting, image and speech recognition, and many more. Many applications are based on supervised learning with DNNs; supervised plus unsupervised learning are the two dominant approaches.</p>
<p>However, the next generation of AI applications will be very different. They're deployed in mission-critical scenarios and need to continually learn from a rapidly changing environment: robotics, self-driving cars, unmanned drones, dialogue systems. Implementing this new generation of AI applications requires a broader range of techniques: stochastic optimization, parallel simulations, and many more.</p>
<p>Ray provides a unified platform for implementing these approaches. To motivate Ray, I'll use reinforcement learning. RL learns by interacting with the environment: a policy mapping from state/observation to action that maximizes a certain reward. What are the requirements of RL? First, many applications exhibit nested parallelism: a search that uses data-parallel SGD, which then calls a component that does policy evaluation with a model to simulate, running in parallel on multiple CPUs. Second, these workloads can be highly heterogeneous in hardware and time: many of the computations require not only CPUs but GPUs, TPUs, and FPGAs, and the computations can take wildly different times. Simulate a chess game: 3 moves to lose, or 50 moves to win or draw. Third, in robotics, we need to process in real time, handling the data from sensors in parallel within tens of milliseconds.</p>
<p>Meeting these requirements is not easy. You need a system that is both flexible and performant. Flexible: it should create and schedule tasks dynamically, and support arbitrary dependencies. Performant: it should scale to hundreds of nodes, offer sub-millisecond latencies, handle millions of tasks, and efficiently share numeric data.</p>
<p>Next, I'm going to say how we meet these challenges. Flexibility: we provide a very flexible model, dynamic task graphs, and on top of it two abstractions: parallel tasks and actors.</p>
<p>To talk about parallel tasks, consider this Python code: one function reads an array from a file, and another adds two arrays. The code is simple: it creates two arrays a and b from file1 and file2 and sums them up. Parallelizing this program is quite easy. To parallelize a function, we add a ray.remote decorator to it; to invoke it, we call its remote method. remote doesn't return the object itself, just an object id. This is very similar to the futures abstraction. To get the actual object, you must invoke ray.get on the object id.</p>
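<p>Since remote calls behave like futures, the same dataflow can be sketched with Python's standard concurrent.futures (the file names and array contents below are stand-ins; note this is an analogy for the pattern, not Ray itself):</p>

```python
from concurrent.futures import ThreadPoolExecutor

def read_array(source):
    """Stand-in for reading an array from a file on some node."""
    return {"file1": [1, 2, 3], "file2": [10, 20, 30]}[source]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

with ThreadPoolExecutor() as pool:
    # Like f.remote(...): submit returns immediately with a future, not a value.
    id1 = pool.submit(read_array, "file1")
    id2 = pool.submit(read_array, "file2")
    # .result() plays the role of ray.get, blocking until the value is ready.
    id3 = pool.submit(add, id1.result(), id2.result())
    total = id3.result()
```

<p>One difference: in Ray you would pass the object ids themselves to add.remote(id1, id2) and Ray would resolve the dependencies for you; concurrent.futures has no such implicit resolution, hence the explicit .result() calls.</p>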
<p>To get a better idea of how Ray executes, let's walk through a simple program, assuming the files are stored on different nodes. When we call read_array on file1, Ray schedules read_array on the appropriate node. The remote call returns immediately, before the actual read finishes. This allows the driver to run the second task in parallel, on the node holding file2, and to launch the add remote function. At this point all functions have been scheduled remotely, but none have finished. To actually get the result, you call ray.get on the result id; this is a blocking call that waits for the entire computation graph to be executed.</p>
<p>Tasks are very general, but they are not enough. Consider running a simulator that is closed-source: you do not have access to its state. Your computation is structured as state, action, simulation, but you cannot reach in and set up the state in the simulator. Another use case is state that is too expensive to create, for example DNNs on GPUs: you want to initialize once, and reuse the state rather than reinitialize for each simulation.</p>
<p>To address these use cases, we add the actor abstraction. An actor is just a remote class. If you have a Counter, you mark it ray.remote, and when you create the class or invoke its methods, you use the remote keyword. Here is the computation graph for this very simple example. Notice the method invocations also return object ids; to get the results, you call ray.get on them. Ray also allows you to specify resource requirements for actors and tasks.</p>
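<p>The actor pattern itself, a stateful worker that processes method calls as messages and hands back futures, can be sketched locally with a thread and a mailbox queue. This is an illustrative analogy for the abstraction, not Ray's implementation (which runs actors on remote nodes):</p>

```python
import queue
import threading

class ActorHandle:
    """Minimal actor sketch: one thread owns the state; method calls are
    messages; each call returns a one-slot queue acting as a future."""

    def __init__(self, cls, *args):
        self._mailbox = queue.Queue()

        def loop():
            state = cls(*args)  # state is created once and owned by this thread
            while True:
                method, call_args, reply = self._mailbox.get()
                if method is None:
                    break
                reply.put(getattr(state, method)(*call_args))

        threading.Thread(target=loop, daemon=True).start()

    def call(self, method, *args):
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((method, args, reply))
        return reply  # like an object id: call .get() to block on the result

    def stop(self):
        self._mailbox.put((None, (), None))

class Counter:
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n
```

<p>Because a single thread owns the state and drains the mailbox in order, method calls are serialized, which is exactly what makes actors safe for expensive or non-shareable state.</p>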
<p>To put things together in a more realistic example: evolution strategies, a scalable form of RL, by Salimans et al. at OpenAI. In a nutshell, evolution strategies try lots of policies and see which runs best; this is highly parallel. Here is pseudocode for parallel evolution strategies: a worker does a simulation and returns the reward; we create twenty workers, do 200 simulations, and update the policy. Again, to parallelize this code, we add a bunch of remote annotations; on the right-hand side I'm also showing the computation graph. When you invoke Worker.remote, you create 20 remote workers to run in parallel, and you invoke their methods with the remote keyword. Again, notice that the results are not the rewards themselves but ids to the reward objects; to get the rewards and update the policy, you call ray.get.</p>
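<p>The loop the pseudocode parallelizes can be sketched serially: sample perturbed policies, score each in simulation, and move the policy toward high-reward perturbations. The one-dimensional quadratic "simulator" and all hyperparameters below are toy assumptions; in Ray, the evaluate calls are what the remote workers run in parallel:</p>

```python
import random

def evolution_strategy(evaluate, dim, iters=200, pop=20, sigma=0.1, lr=0.05):
    """Evolution strategies (Salimans et al. style): sample Gaussian
    perturbations of the policy, score each by simulation, and take a
    gradient step weighted by (reward - baseline)."""
    theta = [0.0] * dim
    for _ in range(iters):
        noises, rewards = [], []
        for _ in range(pop):
            eps = [random.gauss(0, 1) for _ in range(dim)]
            noises.append(eps)
            rewards.append(evaluate([t + sigma * e for t, e in zip(theta, eps)]))
        mean_r = sum(rewards) / pop  # baseline reduces variance
        for j in range(dim):
            grad = sum((r - mean_r) * eps[j]
                       for r, eps in zip(rewards, noises)) / (pop * sigma)
            theta[j] += lr * grad
    return theta

# Toy "simulator": reward is highest when the policy parameter equals 3.
random.seed(0)
best = evolution_strategy(lambda th: -(th[0] - 3.0) ** 2, dim=1)
```

<p>Each inner evaluation is independent, which is why the algorithm maps so naturally onto Ray's workers: the 20 evaluations per iteration become 20 remote calls whose reward ids are gathered with ray.get.</p>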
<p>This hopefully gives you a flavor of how to program in Ray. Next, I'll switch gears and present the system design of Ray: how Ray achieves high performance and scalability.</p>
<p>Like many classic computing frameworks, Ray has a driver and a bunch of workers: the driver runs a program, and workers run tasks remotely. You can also create a number of actors. Drivers, workers, and actors on the same node share data through shared memory; across nodes, they share through a distributed object store we built. Each node has a local scheduler, so when a driver wants to run a task, the local scheduler tries to schedule it locally; if it cannot, it invokes the global scheduler, which schedules it on another node that has the resources. Finally, one essential part of the design is the Global Control State (GCS): it takes all of the state of the system and centralizes it, including the metadata for objects (the object table) and functions. This allows the other components to be stateless: they can fail, you bring them back up, and they fetch the most recent data from the GCS. It also allows us to parallelize the global scheduler, because the replicas share the same state in the GCS.</p>
<p>Another nice effect of having a GCS is that it makes it easy to build a bunch of profiling and debugging tools.</p>
<p>This design is highly scalable. Let me try to convince you why. To make the GCS scalable, we just shard it; all the keys are pseudorandom, so it's easy to shard and load-balance. The scheduler, as you've seen, is distributed: each node has a local scheduler, and Ray tries to schedule tasks spawned by a worker or driver locally. If the global scheduler becomes a bottleneck, we can replicate it. Finally, in some systems, even if the scheduler is super scalable, there's another bottleneck: in Spark, only the driver can launch new tasks. To get around that, Ray allows workers and actors to launch tasks. Really, there is no single bottleneck point.</p>
<p>A few words about implementation. The GCS is implemented with Redis. For the object store, we leverage Apache Arrow. For fault tolerance, we use lineage-based fault tolerance, like Spark. Actors are part of the task graph; their methods are treated as tasks, so we have a uniform model for providing fault tolerance.</p>
<p>Now some evaluation results. This plot shows the number of tasks per second against the number of nodes; it scales linearly, and you can schedule over 1.8 million tasks per second. The latency of local task execution is 300µs; the latency of a remote task is 1ms. The next plot illustrates fault tolerance. You may ask why you should care about fault tolerance: the problem is that otherwise your program has to account for the possibility that a simulation may not finish, which makes the program far more complicated, even if you're willing to ignore some results. On the x axis you have time in seconds, and there are two y axes: the number of nodes in the system and the throughput. As you can see, the number of nodes starts at 50, drops to 25, then to 10, and goes back up to 50. The red curve shows the number of tasks per second; as you would expect, it follows the number of nodes in the system. If you look closely, there are some drops: every time nodes go away, there is a drop in the number of tasks. It turns out this is because of object reconstruction: when some nodes go away, you lose the objects on those nodes, so you have to reconstruct them. Ray, like Spark, reconstructs them transparently. In blue, you can see the re-executed tasks; if you add them in, the drops are filled in nicely.</p>
<p>Finally, for evolution strategies, we compared with the reference ES implementation from OpenAI, following their setup. On the x axis you have the number of CPUs, and on the y axis the mean time to solve the particular problem (a "learning to run" simulator). There are three points to notice. One: as expected, as you add more CPUs, the time to solve goes down. Two: Ray is actually better than the reference ES, with better results, even though the reference ES is special-purpose. Three: for a very large number of CPUs, the reference implementation couldn't solve the problem at all, but Ray kept doing better and better. I should add that the Ray version takes half the amount of code, and was implemented in a couple of hours.</p>
<p>Related work: in this area there are a huge number of systems; that's why you are all here. Ray is complementary to TensorFlow, MXNet, PyTorch, etc.; we use those systems to implement DNNs, and we integrate with TensorFlow and PyTorch. There are also more general systems, like MPI and Spark; these have limited support for nested parallelism, a more rigid computation model, and much coarser-grained tasks.</p>
<p>To conclude, Ray is a system built for high performance, flexibility, and scalability. We have two libraries on top of Ray: RLlib and Ray Tune. It's open source; please try it, we'd love your feedback. Thanks to Robert, Philip, Alex, Stephanie, Richard, Eric, Heng, William, and many thanks to my colleague Michael Jordan.</p>
<p>Q: In your system, you also use actors, and actors are built on shared memory. Do you have a separate mailbox for actors? How do you do that?</p>
<p>A: No, the actors communicate by passing arguments through the shared object store.</p>
<p>Q: What is the granularity of parallelism? Is a task atomic, or do you split tasks?</p>
<p>A: The task granularity is determined by the overhead of launching and scheduling a task; we are targeting tasks on the order of a few milliseconds or less. A task does not implement something as small as an activation function; we leave that job to much better frameworks. A task executes atomically, and methods in an actor are serialized.</p>
<p>Q: A question about fault tolerance: in Spark, when you don't get a response for some time, the system decides the node has died. Here, a task can take much longer, because it may be a neural network computation or something like that, so you can't use the same timeout.</p>
<p>A: We do not do speculation; there is no implicit speculation in Ray, for the reason you mentioned.</p>
<p>Q: Can you give me more details on why the reference implementation doesn't scale?</p>
<p>A: The reference implementation is the OpenAI implementation; Robert here can give you a much more detailed answer to that question.</p>
</div>
Fri, 08 Dec 2017 18:07:16 +0000Mark Jason Dominus: The Aeropresstag:,2017:/tech/aeropress-review
https://blog.plover.com/tech/aeropress-review.html
<p>I drink a lot of coffee at work. Folks there often make a pot of
coffee and leave it on the counter to share, but they never make decaf
and I drink a lot of decaf, so I make a lot of single cups of decaf,
which is time-consuming. More and more people swear by the
<a href="http://www.aeropress.com/">AeroPress</a>, which they say makes single
cups of excellent coffee very quickly. It costs about $30. I got one
and tried it out.</p>
<p align="center"><a href="https://pic.blog.plover.com/tech/aeropress-review/aeropress.jpeg"><img src="https://pic.blog.plover.com/tech/aeropress-review/aeropress-th.jpeg" border="0" /></a></p>
<p>The AeroPress works like this: There is a cylinder, open at the top,
closed but perforated at the bottom. You put a precut circle of
filter paper into the bottom and add ground coffee on top of it. You
put the cylinder onto your cup, then pour hot water into the cylinder.</p>
<p>So far this is just a regular single-cup drip process. But after a
minute, you insert a plunger into the cylinder and push it down gently
but firmly. The water is forced through the grounds and the filter
into the cup.</p>
<p>In theory the press process makes better coffee than drip, because
there is less opportunity to over-extract. The AeroPress coffee is
good, but I did not think it tasted better than drip. Maybe someone
else, fussier about coffee than I am, would be more impressed.</p>
<p>Another of the selling points is that the process fully extracts the
grounds, but much more quickly than a regular pourover cone, because
you don't have to wait for all the dripping. One web site boasts:</p>
<blockquote>
<p>Aeropress method shortens brew time to 20 seconds or less.</p>
</blockquote>
<p>It does shorten the brew time. But you lose all the time again
washing out the equipment. The pourover cone is easier to clean and
dry. I would rather stand around watching the coffee drip through the
cone than spend the same amount of time washing the coffee press.</p>
<p>The same web site says:</p>
<blockquote>
<p>Lightweight, compact design saves on storage space.</p>
</blockquote>
<p>This didn't work for me. I can't put it in my desk because it is
still wet and it is difficult to dry. So it sits on a paper towel on
top of my desk, taking up space and getting in the way. The cone
dries faster.</p>
<p>The picture above makes it look very complicated, but the only
interesting part is the press itself, shown at upper left. All
the other stuff is unimportant. The intriguing hexagon thing is a
funnel you can stick in the top of the cylinder if you're not sure you
can aim the water properly. The scoop is a scoop. The flat thing is
for stirring the coffee in the cylinder, in case you don't know how to
use a spoon. I threw mine away. The thing on the right is a holder
for the unused paper filters. I suspect they were afraid people
wouldn't want to pay $30 for just the press, so they bundled in all
this extra stuff to make it look like you are getting more than you
actually are. In the computer biz we call this “shovelware”.</p>
<p>My review: The AeroPress gets a solid “meh”. You can get a drip cone
for five bucks. The advantages of the $30 AeroPress did not
materialize for me, and are certainly not worth paying six times as
much.</p>Fri, 08 Dec 2017 14:13:00 +0000mjd@plover.com (Mark Dominus)FP Complete: Announcing Stack 1.6.1 releasehttps://www.fpcomplete.com/blog/2017/12/announcing-stack-1.6.1-release
https://www.fpcomplete.com/blog/2017/12/announcing-stack-1.6.1-release
<div class="hs-featured-image-wrapper">
<a href="https://www.fpcomplete.com/blog/2017/12/announcing-stack-1.6.1-release" class="hs-featured-image-link" title=""> <img src="https://www.fpcomplete.com/hubfs/haskell_logo.svg?t=1513366076380" alt="haskell_logo.svg" style="width: auto !important; float: left; margin: 0 15px 15px 0;" class="hs-featured-image" /> </a>
</div>
<p>See <a href="https://haskellstack.org">https://haskellstack.org</a> for installation and upgrade instructions.</p>
Thu, 07 Dec 2017 17:30:00 +0000manny@fpcomplete.com (Emanuel Borsboom)Michael Snoyman: Stack and Nightly breakagehttps://www.snoyman.com/blog/2017/12/stack-and-nightly-breakage
https://www.snoyman.com/blog/2017/12/stack-and-nightly-breakage
<p>I'm sure a number of readers have already seen something about the
situation around Stack and Stackage Nightly/GHC 8.2. I tried to
clarify how this happened on
<a href="https://github.com/commercialhaskell/stack/issues/3624">the relevant Github issue</a>,
plus the
<a href="https://ghc.haskell.org/trac/ghc/ticket/14558">GHC trac ticket</a>, but
thought I'd reshare as a blog post for others who are interested.</p><p><b>EDIT</b> Right after publishing, I saw that Stack 1.6.1 was released, so you
should probably just run <code>stack upgrade</code>. Keep reading if you're curious on the
bug.</p><h2 id="the-problem">The problem</h2><p>When the first releases of Stackage Nightly for GHC 8.2.1 started
coming out some months back, they did not work with Stack 1.5.0, due
to an issue with the <code>ghc.cabal</code> file on Hackage. The reason for this
is explained below. We made a point release (Stack 1.5.1) which worked
around the issue temporarily, until Stack 1.6 was released with the
complete fix.</p><p>In the interim, GHC 8.2.2 was released, and Stackage Nightly switched
over to it. Contrary to my initial claims: this was a <i>red herring</i>
and unrelated to anything.</p><p>On December 4, integer-gmp-1.0.1.0 was uploaded to Hackage, which
reintroduced all of the breakage we had with Stack 1.5.0. Since our
point release had a very targeted workaround (specifically for
<code>ghc.cabal</code>), it did not work around the same bug occurring for
<code>integer-gmp.cabal</code>. Therefore, all versions of Stack before 1.6 will
fail to build a Stackage release with GHC 8.2.</p><h2 id="the-workaround">The workaround</h2><p>The best "workaround" is just a new release: Stack 1.6 was fortunately
already in release candidate mode, and as I type this up it's going
through the standard release process. By the time I hit publish, the
workaround may be to run <code>stack upgrade</code>.</p><p>If that's not the case, you can upgrade to the release candidate by
running:</p><pre><code>stack upgrade --binary-version 1.6.0.20171202</code></pre><h2 id="cabal-background">Cabal background</h2><p>In order to understand the explanation, you should be aware of a few
different things that are all called Cabal:</p><ul><li>cabal-install, the build tool. This is not relevant to the
explanation below</li><li>Cabal the library. This is a normal Haskell library which Stack
depends on, and is used for (among other things) parsing cabal
files.</li><li>Cabal the file format. If you open up virtually any cabal file
you'll see a <code>cabal-version: >= 1.10</code> looking field. This is stating
which version of the Cabal file format is being used. New versions
of Cabal-the-library may add new features to the Cabal file
format. The version of the format tracks the library version it was
released with, so that a cabal file stating <code>cabal-version: >= 1.24</code>
can only be parsed by Cabal-the-library 1.24 or later.</li></ul><p>There was an addition made to Cabal-the-file-format 2.0: a <code>^>=</code>
operator. This operator is not parseable by older versions of Cabal
the library (meaning: Cabal 1.24 or earlier). Stack 1.5 was built
against Cabal-the-library 1.24, and therefore cannot parse any Cabal
files using this new operator.</p><p>The Stackage build process prevents any such Cabal files from being
used yet to give tooling (like Stack) a chance to upgrade, something
I've requested of Hackage as well. However, there are some packages
which ship with GHC itself, and which Stackage has no control over in
the creation of a snapshot. This includes packages like <code>base</code>, <code>ghc</code>,
and <code>integer-gmp</code>.</p><h2 id="original-breakage">Original breakage</h2><p>There's a short explanation (and some code to demonstrate it!) for the
original breakage with GHC 8.2.1 in the pull request:</p><p><a href="https://github.com/commercialhaskell/stack/pull/3304/files">https://github.com/commercialhaskell/stack/pull/3304/files</a></p><p>Prior to Stack 1.6, there was a bug where Stack would try to get some
metadata about libraries that shipped with GHC from their cabal files
instead of directly from the package database. Historically, this has
never been a problem, which is why it's survived in Stack for so
long. The reason is that, historically, GHC-shipped packages did not
use bleeding-edge features in their cabal files.</p><p>When GHC 8.2.1 was released, the <code>ghc.cabal</code> file uploaded to Hackage
did something new: it used a feature of the newly released Cabal 2.0
file format (the <code>^>=</code> operator) and required the new Cabal 2.0
library to parse it. This occurred before Stack had a chance to
upgrade to Cabal-the-library 2.0, and for that matter before
cabal-install 2.0 was released. In other words: at the time the file
was placed on Hackage, no officially released version of any common
tool supported it.</p><p>For unrelated reasons, I'd already fixed this bug on master as part of
a refactoring. Strangely enough, that refactoring had to do with
problems with revisions. Thanks to the revision system, it's not
possible to rely on cabal files on Hackage to tell you anything about
GHC-installed packages, since we can't know for certain which revision
was used to build the package. (We'll get to integer-gmp in a moment,
which is slightly worse in this regard.)</p><p>The behavior of Stack at this time with regard to GHC-shipped packages
was the following (and this is a bug):</p><ul><li>If the cabal file cannot be found: ignore the package entirely. This
is necessary for packages like <code>rts</code>.</li><li>If the cabal file is found: try to parse it, and fail if the parse
fails.</li></ul><p>It was this second bullet which caused a problem. When we discovered
this, we released an emergency patch release of Stack to work around
this situation and simply ignore parse failures from <code>ghc.cabal</code>. We
did not embark on a bigger fix because:</p><ol><li>A bigger fix would involve much more code change, introducing the
chance for regressions</li><li>We already had a fix on master, and knew that Stack 1.6 would be
released before GHC 8.4</li></ol><p>This went out the door, and all users who upgraded to Stack 1.5.1 were
able to use the new Stackage Nightly snapshots based on GHC 8.2.2.</p><h2 id="december-4-2017">December 4, 2017</h2><p>One of the packages that ships with GHC 8.2 is
<code>integer-gmp-1.0.1.0</code>. Until December 4, this package was not uploaded
to Hackage. As a result, Stack 1.5.1 simply ignored the package
entirely, which worked fine. However, something we didn't anticipate
happened:</p><ul><li>Months after the GHC 8.2.1 release, <code>integer-gmp-1.0.1.0</code> was
uploaded to Hackage</li><li>The cabal file that was uploaded was manually modified to use
Cabal-the-format 2.0 features (again, the <code>^>=</code> operator).</li></ul><p>You can compare the
<a href="http://hackage.haskell.org/package/integer-gmp-1.0.1.0/integer-gmp.cabal">file on Hackage</a>
with the
<a href="https://github.com/ghc/ghc/blob/ghc-8.2.2-release/libraries/integer-gmp/integer-gmp.cabal">file on Github</a>. It's
unclear what the motivation was behind this modification, but this
modification is what broke Stack 1.5.1 and GHC 8.2.</p><p>Before this upload, the missing <code>integer-gmp.cabal</code> file was simply
ignored by Stack. Once it was uploaded, Stack (again, as a bug) tries
to parse it, fails, and gives up.</p><h2 id="the-future">The future</h2><p>Obviously there was a bug in Stack that needed to be fixed, and has
been fixed. However, the irregularities around the <code>ghc.cabal</code> and
<code>integer-gmp.cabal</code> files are a little troubling, and make it
difficult to predict future behavior. Hopefully some new policies from
GHC HQ will address these concerns.</p><p>And while this case is a bug in Stack, I want to clarify a general
point. It is entirely expected that over time, older releases of Stack
will not be able to use newer Stackage snapshots. At some point in the
future, Stackage will allow Cabal 2.0-formatted cabal files into
snapshots, and then by design Stack 1.5 and earlier will be unable to
parse those files. That's unfortunate, but expected. What's unexpected
in this case was that</p><ol><li>These cabal files slipped into a snapshot through the back door
(GHC's package database) so quickly, before Stack 1.6 was out the
door</li><li>That actions taken post-GHC release (a new upload of
integer-gmp.cabal) could affect existing snapshots.</li></ol><p>Both points will hopefully be hit both by the fixes that landed on
Stack 1.6 ensuring less eager parsing of cabal files, and changes in
GHC HQ policy.</p><h2 id="summary">Summary</h2><ol><li>There's a bug in Stack, triggered by new behavior not seen before
by GHC</li><li>That bug affects reproducibility, because an upload to Hackage in
the future (or a revision for that matter) can break existing build
plans</li><li>This bug is fixed on master fully (AFAICT, we've added an
integration test to check for regressions)</li><li>Instead of putting out another emergency Stack 1.5 patch for
integer-gmp.cabal, we're going to get Stack 1.6 out the door ASAP</li></ol><p>I hope that clarifies. This is definitely an unfortunate situation,
and I know it's screwed up people's development, so my apologies on
that front. I hope for all our sakes (mine included!) that the
situation is more stable going forward.</p>Thu, 07 Dec 2017 04:00:00 +0000Mark Jason Dominus: Shitpost roundup, 2017-11tag:,2017:/meta/shitpost/roundup-2017-11
https://blog.plover.com/meta/shitpost/roundup-2017-11.html
<p><a href="https://blog.plover.com/meta/shitpost.html">As I mentioned before, I have started another
blog</a>, called <code>Content-type:
text/shitpost</code>. While I don't recommend that you read it regularly,
you might want to scan over this list of the articles from November
2017 to see if anything catches your eye.</p>
<ul>
<li><a href="https://shitpost.plover.com/f/FIRST-POST.html">FIRST POST!!1!</a></li>
<li><a href="https://shitpost.plover.com/v/vampire-frogs.html">The Vampire Flying Frog</a></li>
<li><a href="https://shitpost.plover.com/c/crows.html">“As the crow flies”</a></li>
<li><a href="https://shitpost.plover.com/e/ephod.html">The ephod</a></li>
<li><a href="https://shitpost.plover.com/c/consistency-proofs.html">The uselessness of consistency proofs</a></li>
<li><a href="https://shitpost.plover.com/r/russell-slack-channel.html">Bertrand Russell's Slack channel</a></li>
<li><a href="https://shitpost.plover.com/p/peaceful-aplomb.html">The Zimbabwean coup</a></li>
<li><a href="https://shitpost.plover.com/m/multiplication-stars.html">Non-star multiplication operators</a></li>
<li><a href="https://shitpost.plover.com/c/canaan-banana.html">Canaan Banana</a></li>
<li><a href="https://shitpost.plover.com/m/math-se-shitposting.html">Shitposting on Math StackExchange</a></li>
<li><a href="https://shitpost.plover.com/b/bean-colors.html">Colored beans</a></li>
<li><a href="https://shitpost.plover.com/r/restaurant-ratings.html">How I rate restaurants</a></li>
<li><a href="https://shitpost.plover.com/m/math.easy-problem-that-looks-hard.html">A problem that looks harder than it is</a></li>
<li><a href="https://shitpost.plover.com/d/dirty-jokes.html">Dirty jokes that are orientation and gender nonspecific</a></li>
<li><a href="https://shitpost.plover.com/s/supervillain.html">My secret identity</a></li>
<li><a href="https://shitpost.plover.com/w/wild-fantasies.html">My cute fantasy</a></li>
<li><a href="https://shitpost.plover.com/w/wise-men-of-princeton.html">The Wise Men of Princeton</a></li>
<li><a href="https://shitpost.plover.com/h/hot-potato.html">The Hot Potato</a></li>
<li><a href="https://shitpost.plover.com/f/fiber-guys.html">The fiber guys are here!</a></li>
<li><a href="https://shitpost.plover.com/h/hot-potato-2.html">The Hot Potato (addendum)</a></li>
<li><a href="https://shitpost.plover.com/s/stealing-club.html">Stealing Club</a></li>
<li><a href="https://shitpost.plover.com/k/kids-return.html">The kids disappear and then come back</a></li>
<li><a href="https://shitpost.plover.com/h/head-over-feet.html">Head Over Feet</a></li>
<li><a href="https://shitpost.plover.com/t/twitter-trending.html">Intriguing trending hashtags</a></li>
<li><a href="https://shitpost.plover.com/d/dad-jokes.html">Character-building exercises</a></li>
<li><a href="https://shitpost.plover.com/a/annoying-nun-questions.html">What’s the most annoying question to ask a nun in 1967?</a></li>
<li><a href="https://shitpost.plover.com/g/garden-court-eatery.html">The Garden Court Eatery</a></li>
<li><a href="https://shitpost.plover.com/l/left-and-right.html">Mixing up left and right</a></li>
<li><a href="https://shitpost.plover.com/p/potatoes.html">Mmmm fries</a></li>
<li><a href="https://shitpost.plover.com/o/over.html">A vector space over a field of scalars</a></li>
<li><a href="https://shitpost.plover.com/p/piano-teacher.html">People are more than one person</a></li>
<li><a href="https://shitpost.plover.com/a/annoying-nun-questions-2.html">The most annoying question to ask a nun, explained</a></li>
<li><a href="https://shitpost.plover.com/e/eid-al-adha.html">Coma collective</a></li>
<li><a href="https://shitpost.plover.com/s/software-sucks.html">Computers suck: episode 17771 of 31279</a></li>
<li><a href="https://shitpost.plover.com/m/multiplication-stars-2.html">Abutment for multiplication and other things</a></li>
<li><a href="https://shitpost.plover.com/s/spam-419.html">Today's 419 scam is…</a></li>
<li><a href="https://shitpost.plover.com/a/abutting-do-flotchy.html">Abutments</a></li>
<li><a href="https://shitpost.plover.com/m/multiplication-stars-3.html">More multiplication by abutment</a></li>
<li><a href="https://shitpost.plover.com/c/convicted-rapist.html">Who is a convicted rapist?</a></li>
<li><a href="https://shitpost.plover.com/c/code-reviews.html">Code reviews</a></li>
<li><a href="https://shitpost.plover.com/m/multiplication-stars-4.html">Bjarne Stroustrup's many crimes against programming</a></li>
<li><a href="https://shitpost.plover.com/m/middle-school.html">Colorado Appeals Court goes to Middle School</a></li>
<li><a href="https://shitpost.plover.com/d/do-not-resuscitate.html">Do NOT Resuscitate</a></li>
</ul>
<p>I plan to continue to post monthly summaries here.</p>Thu, 07 Dec 2017 02:03:00 +0000mjd@plover.com (Mark Dominus)Jasper Van der Jeugt: Video: Getting things done in Haskellhttp://jaspervdj.be/posts/2017-12-07-getting-things-done-in-haskell.html
http://jaspervdj.be/posts/2017-12-07-getting-things-done-in-haskell.html
<p>Someone alerted me that the video of my talk at the Skills Matter <a href="https://skillsmatter.com/conferences/8522-haskell-exchange-2017">Haskell eXchange 2017</a> is now available. You can watch it <a href="https://skillsmatter.com/skillscasts/10832-how-to-architect-medium-to-large-scale-haskell-applications">on their website</a>.</p>
<p>The slides can be found <a href="https://github.com/jaspervdj/talks/blob/master/2017-haskell-exchange-getting-things-done/slides.md">here</a>.</p>
<p>It’s a talk aimed towards beginners. If you are writing a medium-sized Haskell application for the very first time, you will typically end up with three modules: <code>Types.hs</code>, <code>Utils.hs</code> and <code>Main.hs</code>. While this is a very clear split, it typically doesn’t scale very well as applications become larger.</p>
<p>I try to answer some questions like:</p>
<ul>
<li>When is it a good idea to use something like Monad/Applicative (and when is it not)?</li>
<li>When is it a good idea to invent my own typeclass (and when is it not)?</li>
<li>How do I design interfaces and services like in OOP?</li>
</ul>
<p>Thanks again to Skills Matter for putting together this excellent conference.</p>Thu, 07 Dec 2017 00:00:00 +0000FP Complete: Techniques for Success with Offshore Software Developmenthttps://www.fpcomplete.com/blog/techniques-for-success-with-offshore-software-development
https://www.fpcomplete.com/blog/techniques-for-success-with-offshore-software-development
<div class="hs-featured-image-wrapper">
<a href="https://www.fpcomplete.com/blog/techniques-for-success-with-offshore-software-development" class="hs-featured-image-link" title=""> <img src="https://www.fpcomplete.com/hubfs/Blog/Outsourcing%20Software%20-%20Smaller.jpg?t=1513366076380" alt="Outsourcing Software - Smaller.jpg" style="width: auto !important; float: left; margin: 0 15px 15px 0;" class="hs-featured-image" /> </a>
</div>
<div>
<span style="">When I ran Microsoft’s engineering tools group in the late 1990’s, remote engineering was uncommon and challenging. We had spent millions to relocate engineers to a central headquarters, and when that wasn’t enough to meet all our needs, we had to invent a lot of our own tools to keep offshore projects on track. Since then, the industry has evolved better methods and tools and, more recently, cloud </span>
<a style="" href="http://feeds.feedburner.com/devops">DevOps</a>
<span style=""> systems. Reliable remote engineering is now available to everyone.</span>
</div>
Wed, 06 Dec 2017 22:07:00 +0000aaron@fpcomplete.com (Aaron Contorer)Jeremy Gibbons: Arithmetic Codinghttp://patternsinfp.wordpress.com/?p=330
https://patternsinfp.wordpress.com/2017/12/05/arithmetic-coding/
<p>
This post is about the data compression method called <em>arithmetic coding</em>, by which a text is encoded as a subinterval of the unit interval, which is then represented as a bit sequence. It can often encode more effectively than Huffman encoding, because it doesn’t have the restriction of Huffman that each symbol be encoded as a positive whole number of bits; moreover, it readily accommodates <em>adaptive</em> models of the text, which “learn” about the text being encoded while encoding it. It is based on <a href="http://www.cs.ox.ac.uk/publications/publication2333-abstract.html">lecture notes</a> that I wrote in 2002 with Richard Bird, although the presentation here is somewhat simplified; it is another application of <em>streaming</em>. There’s quite a lot to cover, so in this post I’ll just set up the problem by implementing a basic encoder and decoder. In the next post, I’ll show how they can both be streamed. (We won’t get into the intricacies of restricting to fixed-precision arithmetic—perhaps I can cover that in a later post.)</p>
<p>
The basic idea behind arithmetic coding is essentially to encode an input text as a subinterval of the unit interval, based on a <em>model</em> of the text symbols that assigns them to a partition of the unit interval into non-empty subintervals. For the purposes of this post, we will deal mostly with half-open intervals, so that the interval <img src="https://s0.wp.com/latex.php?latex=%7B%5Bl%2Cr%29%7D&bg=ffffff&fg=000000&s=0" alt="{[l,r)}" class="latex" title="{[l,r)}" /> contains values <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" /> such that <img src="https://s0.wp.com/latex.php?latex=%7Bl+%5Cle+x+%3C+r%7D&bg=ffffff&fg=000000&s=0" alt="{l \le x < r}" class="latex" title="{l \le x < r}" />, where <img src="https://s0.wp.com/latex.php?latex=%7Bl%2Cr%2Cx%7D&bg=ffffff&fg=000000&s=0" alt="{l,r,x}" class="latex" title="{l,r,x}" /> are rationals.</p>
<p>
For example, with just two symbols “a” and “b”, and a static model partitioning the unit interval into <img src="https://s0.wp.com/latex.php?latex=%7B%5B0%2C+%5Cfrac+1+3%29%7D&bg=ffffff&fg=000000&s=0" alt="{[0, \frac 1 3)}" class="latex" title="{[0, \frac 1 3)}" /> for “a” and <img src="https://s0.wp.com/latex.php?latex=%7B%5B%5Cfrac+1+3%2C+1%29%7D&bg=ffffff&fg=000000&s=0" alt="{[\frac 1 3, 1)}" class="latex" title="{[\frac 1 3, 1)}" /> for “b”, the symbols in the input text “aba” successively narrow the unit interval to <img src="https://s0.wp.com/latex.php?latex=%7B%5B0%2C%5Cfrac+1+3%29%2C+%5B%5Cfrac+1+9%2C+%5Cfrac+1+3%29%2C+%5B%5Cfrac+1+9%2C+%5Cfrac+5+%7B27%7D%29%7D&bg=ffffff&fg=000000&s=0" alt="{[0,\frac 1 3), [\frac 1 9, \frac 1 3), [\frac 1 9, \frac 5 {27})}" class="latex" title="{[0,\frac 1 3), [\frac 1 9, \frac 1 3), [\frac 1 9, \frac 5 {27})}" />, and the latter interval is the encoding of the whole input. And in fact, it suffices to pick any single value in this final interval, as long as there is some other way to determine the end of the encoded text (such as the length, or a special end-of-text symbol).</p>
<p></p><h2> Intervals </h2>
<p>
We introduce the following basic definitions for intervals: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathbf%7Btype%7D%5C%3B%5Cmathit%7BInterval%7D+%3D+%28%5Cmathit%7BRational%7D%2C+%5Cmathit%7BRational%7D%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bunit%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5C%5C+%5Cmathit%7Bunit%7D+%3D+%280%2C1%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bcontains%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5Cmathit%7BBool%7D+%5C%5C+%5Cmathit%7Bcontains%7D%5C%3B%28l%2Cr%29%5C%3Bx+%3D+l+%5Cle+x+%5Cland+x+%3C+r+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bincludes%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BBool%7D+%5C%5C+%5Cmathit%7Bincludes%7D%5C%3B%28l%2Cr%29%5C%3B%28p%2Cq%29+%3D+l+%5Cle+p+%5Cland+q+%5Cle+r+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathbf{type}\;\mathit{Interval} = (\mathit{Rational}, \mathit{Rational}) \vrule width0pt depth2ex \\ \mathit{unit} :: \mathit{Interval} \\ \mathit{unit} = (0,1) \vrule width0pt depth2ex \\ \mathit{contains} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Bool} \\ \mathit{contains}\;(l,r)\;x = l \le x \land x < r \vrule width0pt depth2ex \\ \mathit{includes} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Bool} \\ \mathit{includes}\;(l,r)\;(p,q) = l \le p \land q \le r \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathbf{type}\;\mathit{Interval} = (\mathit{Rational}, \mathit{Rational}) \vrule width0pt depth2ex \\ \mathit{unit} :: \mathit{Interval} \\ \mathit{unit} = (0,1) \vrule width0pt depth2ex \\ \mathit{contains} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Bool} \\ \mathit{contains}\;(l,r)\;x = l \le x \land x < r \vrule width0pt depth2ex \\ \mathit{includes} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Bool} \\ 
\mathit{includes}\;(l,r)\;(p,q) = l \le p \land q \le r \end{array} " />
</p></blockquote>
<p> We’ll write “<img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Cni+x%7D&bg=ffffff&fg=000000&s=0" alt="{i \ni x}" class="latex" title="{i \ni x}" />” for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bcontains%7D%5C%3Bi%5C%3Bx%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{contains}\;i\;x}" class="latex" title="{\mathit{contains}\;i\;x}" />, and “<img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Csupseteq+j%7D&bg=ffffff&fg=000000&s=0" alt="{i \supseteq j}" class="latex" title="{i \supseteq j}" />” for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bincludes%7D%5C%3Bi%5C%3Bj%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{includes}\;i\;j}" class="latex" title="{\mathit{includes}\;i\;j}" />.</p>
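<p>As a quick sanity check, the interval definitions above, together with the narrowing operation defined next, transcribe directly into runnable Haskell. This is only a sketch (the helper <code>encodeSym</code>, encoding the two-symbol static model, is my own name, not from the notes), but it reproduces the “aba” example from the introduction:</p>

```haskell
import Data.Ratio ((%))

type Interval = (Rational, Rational)

-- the unit interval [0,1)
unit :: Interval
unit = (0, 1)

contains :: Interval -> Rational -> Bool
contains (l, r) x = l <= x && x < r

includes :: Interval -> Interval -> Bool
includes (l, r) (p, q) = l <= p && q <= r

-- narrow i j is to i as j is to the unit interval
narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

-- static model: "a" gets [0,1/3), "b" gets [1/3,1)
encodeSym :: Char -> Interval
encodeSym 'a' = (0, 1 % 3)
encodeSym 'b' = (1 % 3, 1)
encodeSym _   = error "unknown symbol"

main :: IO ()
main = print (foldl narrow unit (map encodeSym "aba"))
-- yields (1 % 9, 5 % 27): the interval [1/9, 5/27) from the worked example
```

<p>Any single value in that final interval (plus a way to detect the end of the text) suffices as the encoding, exactly as described above.</p>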
<p>
A crucial operation on intervals is <em>narrowing</em> of one interval by another, where <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnarrow%7D%5C%3Bi%5C%3Bj%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{narrow}\;i\;j}" class="latex" title="{\mathit{narrow}\;i\;j}" /> is to <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> as <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> is to the unit interval: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bnarrow%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5C%5C+%5Cmathit%7Bnarrow%7D%5C%3Bi%5C%3B%28p%2Cq%29+%3D+%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bp%2C+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bq%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bweight%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5C%5C+%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bx+%3D+l+%2B+%28r-l%29+%5Ctimes+x+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{narrow} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Interval} \\ \mathit{narrow}\;i\;(p,q) = (\mathit{weight}\;i\;p, \mathit{weight}\;i\;q) \vrule width0pt depth2ex \\ \mathit{weight} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{weight}\;(l,r)\;x = l + (r-l) \times x \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{narrow} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Interval} \\ \mathit{narrow}\;i\;(p,q) = (\mathit{weight}\;i\;p, \mathit{weight}\;i\;q) \vrule width0pt depth2ex \\ \mathit{weight} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{weight}\;(l,r)\;x = l + (r-l) \times x \end{array} " />
</p></blockquote>
<p> We’ll write “<img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Cmathbin%7B%5Ctriangleright%7D+j%7D&bg=ffffff&fg=000000&s=0" alt="{i \mathbin{\triangleright} j}" class="latex" title="{i \mathbin{\triangleright} j}" />” for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnarrow%7D%5C%3Bi%5C%3Bj%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{narrow}\;i\;j}" class="latex" title="{\mathit{narrow}\;i\;j}" />. Thus, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3B%28l%2Cr%29%5C%3Bx%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;(l,r)\;x}" class="latex" title="{\mathit{weight}\;(l,r)\;x}" /> is “proportionately <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" /> of the way between <img src="https://s0.wp.com/latex.php?latex=%7Bl%7D&bg=ffffff&fg=000000&s=0" alt="{l}" class="latex" title="{l}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Br%7D&bg=ffffff&fg=000000&s=0" alt="{r}" class="latex" title="{r}" />”, and we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+i+%5Cni+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bx+%26+%5CLeftarrow%26+%5Cmathit%7Bunit%7D+%5Cni+x+%5C%5C+i+%5Csupseteq+i+%5Cmathbin%7B%5Ctriangleright%7D+j+%26%5CLeftarrow%26+%5Cmathit%7Bunit%7D+%5Csupseteq+j+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} i \ni \mathit{weight}\;i\;x & \Leftarrow& \mathit{unit} \ni x \\ i \supseteq i \mathbin{\triangleright} j &\Leftarrow& \mathit{unit} \supseteq j \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} i \ni \mathit{weight}\;i\;x & \Leftarrow& \mathit{unit} \ni x \\ i \supseteq i \mathbin{\triangleright} j &\Leftarrow& \mathit{unit} \supseteq j \end{array} " />
</p></blockquote>
<p> Conversely, we can <em>widen</em> one interval by another: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bwiden%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+%5C%5C+%5Cmathit%7Bwiden%7D%5C%3Bi%5C%3B%28p%2Cq%29+%3D+%28%5Cmathit%7Bscale%7D%5C%3Bi%5C%3Bp%2C+%5Cmathit%7Bscale%7D%5C%3Bi%5C%3Bq%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bscale%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5C%5C+%5Cmathit%7Bscale%7D%5C%3B%28l%2Cr%29%5C%3Bx+%3D+%28x-l%29%2F%28r-l%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{widen} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Interval} \\ \mathit{widen}\;i\;(p,q) = (\mathit{scale}\;i\;p, \mathit{scale}\;i\;q) \vrule width0pt depth2ex \\ \mathit{scale} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{scale}\;(l,r)\;x = (x-l)/(r-l) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{widen} :: \mathit{Interval} \rightarrow \mathit{Interval} \rightarrow \mathit{Interval} \\ \mathit{widen}\;i\;(p,q) = (\mathit{scale}\;i\;p, \mathit{scale}\;i\;q) \vrule width0pt depth2ex \\ \mathit{scale} :: \mathit{Interval} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{scale}\;(l,r)\;x = (x-l)/(r-l) \end{array} " />
</p></blockquote>
<p> We’ll write “<img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Cmathbin%7B%5Ctriangleleft%7D+j%7D&bg=ffffff&fg=000000&s=0" alt="{i \mathbin{\triangleleft} j}" class="latex" title="{i \mathbin{\triangleleft} j}" />” for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bwiden%7D%5C%3Bi%5C%3Bj%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{widen}\;i\;j}" class="latex" title="{\mathit{widen}\;i\;j}" />. Note that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bscale%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{scale}}" class="latex" title="{\mathit{scale}}" /> is inverse to <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}}" class="latex" title="{\mathit{weight}}" />, in the sense </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++y+%3D+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bx+%5CLeftrightarrow+%5Cmathit%7Bscale%7D%5C%3Bi%5C%3By+%3D+x+&bg=ffffff&fg=000000&s=0" alt="\displaystyle y = \mathit{weight}\;i\;x \Leftrightarrow \mathit{scale}\;i\;y = x " class="latex" title="\displaystyle y = \mathit{weight}\;i\;x \Leftrightarrow \mathit{scale}\;i\;y = x " />
</p></blockquote>
<p> and consequently widening is inverse to narrowing: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++i+%5Cmathbin%7B%5Ctriangleleft%7D+%28i+%5Cmathbin%7B%5Ctriangleright%7D+j%29+%3D+j+&bg=ffffff&fg=000000&s=0" alt="\displaystyle i \mathbin{\triangleleft} (i \mathbin{\triangleright} j) = j " class="latex" title="\displaystyle i \mathbin{\triangleleft} (i \mathbin{\triangleright} j) = j " />
</p></blockquote>
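<p> These interval operations transcribe directly into executable Haskell. The following is a sketch using exact <code>Rational</code> arithmetic, with names matching the definitions above: </p>

```haskell
import Data.Ratio ((%))

type Interval = (Rational, Rational)

unit :: Interval
unit = (0, 1)

contains :: Interval -> Rational -> Bool
contains (l, r) x = l <= x && x < r

includes :: Interval -> Interval -> Bool
includes (l, r) (p, q) = l <= p && q <= r

-- weight i x is proportionately x of the way between the endpoints of i
weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

-- narrow i j is to i as j is to the unit interval
narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

-- scale is inverse to weight, and hence widen is inverse to narrow
scale :: Interval -> Rational -> Rational
scale (l, r) x = (x - l) / (r - l)

widen :: Interval -> Interval -> Interval
widen i (p, q) = (scale i p, scale i q)
```

<p> For example, <code>narrow (0, 1%2) (1%2, 3%4)</code> yields <code>(1%4, 3%8)</code>, and widening that result by the same interval recovers <code>(1%2, 3%4)</code>. </p>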
<p></p><h2> Models </h2>
<p>
We work with inputs consisting of sequences of symbols, which might be characters or some higher-level tokens: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathbf%7Btype%7D%5C%3B%5Cmathit%7BSymbol%7D+%3D+%5Cmathit%7BChar%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathbf{type}\;\mathit{Symbol} = \mathit{Char} " class="latex" title="\displaystyle \mathbf{type}\;\mathit{Symbol} = \mathit{Char} " />
</p></blockquote>
<p> The type <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BModel%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{Model}}" class="latex" title="{\mathit{Model}}" /> then must provide the following operations: </p>
<ul>
<li> a way to look up a symbol, obtaining the corresponding interval:<br />
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BencodeSym%7D+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BSymbol%7D+%5Crightarrow+%5Cmathit%7BInterval%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{encodeSym} :: \mathit{Model} \rightarrow \mathit{Symbol} \rightarrow \mathit{Interval} " class="latex" title="\displaystyle \mathit{encodeSym} :: \mathit{Model} \rightarrow \mathit{Symbol} \rightarrow \mathit{Interval} " />
</p></blockquote>
</li><li> conversely, a way to decode a value, retrieving a symbol:<br />
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BdecodeSym%7D+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5Cmathit%7BSymbol%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{decodeSym} :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow \mathit{Symbol} " class="latex" title="\displaystyle \mathit{decodeSym} :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow \mathit{Symbol} " />
</p></blockquote>
</li><li> an initial model:<br />
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Binitial%7D+%3A%3A+%5Cmathit%7BModel%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{initial} :: \mathit{Model} " class="latex" title="\displaystyle \mathit{initial} :: \mathit{Model} " />
</p></blockquote>
</li><li> a means to <em>adapt</em> the model on seeing a new symbol:<br />
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BnewModel%7D+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BSymbol%7D+%5Crightarrow+%5Cmathit%7BModel%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{newModel} :: \mathit{Model} \rightarrow \mathit{Symbol} \rightarrow \mathit{Model} " class="latex" title="\displaystyle \mathit{newModel} :: \mathit{Model} \rightarrow \mathit{Symbol} \rightarrow \mathit{Model} " />
</p></blockquote>
</li></ul>
<p> The central property is that encoding and decoding are inverses, in the following sense: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3Bx+%3D+s+%5Cquad+%5CLeftrightarrow+%5Cquad+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Cni+x+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{decodeSym}\;m\;x = s \quad \Leftrightarrow \quad \mathit{encodeSym}\;m\;s \ni x " class="latex" title="\displaystyle \mathit{decodeSym}\;m\;x = s \quad \Leftrightarrow \quad \mathit{encodeSym}\;m\;s \ni x " />
</p></blockquote>
<p> There are no requirements on <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Binitial%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{initial}}" class="latex" title="{\mathit{initial}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BnewModel%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{newModel}}" class="latex" title="{\mathit{newModel}}" />, beyond the latter being a total function.</p>
<p>
For example, we might support adaptive coding via a model that counts the occurrences seen so far of each of the symbols, represented as a histogram: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathbf%7Btype%7D%5C%3B%5Cmathit%7BModel%7D+%3D+%5B%28%5Cmathit%7BSymbol%7D%2C%5Cmathit%7BInteger%7D%29%5D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathbf{type}\;\mathit{Model} = [(\mathit{Symbol},\mathit{Integer})] " class="latex" title="\displaystyle \mathbf{type}\;\mathit{Model} = [(\mathit{Symbol},\mathit{Integer})] " />
</p></blockquote>
<p> This naive implementation works well enough for small alphabets. One might maintain the histogram in decreasing order of counts, so that the most likely symbols are at the front and are therefore found most quickly. For larger alphabets, it is better to maintain the histogram as a binary search tree, ordered alphabetically by symbol, caching the total count of each subtree.</p>
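<p> As a concrete sketch of the histogram model (assuming, purely for illustration, a lower-case alphabet and an initial count of 1 for every symbol, so that no interval is empty): </p>

```haskell
import Data.Ratio ((%))

type Symbol   = Char
type Interval = (Rational, Rational)
type Model    = [(Symbol, Integer)]

-- every symbol starts with count 1, so each has a non-empty interval
initial :: Model
initial = [ (s, 1) | s <- ['a' .. 'z'] ]

-- pair each symbol with its subinterval of the unit interval,
-- widths proportional to the counts
intervals :: Model -> [(Symbol, Interval)]
intervals m = [ (s, (c % t, (c + n) % t)) | ((s, n), c) <- zip m cums ]
  where cums = scanl (+) 0 (map snd m)
        t    = sum (map snd m)

encodeSym :: Model -> Symbol -> Interval
encodeSym m s = head [ i | (s', i) <- intervals m, s' == s ]

decodeSym :: Model -> Rational -> Symbol
decodeSym m x = head [ s | (s, (l, r)) <- intervals m, l <= x && x < r ]

-- adapt by bumping the count of the symbol just seen
newModel :: Model -> Symbol -> Model
newModel m s = [ (s', if s' == s then n + 1 else n) | (s', n) <- m ]
```

<p> The required inverse property holds by construction here: <code>decodeSym m x</code> returns exactly the symbol whose interval contains <code>x</code>. </p>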
<p></p><h2> Encoding </h2>
<p>
Now encoding is straightforward to define. The function <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BencodeSyms%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encodeSyms}}" class="latex" title="{\mathit{encodeSyms}}" /> takes an initial model and a list of symbols, and returns the list of intervals obtained by looking up each symbol in turn, adapting the model at each step: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7BencodeSyms%7D+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5Crightarrow+%5B%5Cmathit%7BInterval%7D%5D+%5C%5C+%5Cmathit%7BencodeSyms%7D%5C%3B+m+%3D+%5Cmathit%7Bmap%7D%5C%3B%5Cmathit%7Bsnd%7D+%5Ccdot+%5Cmathit%7Btail%7D+%5Ccdot+%5Cmathit%7Bscanl%7D%5C%3B%5Cmathit%7Bnext%7D%5C%3B%28m%2C%5Cmathit%7Bunit%7D%29+%5C%5C+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3B+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%5Cmathit%7Bnext%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C%5Cmathit%7BInterval%7D%29+%5Crightarrow+%5Cmathit%7BSymbol%7D+%5Crightarrow+%28%5Cmathit%7BModel%7D%2C%5Cmathit%7BInterval%7D%29+%5C%5C+%5Cmathit%7Bnext%7D%5C%3B%28m%2Ci%29%5C%3Bs+%3D+%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%2C+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs%29+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{encodeSyms} :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Interval}] \\ \mathit{encodeSyms}\; m = \mathit{map}\;\mathit{snd} \cdot \mathit{tail} \cdot \mathit{scanl}\;\mathit{next}\;(m,\mathit{unit}) \\ \quad \mathbf{where}\; \begin{array}[t]{@{}l} \mathit{next} :: (\mathit{Model},\mathit{Interval}) \rightarrow \mathit{Symbol} \rightarrow (\mathit{Model},\mathit{Interval}) \\ \mathit{next}\;(m,i)\;s = (\mathit{newModel}\;m\;s, \mathit{encodeSym}\;m\;s) \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{encodeSyms} :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow [\mathit{Interval}] \\ \mathit{encodeSyms}\; m = \mathit{map}\;\mathit{snd} \cdot \mathit{tail} \cdot \mathit{scanl}\;\mathit{next}\;(m,\mathit{unit}) \\ \quad \mathbf{where}\; \begin{array}[t]{@{}l} \mathit{next} :: (\mathit{Model},\mathit{Interval}) \rightarrow \mathit{Symbol} \rightarrow (\mathit{Model},\mathit{Interval}) \\ \mathit{next}\;(m,i)\;s = (\mathit{newModel}\;m\;s, 
\mathit{encodeSym}\;m\;s) \end{array} \end{array} " />
</p></blockquote>
<p> That is, </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%7D+%5Cmathit%7BencodeSyms%7D%5C%3Bm%5C%3B%5B%5C%2C%5D+%26%3D%26+%5B%5C%2C%5D+%5C%5C+%5Cmathit%7BencodeSyms%7D%5C%3Bm%5C%3B%28s%3Ass%29+%26%3D%26+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%3A+%5Cmathit%7BencodeSyms%7D%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl} \mathit{encodeSyms}\;m\;[\,] &=& [\,] \\ \mathit{encodeSyms}\;m\;(s:ss) &=& \mathit{encodeSym}\;m\;s : \mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl} \mathit{encodeSyms}\;m\;[\,] &=& [\,] \\ \mathit{encodeSyms}\;m\;(s:ss) &=& \mathit{encodeSym}\;m\;s : \mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss \end{array} " />
</p></blockquote>
<p> We then narrow the unit interval by each of these subintervals, and pick a single value from the resulting interval: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bencode%7D_0+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5Crightarrow+%5Cmathit%7BRational%7D+%5C%5C+%5Cmathit%7Bencode%7D_0%5C%3Bm+%3D+%5Cmathit%7Bpick%7D+%5Ccdot+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D+%5Ccdot+%5Cmathit%7BencodeSyms%7D%5C%3Bm+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{encode}_0 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_0\;m = \mathit{pick} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{encode}_0 :: \mathit{Model} \rightarrow [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_0\;m = \mathit{pick} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{encodeSyms}\;m \end{array} " />
</p></blockquote>
<p> All we require of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D+%3A%3A+%5Cmathit%7BInterval%7D+%5Crightarrow+%5Cmathit%7BRational%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick} :: \mathit{Interval} \rightarrow \mathit{Rational}}" class="latex" title="{\mathit{pick} :: \mathit{Interval} \rightarrow \mathit{Rational}}" /> is that <img src="https://s0.wp.com/latex.php?latex=%7Bi+%5Cni+%5Cmathit%7Bpick%7D%5C%3Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i \ni \mathit{pick}\;i}" class="latex" title="{i \ni \mathit{pick}\;i}" />; then <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bencode%7D_0%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encode}_0}" class="latex" title="{\mathit{encode}_0}" /> yields a fraction in the unit interval. For example, we might set <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D+%3D+%5Cmathit%7Bmidpoint%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick} = \mathit{midpoint}}" class="latex" title="{\mathit{pick} = \mathit{midpoint}}" />, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Ctextstyle+%5Cmathit%7Bmidpoint%7D%5C%3Bi+%3D+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cfrac+1+2%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \textstyle \mathit{midpoint}\;i = \mathit{weight}\;i\;(\frac 1 2) " class="latex" title="\displaystyle \textstyle \mathit{midpoint}\;i = \mathit{weight}\;i\;(\frac 1 2) " />
</p></blockquote>
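<p> Putting the pieces together, encoding can be sketched in runnable form. To keep the sketch self-contained it uses a hypothetical static two-symbol model ('a' mapped to [0,1/2) and 'b' to [1/2,1)) in place of an adaptive one; the shapes of <code>encodeSyms</code> and <code>encode0</code> match the definitions above: </p>

```haskell
import Data.Ratio ((%))

type Symbol   = Char
type Interval = (Rational, Rational)

-- hypothetical static model: 'a' gets [0,1/2), anything else gets [1/2,1)
type Model = ()

encodeSym :: Model -> Symbol -> Interval
encodeSym () 'a' = (0, 1 % 2)
encodeSym () _   = (1 % 2, 1)

newModel :: Model -> Symbol -> Model
newModel m _ = m          -- static, so adaptation does nothing

unit :: Interval
unit = (0, 1)

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

-- look up each symbol in turn, adapting the model at each step
encodeSyms :: Model -> [Symbol] -> [Interval]
encodeSyms m = map snd . tail . scanl next (m, unit)
  where next (m', _) s = (newModel m' s, encodeSym m' s)

midpoint :: Interval -> Rational
midpoint i = weight i (1 % 2)

-- narrow the unit interval by each subinterval, then pick the midpoint
encode0 :: Model -> [Symbol] -> Rational
encode0 m = midpoint . foldr narrow unit . encodeSyms m
```

<p> With this model, <code>encode0 () "ab"</code> narrows the unit interval to (1/4, 1/2) and yields 3/8. </p>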
<p></p><h2> Decoding </h2>
<p>
So much for encoding; how do we retrieve the input text? In fact, we can retrieve the first symbol simply by using <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BdecodeSym%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{decodeSym}}" class="latex" title="{\mathit{decodeSym}}" />. Expanding the encoding of a non-empty text, we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dcl%7D+%26+%5Cmathit%7Bencode%7D_0%5C%3Bm%5C%3B%28s%3Ass%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bencode%7D_0+%5Cmbox%7B+and+%7D+%5Cmathit%7BencodeSyms%7D+%5Cmbox%7B%2C+as+above%3B+let+%7D+i+%3D+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5C%7D+%5C%5C+%26+%5Cmathit%7Bpick%7D%5C%3B%28%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D%5C%3B%28i+%3A+%5Cmathit%7BencodeSyms%7D%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmbox%7Bfold%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bpick%7D%5C%3B%28i+%5Cmathbin%7B%5Ctriangleright%7D+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D%5C%3B%28%5Cmathit%7BencodeSyms%7D%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bpick%7D%5C%3B%28i+%5Cmathbin%7B%5Ctriangleright%7D+j%29+%3D+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bpick%7D%5C%3Bj%29+%5Cmbox%7B+%28see+below%29%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bpick%7D%5C%3B%28%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bnarrow%7D%5C%3B%5Cmathit%7Bunit%7D%5C%3B%28%5Cmathit%7BencodeSyms%7D%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bencode%7D_0+%5Cmbox%7B+and+%7D+%5Cmathit%7BencodeSyms%7D+%5Cmbox%7B+again%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}cl} & \mathit{encode}_0\;m\;(s:ss) \\ = & \qquad \{ \mathit{encode}_0 \mbox{ and } \mathit{encodeSyms} \mbox{, as above; let } i = \mathit{encodeSym}\;m\;s \} \\ & \mathit{pick}\;(\mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(i : \mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss)) \\ = & \qquad \{ \mbox{fold} \} \\ & \mathit{pick}\;(i 
\mathbin{\triangleright} \mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(\mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss)) \\ = & \qquad \{ \mathit{pick}\;(i \mathbin{\triangleright} j) = \mathit{weight}\;i\;(\mathit{pick}\;j) \mbox{ (see below)} \} \\ & \mathit{weight}\;i\;(\mathit{pick}\;(\mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(\mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss))) \\ = & \qquad \{ \mathit{encode}_0 \mbox{ and } \mathit{encodeSyms} \mbox{ again} \} \\ & \mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}cl} & \mathit{encode}_0\;m\;(s:ss) \\ = & \qquad \{ \mathit{encode}_0 \mbox{ and } \mathit{encodeSyms} \mbox{, as above; let } i = \mathit{encodeSym}\;m\;s \} \\ & \mathit{pick}\;(\mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(i : \mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss)) \\ = & \qquad \{ \mbox{fold} \} \\ & \mathit{pick}\;(i \mathbin{\triangleright} \mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(\mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss)) \\ = & \qquad \{ \mathit{pick}\;(i \mathbin{\triangleright} j) = \mathit{weight}\;i\;(\mathit{pick}\;j) \mbox{ (see below)} \} \\ & \mathit{weight}\;i\;(\mathit{pick}\;(\mathit{foldr}\;\mathit{narrow}\;\mathit{unit}\;(\mathit{encodeSyms}\;(\mathit{newModel}\;m\;s)\;ss))) \\ = & \qquad \{ \mathit{encode}_0 \mbox{ and } \mathit{encodeSyms} \mbox{ again} \} \\ & \mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss) \end{array} " />
</p></blockquote>
<p> The proof obligation, left as an exercise, is to show that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bpick%7D%5C%3B%28i+%5Cmathbin%7B%5Ctriangleright%7D+j%29+%3D+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bpick%7D%5C%3Bj%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{pick}\;(i \mathbin{\triangleright} j) = \mathit{weight}\;i\;(\mathit{pick}\;j) " class="latex" title="\displaystyle \mathit{pick}\;(i \mathbin{\triangleright} j) = \mathit{weight}\;i\;(\mathit{pick}\;j) " />
</p></blockquote>
<p> which holds when <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bpick%7D%5C%3Bi%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{pick}\;i}" class="latex" title="{\mathit{pick}\;i}" /> is of the form <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bx%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}\;i\;x}" class="latex" title="{\mathit{weight}\;i\;x}" /> for some <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" />.</p>
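<p> In particular, with <code>pick = midpoint</code> the obligation is discharged, since the midpoint is of the required form. A quick executable check of the law at sample intervals: </p>

```haskell
import Data.Ratio ((%))

type Interval = (Rational, Rational)

weight :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x

narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

midpoint :: Interval -> Rational
midpoint i = weight i (1 % 2)

-- the proof obligation, instantiated with pick = midpoint
holds :: Interval -> Interval -> Bool
holds i j = midpoint (narrow i j) == weight i (midpoint j)
```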
<p>
Now </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dcl%7D+%26+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3Bm%5C%3B%28s%3Ass%29%29+%3D+s+%5C%5C+%5CLeftrightarrow+%26+%5Cqquad+%5C%7B+%5Cmbox%7Bexpansion+of+%7D+%5Cmathit%7Bencode%7D_0+%5Cmbox%7B%2C+as+above%3B+let+%7D+i+%3D+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5C%7D+%5C%5C+%26+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29%29+%3D+s+%5C%5C+%5CLeftrightarrow+%26+%5Cqquad+%5C%7B+%5Cmbox%7Brequirement+on+models%7D+%5C%7D+%5C%5C+%26+i+%5Cni+%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss%29+%5C%5C+%5CLeftarrow+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bweight%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bunit%7D+%5Cni+%5Cmathit%7Bencode%7D_0%5C%3B%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%29%5C%3Bss+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}cl} & \mathit{decodeSym}\;m\;(\mathit{encode}_0\;m\;(s:ss)) = s \\ \Leftrightarrow & \qquad \{ \mbox{expansion of } \mathit{encode}_0 \mbox{, as above; let } i = \mathit{encodeSym}\;m\;s \} \\ & \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss)) = s \\ \Leftrightarrow & \qquad \{ \mbox{requirement on models} \} \\ & i \ni \mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss) \\ \Leftarrow & \qquad \{ \mathit{weight} \} \\ & \mathit{unit} \ni \mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss \end{array} " class="latex" title="\displaystyle \begin{array}{@{}cl} & \mathit{decodeSym}\;m\;(\mathit{encode}_0\;m\;(s:ss)) = s \\ \Leftrightarrow & \qquad \{ \mbox{expansion of } \mathit{encode}_0 \mbox{, as above; let } i = \mathit{encodeSym}\;m\;s \} \\ & \mathit{decodeSym}\;m\;(\mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss)) = s \\ \Leftrightarrow & 
\qquad \{ \mbox{requirement on models} \} \\ & i \ni \mathit{weight}\;i\;(\mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss) \\ \Leftarrow & \qquad \{ \mathit{weight} \} \\ & \mathit{unit} \ni \mathit{encode}_0\;(\mathit{newModel}\;m\;s)\;ss \end{array} " />
</p></blockquote>
<p> and indeed, encoding yields a fraction in the unit interval, so this recovers the first symbol correctly. This is the foothold that allows the decoding process to make progress; having obtained the first symbol using <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BdecodeSym%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{decodeSym}}" class="latex" title="{\mathit{decodeSym}}" />, it can adapt the model in precisely the same way that the encoding process does, then retrieve the second symbol using that adapted model, and so on. The only slightly tricky part is that when decoding an initial value <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" />, having obtained the first symbol <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" />, decoding should continue on some modified value <img src="https://s0.wp.com/latex.php?latex=%7Bx%27%7D&bg=ffffff&fg=000000&s=0" alt="{x'}" class="latex" title="{x'}" />; what should the modification be? It turns out that the right thing to do is to scale <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" /> by the interval associated in the model with symbol <img src="https://s0.wp.com/latex.php?latex=%7Bs%7D&bg=ffffff&fg=000000&s=0" alt="{s}" class="latex" title="{s}" />, since scaling is the inverse operation to the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bweight%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{weight}}" class="latex" title="{\mathit{weight}}" />s that take place during encoding. That is, we define: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dl%7D+%5Cmathit%7Bdecode%7D_0+%3A%3A+%5Cmathit%7BModel%7D+%5Crightarrow+%5Cmathit%7BRational%7D+%5Crightarrow+%5B%5Cmathit%7BSymbol%7D%5D+%5C%5C+%5Cmathit%7Bdecode%7D_0%5C%3Bm%5C%3Bx+%3D+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D%5C%3B%28m%2Cx%29+%5Cvrule+width0pt+depth2ex+%5C%5C+%5Cmathit%7Bstep%7D+%3A%3A+%28%5Cmathit%7BModel%7D%2C+%5Cmathit%7BRational%7D%29+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cmathit%7BSymbol%7D%2C+%28%5Cmathit%7BModel%7D%2C%5Cmathit%7BRational%7D%29%29+%5C%5C+%5Cmathit%7Bstep%7D%5C%3B%28m%2Cx%29+%3D+%5Cmathit%7BJust%7D%5C%3B%28s%2C+%28%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs%2C+%5Cmathit%7Bscale%7D%5C%3B%28%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs%29%5C%3Bx%29%29+%5C%5C+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3Bs+%3D+%5Cmathit%7BdecodeSym%7D%5C%3Bm%5C%3Bx+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}l} \mathit{decode}_0 :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_0\;m\;x = \mathit{unfoldr}\;\mathit{step}\;(m,x) \vrule width0pt depth2ex \\ \mathit{step} :: (\mathit{Model}, \mathit{Rational}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model},\mathit{Rational})) \\ \mathit{step}\;(m,x) = \mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{scale}\;(\mathit{encodeSym}\;m\;s)\;x)) \\ \quad \mathbf{where}\;s = \mathit{decodeSym}\;m\;x \end{array} " class="latex" title="\displaystyle \begin{array}{@{}l} \mathit{decode}_0 :: \mathit{Model} \rightarrow \mathit{Rational} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_0\;m\;x = \mathit{unfoldr}\;\mathit{step}\;(m,x) \vrule width0pt depth2ex \\ \mathit{step} :: (\mathit{Model}, \mathit{Rational}) \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, (\mathit{Model},\mathit{Rational})) \\ \mathit{step}\;(m,x) = \mathit{Just}\;(s, (\mathit{newModel}\;m\;s, \mathit{scale}\;(\mathit{encodeSym}\;m\;s)\;x)) \\ \quad 
\mathbf{where}\;s = \mathit{decodeSym}\;m\;x \end{array} " />
</p></blockquote>
<p> (Of course, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs+%5Cni+x%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{encodeSym}\;m\;s \ni x}" class="latex" title="{\mathit{encodeSym}\;m\;s \ni x}" />, by the inverse requirement on models, and so the new scaled value is again within the unit interval.)</p>
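<p> The decoder above can be sketched in runnable form as well. As before, a hypothetical static two-symbol model ('a' on [0,1/2), 'b' on [1/2,1)) stands in for an adaptive one, so that the round trip can be exercised directly: </p>

```haskell
import Data.List (unfoldr)
import Data.Ratio ((%))

type Symbol   = Char
type Interval = (Rational, Rational)

-- hypothetical static model: 'a' gets [0,1/2), anything else gets [1/2,1)
type Model = ()

encodeSym :: Model -> Symbol -> Interval
encodeSym () 'a' = (0, 1 % 2)
encodeSym () _   = (1 % 2, 1)

decodeSym :: Model -> Rational -> Symbol
decodeSym () x = if x < 1 % 2 then 'a' else 'b'

newModel :: Model -> Symbol -> Model
newModel m _ = m

unit :: Interval
unit = (0, 1)

weight, scale :: Interval -> Rational -> Rational
weight (l, r) x = l + (r - l) * x
scale  (l, r) x = (x - l) / (r - l)

narrow :: Interval -> Interval -> Interval
narrow i (p, q) = (weight i p, weight i q)

midpoint :: Interval -> Rational
midpoint i = weight i (1 % 2)

encode0 :: Model -> [Symbol] -> Rational
encode0 m ss = midpoint (foldr narrow unit (encodeSyms m ss))
  where encodeSyms _  []        = []
        encodeSyms m' (s : ss') =
          encodeSym m' s : encodeSyms (newModel m' s) ss'

-- decode, scaling by the decoded symbol's interval at each step
decode0 :: Model -> Rational -> [Symbol]
decode0 m x = unfoldr step (m, x)
  where step (m', x') = Just (s, (newModel m' s, scale (encodeSym m' s) x'))
          where s = decodeSym m' x'
```

<p> Decoding is infinite, so only a prefix is meaningful: <code>take 4 (decode0 () (encode0 () "abba"))</code> recovers <code>"abba"</code>. </p>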
<p>
Note that decoding yields an infinite list of symbols; the function <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstep%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{step}}" class="latex" title="{\mathit{step}}" /> is always productive. Nevertheless, that infinite list starts with the encoded text, as we shall now verify. Define the round-trip function </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bround%7D_0%5C%3Bm+%3D+%5Cmathit%7Bdecode%7D_0%5C%3Bm+%5Ccdot+%5Cmathit%7Bencode%7D_0%5C%3Bm+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{round}_0\;m = \mathit{decode}_0\;m \cdot \mathit{encode}_0\;m " class="latex" title="\displaystyle \mathit{round}_0\;m = \mathit{decode}_0\;m \cdot \mathit{encode}_0\;m " />
</p></blockquote>
<p> Then we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dcl%7D+%26+%5Cmathit%7Bround%7D_0%5C%3Bm%5C%3B%28s%3Ass%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmbox%7Bdefinition+of+%7D+%5Cmathit%7Bround%7D_0+%5C%7D+%5C%5C+%26+%5Cmathit%7Bdecode%7D_0%5C%3Bm%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3Bm%5C%3B%28s%3Ass%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bencode%7D_0+%5Cmbox%7B%3B+let+%7D+i+%3D+%5Cmathit%7BencodeSym%7D%5C%3Bm%5C%3Bs%2C+m%27+%3D+%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bs+%5C%7D+%5C%5C+%26+%5Cmathit%7Bdecode%7D_0%5C%3Bm%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3Bm%27%5C%3Bss%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bdecode%7D_0+%5Cmbox%7B%3B+first+decoded+symbol+is+correct%2C+as+above%7D+%5C%7D+%5C%5C+%26+s+%3A+%5Cmathit%7Bdecode%7D_0%5C%3Bm%27%5C%3B%28%5Cmathit%7Bscale%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3Bm%27%5C%3Bss%29%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bscale%7D%5C%3Bi%5C%3B%28%5Cmathit%7Bweight%7D%5C%3Bi%5C%3Bx%29+%3D+x+%5C%7D+%5C%5C+%26+s+%3A+%5Cmathit%7Bdecode%7D_0%5C%3Bm%27%5C%3B%28%5Cmathit%7Bencode%7D_0%5C%3Bm%27%5C%3Bss%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmbox%7Bdefinition+of+%7D+%5Cmathit%7Bround%7D_0+%5C%7D+%5C%5C+%26+s+%3A+%5Cmathit%7Bround%7D_0%5C%3Bm%27%5C%3Bss+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}cl} & \mathit{round}_0\;m\;(s:ss) \\ = & \qquad \{ \mbox{definition of } \mathit{round}_0 \} \\ & \mathit{decode}_0\;m\;(\mathit{encode}_0\;m\;(s:ss)) \\ = & \qquad \{ \mathit{encode}_0 \mbox{; let } i = \mathit{encodeSym}\;m\;s, m' = \mathit{newModel}\;m\;s \} \\ & \mathit{decode}_0\;m\;(\mathit{weight}\;i\;(\mathit{encode}_0\;m'\;ss)) \\ = & \qquad \{ \mathit{decode}_0 \mbox{; first decoded symbol is correct, as above} \} \\ & s : \mathit{decode}_0\;m'\;(\mathit{scale}\;i\;(\mathit{weight}\;i\;(\mathit{encode}_0\;m'\;ss))) \\ = & \qquad \{ 
\mathit{scale}\;i\;(\mathit{weight}\;i\;x) = x \} \\ & s : \mathit{decode}_0\;m'\;(\mathit{encode}_0\;m'\;ss) \\ = & \qquad \{ \mbox{definition of } \mathit{round}_0 \} \\ & s : \mathit{round}_0\;m'\;ss \end{array} " class="latex" title="\displaystyle \begin{array}{@{}cl} & \mathit{round}_0\;m\;(s:ss) \\ = & \qquad \{ \mbox{definition of } \mathit{round}_0 \} \\ & \mathit{decode}_0\;m\;(\mathit{encode}_0\;m\;(s:ss)) \\ = & \qquad \{ \mathit{encode}_0 \mbox{; let } i = \mathit{encodeSym}\;m\;s, m' = \mathit{newModel}\;m\;s \} \\ & \mathit{decode}_0\;m\;(\mathit{weight}\;i\;(\mathit{encode}_0\;m'\;ss)) \\ = & \qquad \{ \mathit{decode}_0 \mbox{; first decoded symbol is correct, as above} \} \\ & s : \mathit{decode}_0\;m'\;(\mathit{scale}\;i\;(\mathit{weight}\;i\;(\mathit{encode}_0\;m'\;ss))) \\ = & \qquad \{ \mathit{scale}\;i\;(\mathit{weight}\;i\;x) = x \} \\ & s : \mathit{decode}_0\;m'\;(\mathit{encode}_0\;m'\;ss) \\ = & \qquad \{ \mbox{definition of } \mathit{round}_0 \} \\ & s : \mathit{round}_0\;m'\;ss \end{array} " />
</p></blockquote>
<p> From this it follows that indeed the round-trip recovers the initial text, in the sense that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bround%7D_0%5C%3Bm%5C%3Bss%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{round}_0\;m\;ss}" class="latex" title="{\mathit{round}_0\;m\;ss}" /> yields an infinite sequence that starts with <img src="https://s0.wp.com/latex.php?latex=%7Bss%7D&bg=ffffff&fg=000000&s=0" alt="{ss}" class="latex" title="{ss}" />; in fact, </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bround%7D_0%5C%3Bm%5C%3Bss+%3D+ss+%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D+%5Cmathit%7Bround%7D_0%5C%3B%28%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bss%29%5C%3B%5B%5C%2C%5D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{round}_0\;m\;ss = ss \mathbin{{+}\!\!\!{+}} \mathit{round}_0\;(\mathit{foldl}\;\mathit{newModel}\;m\;ss)\;[\,] " class="latex" title="\displaystyle \mathit{round}_0\;m\;ss = ss \mathbin{{+}\!\!\!{+}} \mathit{round}_0\;(\mathit{foldl}\;\mathit{newModel}\;m\;ss)\;[\,] " />
</p></blockquote>
<p> yielding the original input followed by some junk, the latter obtained by decoding the fraction <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac+1+2%7D&bg=ffffff&fg=000000&s=0" alt="{\frac 1 2}" class="latex" title="{\frac 1 2}" /> (the encoding of <img src="https://s0.wp.com/latex.php?latex=%7B%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{[\,]}" class="latex" title="{[\,]}" />) from the final model <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7BnewModel%7D%5C%3Bm%5C%3Bss%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}\;\mathit{newModel}\;m\;ss}" class="latex" title="{\mathit{foldl}\;\mathit{newModel}\;m\;ss}" /> that results from adapting the initial model to each symbol in <img src="https://s0.wp.com/latex.php?latex=%7Bss%7D&bg=ffffff&fg=000000&s=0" alt="{ss}" class="latex" title="{ss}" /> in turn. To actually retrieve the input text with no junk suffix, one could transmit the length separately (although that doesn’t sit well with streaming), or append a distinguished end-of-text symbol.</p>
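<p>To see the round-trip property concretely, here is a toy executable model of <code>encode</code>/<code>decode</code> in Python, using exact <code>Fraction</code> arithmetic and a simple adaptive count model. The function and model names are illustrative choices of mine, not the post's actual Haskell definitions; <code>weight</code> and <code>scale</code> correspond to the last lines of <code>encode0</code> and <code>decode0</code> respectively.</p>

```python
from fractions import Fraction as F

def interval_of(model, s):
    # cumulative (lo, hi) interval of symbol s under the current counts
    total = sum(model.values())
    lo = 0
    for sym in sorted(model):
        if sym == s:
            return F(lo, total), F(lo + model[sym], total)
        lo += model[sym]
    raise KeyError(s)

def encode0(model, ss):
    # encode0 m [] = 1/2; encode0 m (s:ss) = weight i (encode0 m' ss)
    if not ss:
        return F(1, 2)
    l, r = interval_of(model, ss[0])
    m2 = dict(model)
    m2[ss[0]] += 1                              # newModel m s
    return l + (r - l) * encode0(m2, ss[1:])    # weight i x

def decode0(model, x, n):
    # decode n symbols from the fraction x, adapting the model as we go
    model = dict(model)
    out = []
    for _ in range(n):
        for sym in sorted(model):
            l, r = interval_of(model, sym)
            if l <= x < r:        # first decoded symbol is correct
                break
        out.append(sym)
        x = (x - l) / (r - l)     # scale i (weight i x) = x
        model[sym] += 1           # newModel m s
    return out
```

<p>With <code>model = {'a': 1, 'b': 1}</code>, <code>decode0(model, encode0(model, list("abba")), 4)</code> recovers <code>['a', 'b', 'b', 'a']</code> exactly; decoding further symbols from the same fraction produces the junk suffix described above.</p>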
<p></p><h2> What’s next </h2>
<p>
So far we have an encoder and a decoder, and a proof that the decoder successfully decodes the encoded text. In the next post, we’ll see how to reimplement both as streaming processes.</p>
Tue, 05 Dec 2017 15:58:56 +0000Joachim Breitner: Finding bugs in Haskell code by proving ithttp://www.joachim-breitner.de/blog/734-Finding_bugs_in_Haskell_code_by_proving_it
http://www.joachim-breitner.de/blog/734-Finding_bugs_in_Haskell_code_by_proving_it
<p>Last week, I wrote a small nifty tool called <a href="https://github.com/nomeata/bisect-binary"><code>bisect-binary</code></a>, which semi-automates answering the question “To what extent can I fill this file up with zeroes and still have it working”. I wrote it in Haskell, and part of the Haskell code, in the <a href="https://github.com/nomeata/bisect-binary/blob/48f9b9f05509a8b0c15c654f790fefd4e0c22676/src/Intervals.hs">Intervals.hs</a> module, is a data structure for “subsets of a file” represented as a sorted list of intervals:</p>
<pre><code>data Interval = I { from :: Offset, to :: Offset }
newtype Intervals = Intervals [Interval]</code></pre>
<p>The code is the kind of Haskell code that I like to write: a small local recursive function, a few guards to do case analysis, and I am done:</p>
<pre><code>intersect :: Intervals -> Intervals -> Intervals
intersect (Intervals is1) (Intervals is2) = Intervals $ go is1 is2
  where
    go _ [] = []
    go [] _ = []
    go (i1:is1) (i2:is2)
        -- reorder for symmetry
        | to i1 < to i2 = go (i2:is2) (i1:is1)
        -- disjoint
        | from i1 >= to i2 = go (i1:is1) is2
        -- subset
        | to i1 == to i2 = I f' (to i2) : go is1 is2
        -- overlapping
        | otherwise = I f' (to i2) : go (i1 { from = to i2} : is1) is2
      where f' = max (from i1) (from i2)</code></pre>
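<p>To make the case analysis concrete, here is a loose Python transliteration of <code>go</code> (my own sketch, not the verified code — note the extra guard that discards an empty piece, anticipating one of the bugs discussed below). Intervals are half-open <code>(from, to)</code> pairs:</p>

```python
def intersect(xs, ys):
    # xs, ys: sorted lists of disjoint, non-empty half-open intervals (f, t)
    out = []

    def go(xs, ys):
        if not xs or not ys:
            return
        (f1, t1), (f2, t2) = xs[0], ys[0]
        if t1 < t2:                       # reorder for symmetry
            go(ys, xs)
        elif f1 >= t2:                    # disjoint: drop the head of ys
            go(xs, ys[1:])
        else:
            f = max(f1, f2)
            if f < t2:                    # guard: keep only non-empty pieces
                out.append((f, t2))
            if t1 == t2:                  # subset
                go(xs[1:], ys[1:])
            else:                         # overlapping
                go([(t2, t1)] + xs[1:], ys[1:])

    go(xs, ys)
    return out
```

<p>For example, <code>intersect([(0, 2), (4, 6)], [(1, 5)])</code> yields <code>[(1, 2), (4, 5)]</code>.</p>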
<p>But clearly, the code is already complicated enough that it is easy to make a mistake. I could have put in some QuickCheck properties to test the code, but I was in a proving mood...</p>
<h3 id="now-available-formal-verification-for-haskell">Now available: Formal Verification for Haskell</h3>
<p>Ten months ago I complained that there was <a href="http://www.joachim-breitner.de/blog/717-Why_prove_programs_equivalent_when_your_compiler_can_do_that_for_you_">no good way to verify Haskell code</a> (and created the nifty hack <a href="https://github.com/nomeata/ghc-proofs"><code>ghc-proofs</code></a>). But things have changed since then, as a group at UPenn (mostly Antal Spector-Zabusky, Stephanie Weirich and myself) has created <a href="https://github.com/antalsz/hs-to-coq"><code>hs-to-coq</code></a>: a translator from Haskell to the theorem prover Coq.</p>
<p>We have used <code>hs-to-coq</code> on various examples, as described in our <a href="https://arxiv.org/abs/1711.09286">CPP'18 paper</a>, but it is high time to use it for real. The easiest way to use <code>hs-to-coq</code> at the moment is to clone the repository, copy one of the example directories (e.g. <code>examples/successors</code>), place the Haskell file to be verified there and put the right module name into the <code>Makefile</code>. I also commented out parts of the Haskell file that would drag in non-base dependencies.</p>
<h3 id="massaging-the-translation">Massaging the translation</h3>
<p>Often, <code>hs-to-coq</code> translates Haskell code without a hitch, but sometimes, a bit of help is needed. In this case, I had to specify <a href="https://github.com/antalsz/hs-to-coq/blob/8f84d61093b7be36190142c795d6cd4496ef5aed/examples/intervals/edits">three so-called <em>edits</em></a>:</p>
<ul>
<li><p>The Haskell code uses <code>Intervals</code> both as a name for a type and for a value (the constructor). This is fine in Haskell, which has separate value and type namespaces, but not for Coq. The line</p>
<pre><code>rename value Intervals.Intervals = ival</code></pre>
<p>changes the constructor name to <code>ival</code>.</p></li>
<li><p>I use the <code>Int64</code> type in the Haskell code. The Coq version of Haskell’s base library that comes with <code>hs-to-coq</code> does not support that yet, so I change that via</p>
<pre><code>rename type GHC.Int.Int64 = GHC.Num.Int</code></pre>
<p>to the normal <code>Int</code> type, which itself is mapped to <a href="https://coq.inria.fr/library/Coq.Numbers.BinNums.html">Coq’s <code>Z</code> type</a>. This is not a perfect fit, and my verification would not catch problems that arise due to the boundedness of <code>Int64</code>. Since none of my code does arithmetic, only comparisons, I am fine with that.</p></li>
<li><p>The biggest hurdle is the recursion of the local <code>go</code> functions. Coq requires all recursive functions to be obviously (i.e. structurally) terminating, and the <code>go</code> above is not. For example, in the first case, the arguments to <code>go</code> are simply swapped. It is very much not obvious why this is not an infinite loop.</p>
<p>I can specify a termination measure, i.e. a function that takes the arguments <code>xs</code> and <code>ys</code> and returns a “size” of type <code>nat</code> that decreases in every call: add the lengths of <code>xs</code> and <code>ys</code>, multiply by two and add one if the first interval in <code>xs</code> ends before the first interval in <code>ys</code>.</p>
<p>If the problematic function were a top-level function I could tell <code>hs-to-coq</code> about this termination measure and it would use this information to define the function using <code>Program Fixpoint</code>.</p>
<p>Unfortunately, <code>go</code> is a local function, so this mechanism is not available to us. If I care more about the verification than about preserving the exact Haskell code, I could easily change the Haskell code to make <code>go</code> a top-level function, but in this case I did not want to change the Haskell code.</p>
<p>Another way out offered by <code>hs-to-coq</code> is to translate the recursive function using an axiom <code>unsafeFix : forall a, (a -> a) -> a</code>. This looks scary, but as I explain in the previous blog post, <a href="http://www.joachim-breitner.de/blog/733-Existence_and_Termination">this axiom can be used in a safe way</a>.</p>
<p>I should point out that it is my dissenting opinion to consider this a valid verification approach. The official stand of the <code>hs-to-coq</code> author team is that using <code>unsafeFix</code> in the verification can only be a temporary state, and eventually you’d be expected to fix (heh) this, for example by moving the functions to the top-level and using <code>hs-to-coq</code>’s support for <code>Program Fixpoint</code>.</p></li>
</ul>
<p>With these edits in place, <code>hs-to-coq</code> spits out a faithful Coq copy of my Haskell code.</p>
<h3 id="time-to-prove-things">Time to prove things</h3>
<p>The rest of the work is mostly straightforward use of Coq. I define the invariant I expect to hold for these lists of intervals, namely that they are sorted, non-empty, disjoint and non-adjacent:</p>
<pre><code>Fixpoint goodLIs (is : list Interval) (lb : Z) : Prop :=
  match is with
    | [] => True
    | (I f t :: is) => (lb <= f)%Z /\ (f < t)%Z /\ goodLIs is t
  end.

Definition good is := match is with
  ival is => exists n, goodLIs is n end.</code></pre>
<p>and I give them meaning as Coq type for sets, <a href="https://coq.inria.fr/library/Coq.Sets.Ensembles.html"><code>Ensemble</code></a>:</p>
<pre><code>Definition range (f t : Z) : Ensemble Z :=
  (fun z => (f <= z)%Z /\ (z < t)%Z).

Definition semI (i : Interval) : Ensemble Z :=
  match i with I f t => range f t end.

Fixpoint semLIs (is : list Interval) : Ensemble Z :=
  match is with
    | [] => Empty_set Z
    | (i :: is) => Union Z (semI i) (semLIs is)
  end.

Definition sem is := match is with
  ival is => semLIs is end.</code></pre>
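<p>The invariant is also easy to state as an executable check. This hedged Python mirror of <code>goodLIs</code> (my names, threading the lower bound <code>lb</code> through the list exactly as the Coq version does) is handy for checking examples by hand:</p>

```python
def good_lis(ivals, lb):
    # mirrors goodLIs: every interval (f, t) must satisfy lb <= f < t,
    # and the next interval is bounded below by this one's end
    for f, t in ivals:
        if not (lb <= f < t):
            return False
        lb = t
    return True
```

<p>For instance, <code>good_lis([(0, 2), (3, 5)], 0)</code> holds, while an unsorted or empty interval fails the check.</p>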
<p>Now I prove for every function that it preserves the invariant and that it corresponds to the, well, corresponding function, e.g.:</p>
<pre><code>Lemma intersect_good : forall (is1 is2 : Intervals),
  good is1 -> good is2 -> good (intersect is1 is2).
Proof. … Qed.

Lemma intersection_spec : forall (is1 is2 : Intervals),
  good is1 -> good is2 ->
  sem (intersect is1 is2) = Intersection Z (sem is1) (sem is2).
Proof. … Qed.</code></pre>
<p>Even though I punted on the question of termination while defining the functions, I do not get around that while verifying this, so I formalize the termination argument above</p>
<pre><code>Definition needs_reorder (is1 is2 : list Interval) : bool :=
  match is1, is2 with
    | (I f1 t1 :: _), (I f2 t2 :: _) => (t1 <? t2)%Z
    | _, _ => false
  end.

Definition size2 (is1 is2 : list Interval) : nat :=
  (if needs_reorder is1 is2 then 1 else 0) + 2 * length is1 + 2 * length is2.</code></pre>
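<p>The measure can be sanity-checked numerically. In this hedged Python mirror (my names, not part of the proof), a swap in the reorder case leaves both lengths unchanged but clears the reorder bit, so <code>size2</code> drops by exactly one, and dropping an interval decreases it by at least one:</p>

```python
def needs_reorder(xs, ys):
    # xs, ys: lists of (f, t) pairs; true iff the first interval of xs
    # ends before the first interval of ys
    return bool(xs and ys and xs[0][1] < ys[0][1])

def size2(xs, ys):
    return (1 if needs_reorder(xs, ys) else 0) + 2 * len(xs) + 2 * len(ys)

xs, ys = [(0, 2)], [(1, 5)]
assert needs_reorder(xs, ys)
assert size2(ys, xs) == size2(xs, ys) - 1   # reorder case strictly decreases
assert size2(xs, ys[1:]) < size2(xs, ys)    # dropping an interval decreases
```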
<p>and use it in my inductive proofs.</p>
<p>As I intend this to be a write-once proof, I happily copy’n’pasted proof scripts and did not do any cleanup. Thus, the <a href="https://github.com/antalsz/hs-to-coq/blob/8f84d61093b7be36190142c795d6cd4496ef5aed/examples/intervals/Proofs.v">resulting Proof file</a> is big, ugly and repetitive. I am confident that judicious use of Coq tactics could greatly condense this proof.</p>
<h3 id="using-program-fixpoint-after-the-fact">Using Program Fixpoint after the fact?</h3>
<p>These proofs are also an experiment in how I can actually do induction over a locally defined recursive function without too ugly proof goals (hence the line <code>match goal with [ |- context [unsafeFix ?f _ _] ] => set (u := f) end.</code>). One could improve upon this approach by following these steps:</p>
<ol style="">
<li><p>Define copies (say, <code>intersect_go_witness</code>) of the local <code>go</code> using <code>Program Fixpoint</code> with the above termination measure. The termination argument needs to be made only once, here.</p></li>
<li><p>Use this function to prove that the argument <code>f</code> in <code>go = unsafeFix f</code> actually has a fixed point:</p>
<pre><code>Lemma intersect_go_sound:
  f intersect_go_witness = intersect_go_witness</code></pre>
<p>(This requires functional extensionality.) This lemma indicates that my use of the axioms <code>unsafeFix</code> and <code>unsafeFix_eq</code> is actually sound, as discussed in the previous blog post.</p></li>
<li><p>Still prove the desired properties for the <code>go</code> that uses <code>unsafeFix</code>, as before, but using the <a href="https://coq.inria.fr/refman/schemes.html#sec655">functional induction scheme</a> for <code>intersect_go</code>! This way, the actual proofs are free from any noisy termination arguments.</p>
<p>(The trick to define a recursive function just to throw away the function and only use its induction rule is one I learned in Isabelle, and is very useful to separate the meat from the red tape in complex proofs. Note that the induction rule for a function does not actually mention the function!)</p></li>
</ol>
<p>Maybe I will get to this later.</p>
<p><strong>Update:</strong> I experimented a bit in that direction, and it does not quite work as expected. In step 2 I am stuck because <code>Program Fixpoint</code> does not create a fixpoint-unrolling lemma, and in step 3 I do not get the induction scheme that I was hoping for. Both problems <a href="https://stackoverflow.com/a/46995609/946226">would not exist if I used the <code>Function</code> command</a>, although that needs some trickery to support a termination measure on multiple arguments. The induction lemma is not quite as polished as I was hoping for, so <a href="https://github.com/antalsz/hs-to-coq/blob/b7efc7a8dbacca384596fc0caf65e62e87ef2768/examples/intervals/Proofs_Function.v">the resulting proof</a> is still somewhat ugly, and it requires copying code, which does not scale well.</p>
<h3 id="efforts-and-gains">Efforts and gains</h3>
<p>I spent exactly 7 hours working on these proofs, according to <a href="http://arbtt.nomeata.de/"><code>arbtt</code></a>. I am sure that writing these functions took me much less time, but I cannot calculate that easily, as they were originally in the <code>Main.hs</code> file of <code>bisect-binary</code>.</p>
<p>I did <a href="https://github.com/nomeata/bisect-binary/commit/48f9b9f05509a8b0c15c654f790fefd4e0c22676#diff-38999f20f11fe6a93fa194587e8ad507">find and fix three bugs</a>:</p>
<ul>
<li>The <code>intersect</code> function would not always retain the invariant that the intervals would be non-empty.</li>
<li>The <code>subtract</code> function would prematurely advance through the list of intervals in the second argument, which can lead to a genuinely wrong result. (This occurred twice.)</li>
</ul>
<p><strong>Conclusion:</strong> Verification of Haskell code using Coq is now practically possible!</p>
<p><strong>Final rant:</strong> Why is the Coq standard library so incomplete (compared to, say, Isabelle’s) and requires me to prove <a href="https://github.com/antalsz/hs-to-coq/blob/8f84d61093b7be36190142c795d6cd4496ef5aed/examples/intervals/Ensemble_facts.v">so many lemmas about basic functions on <code>Ensembles</code></a>?</p>Tue, 05 Dec 2017 14:17:43 +0000mail@joachim-breitner.de (Joachim Breitner)Roman Cheplyaka: Introduction to golden testinghttp://ro-che.info//articles/2017-12-04-golden-tests.html
http://ro-che.info/articles/2017-12-04-golden-tests
<p>Golden tests are like unit tests, except the expected output is stored in a separate file. I learned about them in 2010 from Max Grigorev at <a href="https://wiki.haskell.org/ZuriHac2010">ZuriHac</a>.</p>
<p>Let’s say you want to test Python’s <code>json</code> module. One way to do that would be to encode an object and compare the result to a reference string:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> json
<span class="cf">assert</span>(json.dumps([<span class="dv">1</span>,<span class="dv">2</span>,<span class="dv">3</span>]) <span class="op">==</span> <span class="st">"[1, 2, 3]"</span>)</code></pre></div>
<p>Alternatively, you could create a file with contents</p>
<pre><code>[1, 2, 3]</code></pre>
<p>and read it to know the expected output:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="im">import</span> json
<span class="cf">with</span> <span class="bu">open</span>(<span class="st">"example1.json"</span>, <span class="st">"r"</span>) <span class="im">as</span> ex1_file:
    ex1 <span class="op">=</span> ex1_file.read().rstrip()
<span class="cf">assert</span>(json.dumps([<span class="dv">1</span>,<span class="dv">2</span>,<span class="dv">3</span>]) <span class="op">==</span> ex1)</code></pre></div>
<p>The file <code>example1.json</code> is called a <em>golden</em> file.</p>
<p>Here are some advantages of golden tests over ordinary unit tests:</p>
<ol type="1">
<li>If the expected output is <strong>large in size</strong>, it may be impractical to put it inside the source code.</li>
<li>No need to <strong>escape quotes or binary data</strong> in the expected output.</li>
<li><p>When you add a <strong>new test</strong>, your testing framework can <strong>generate the missing golden file</strong> from the current output of the function.</p>
<p>It is best if you can write down the expected output without looking at the actual output, but it is not always possible. The output may be too big to type character by character, or it may be hard to predict. For instance, in the json example, you couldn’t tell in advance whether there would be spaces between array elements or not. So often what you do is launch an interactive interpreter (if your language of choice even has one), run the function, and then copy-paste its output into the test code.</p>
<p>This process can be easily automated if you use golden files.</p></li>
<li><p>The expected output can be <strong>automatically updated</strong>.</p>
<p>Say you changed your json module to replace some of the spaces with newlines to make the output more aesthetically pleasing. You have 40 test cases that need updating. Can you imagine doing this by hand?</p>
<p>With golden tests, you can tell your test framework to update all golden files from the current outputs, then check <code>git diff</code> to ensure that all changes are valid, and commit them.</p></li>
<li><p>If some of your tests suddenly started failing, <strong>you can use <code>diff</code></strong> or other such tools to compare the golden file to the actual file and figure out what exactly changed. Perhaps your testing framework could even show the diff automatically on test failure?</p></li>
</ol>
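<p>Points 3 and 4 boil down to a small amount of framework support. Here is a minimal sketch in Python (illustrative only, not how any real framework is implemented): compare against the golden file, create it when it is missing, and overwrite it when the user asks to accept the new output:</p>

```python
import os

def check_golden(path, actual, accept=False):
    """Return True iff `actual` matches the golden file at `path`.

    A missing golden file is created from `actual` (point 3);
    accept=True overwrites it unconditionally (point 4)."""
    if accept or not os.path.exists(path):
        with open(path, "w") as f:
            f.write(actual)
        return True
    with open(path) as f:
        return f.read() == actual
```

<p>After an accepted update, <code>git diff</code> on the tracked golden files shows exactly what changed, which is the workflow described for tasty-golden below.</p>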
<p>While advantages 1-2 are automatic, 3-5 require special support from your testing framework. The rest of this article will be focused on a Haskell testing framework <a href="https://github.com/feuerbach/tasty">tasty</a> and its add-on package for golden tests, <a href="https://github.com/feuerbach/tasty-golden">tasty-golden</a>.</p>
<h2 id="basic-usage">Basic usage</h2>
<p>To illustrate how tasty-golden works, consider this yaml-to-json conversion module:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">{-# LANGUAGE TypeApplications #-}</span>
<span class="kw">module</span> <span class="dt">YamlToJson</span> <span class="kw">where</span>
<span class="kw">import qualified</span> <span class="dt">Data.Yaml</span> <span class="kw">as</span> <span class="dt">Y</span>
<span class="kw">import </span><span class="dt">Data.Aeson</span> <span class="kw">as</span> <span class="dt">J</span>
<span class="kw">import qualified</span> <span class="dt">Data.ByteString.Lazy</span> <span class="kw">as</span> <span class="dt">LBS</span>
<span class="ot">yamlToJson ::</span> <span class="dt">LBS.ByteString</span> <span class="ot">-></span> <span class="dt">LBS.ByteString</span>
yamlToJson <span class="fu">=</span> J.encode <span class="fu">.</span> Y.decode <span class="fu">@</span><span class="dt">Value</span> <span class="fu">.</span> LBS.toStrict</code></pre></div>
<p>Because JSON contains quotes and YAML spans multiple lines, it is not very practical to store them as string literals in the source code file. Instead, you will keep them both in files.</p>
<p>Note that the name “golden file” only refers to the file containing the <em>output</em>, not the <em>input</em>. There is no requirement that the input is stored in a file or that there even is any “input” at all; but in practice it is often convenient to store them both in files so that there is an input file for every output file and vice versa.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">import </span><span class="dt">Test.Tasty</span> (defaultMain, <span class="dt">TestTree</span>, testGroup)
<span class="kw">import </span><span class="dt">Test.Tasty.Golden</span> (goldenVsString, findByExtension)
<span class="kw">import qualified</span> <span class="dt">Data.ByteString.Lazy</span> <span class="kw">as</span> <span class="dt">LBS</span>
<span class="kw">import </span><span class="dt">YamlToJson</span> (yamlToJson)
<span class="kw">import </span><span class="dt">System.FilePath</span> (takeBaseName, replaceExtension)
<span class="ot">main ::</span> <span class="dt">IO</span> ()
main <span class="fu">=</span> defaultMain <span class="fu">=<<</span> goldenTests
<span class="ot">goldenTests ::</span> <span class="dt">IO</span> <span class="dt">TestTree</span>
goldenTests <span class="fu">=</span> <span class="kw">do</span>
  yamlFiles <span class="ot"><-</span> findByExtension [<span class="st">".yaml"</span>] <span class="st">"."</span>
  return <span class="fu">$</span> testGroup <span class="st">"YamlToJson golden tests"</span>
    [ goldenVsString
        (takeBaseName yamlFile) <span class="co">-- test name</span>
        jsonFile <span class="co">-- golden file path</span>
        (yamlToJson <span class="fu"><$></span> LBS.readFile yamlFile) <span class="co">-- action whose result is tested</span>
    <span class="fu">|</span> yamlFile <span class="ot"><-</span> yamlFiles
    , <span class="kw">let</span> jsonFile <span class="fu">=</span> replaceExtension yamlFile <span class="st">".json"</span>
    ]</code></pre></div>
<p>This is all the code you need to support one, two, or a thousand test cases. When run, this code will:</p>
<ol type="1">
<li>find all <code>.yaml</code> files in the current directory</li>
<li>for each <code>.yaml</code> file, construct a golden test that evaluates <code>yamlToJson</code> on the input read from the file and compares the result to the golden file, which has the same base name and the <code>.json</code> extension</li>
<li>put all individual tests in a test group and pass it to <code>defaultMain</code> for execution</li>
</ol>
<p>To see how this works in practice, create an input file, <code>fruits.yaml</code>, with the following contents:</p>
<div class="sourceCode"><pre class="sourceCode yaml"><code class="sourceCode yaml"><span class="kw">-</span> orange
<span class="kw">-</span> apple
<span class="kw">-</span> banana</code></pre></div>
<p>Now run your test suite (note: in a proper cabalized project, you’d run <code>cabal test</code> or <code>stack test</code> instead):</p>
<pre><code>% stack runghc test.hs
YamlToJson golden tests
fruits: OK
Golden file did not exist; created
All 1 tests passed (0.00s)</code></pre>
<p>tasty-golden realized that this is a new test case because the golden file was absent, so it went ahead and initialized the golden file based on the function’s output. You can now examine the file to see if it makes sense:</p>
<pre><code>% cat fruits.json
["orange","apple","banana"]</code></pre>
<p>If you are happy with it, check in both input and output files to git. This is important so that your collaborators can run the tests, but it also helps when dealing with failing tests, as you’ll see next.</p>
<pre><code>% git add fruits.yaml fruits.json && git commit -m "fruits test case"</code></pre>
<h2 id="dealing-with-test-failures">Dealing with test failures</h2>
<p>Occasionally, your tests will fail. A test that cannot fail is a useless test.</p>
<p>A golden test fails when the actual output does not match the contents of the golden file. You then need to figure out whether this is a bug or an intentional code change.</p>
<p>Let’s say you decide that the output of <code>yamlToJson</code> should end with a newline.</p>
<p>The new function definition is</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">yamlToJson <span class="fu">=</span> (<span class="fu"><></span> <span class="st">"\n"</span>) <span class="fu">.</span> J.encode <span class="fu">.</span> Y.decode <span class="fu">@</span><span class="dt">Value</span> <span class="fu">.</span> LBS.toStrict</code></pre></div>
<p>Now run the test suite:</p>
<pre><code>% stack runghc test.hs
YamlToJson golden tests
fruits: FAIL
Test output was different from './fruits.json'. It was: "[\"orange\",\"apple\",\"banana\"]\n"
1 out of 1 tests failed (0.00s)</code></pre>
<p>Ok, this is not very helpful. There are two main ways to get better diagnostics. One is to use the <code>goldenVsStringDiff</code> function as an alternative to <code>goldenVsString</code>. This will include the diff right in the tasty output.</p>
<p>But my preferred workflow is to use git for this. First, rerun the tests and pass the <code>--accept</code> option. This will update the golden files with the new output:</p>
<pre><code>% stack runghc -- test.hs --accept
YamlToJson golden tests
fruits: OK
Accepted the new version
All 1 tests passed (0.00s)</code></pre>
<p>Now, because your golden file is tracked by git, you can examine the differences between the old and new golden files with <code>git diff</code>:</p>
<pre><code>% git diff
diff --git fruits.json fruits.json
index c244c0a..ed447d4 100644
--- fruits.json
+++ fruits.json
@@ -1 +1 @@
-["orange","apple","banana"]
\ No newline at end of file
+["orange","apple","banana"]</code></pre>
<p>Because this is the change you expected, you can now commit the updated file to git.</p>
<p>This workflow lets you use all the powerful <code>git diff</code> options like <code>--color-words</code>, or even launch a graphical diff tool like kdiff3 with <code>git difftool</code>.</p>
<h2 id="see-also">See also</h2>
<p><a href="https://kseo.github.io/posts/2016-12-15-golden-tests-are-tasty.html">Golden tests are tasty</a> by Kwang Yul Seo</p>Mon, 04 Dec 2017 20:00:00 +0000Manuel M T Chakravarty: Here is the video of my Functional Conf 2017 talk Haskell...http://justtesting.org/post/168174539581
http://justtesting.org/post/168174539581
<iframe allow="encrypted-media" allowfullscreen="allowfullscreen" frameborder="0" gesture="media" height="225" id="youtube_iframe" src="https://www.youtube.com/embed/kd8mlbN0Mws?feature=oembed&amp;enablejsapi=1&amp;origin=http://safe.txmblr.com&amp;wmode=opaque" width="400"></iframe><br /><br /><p>Here is the video of my <a href="https://functionalconf.com">Functional Conf 2017</a> talk <a href="https://functionalconf.com/proposal.html?id=3939">Haskell SpriteKit — a Purely Functional API for a Stateful Animation System and Physics Engine</a>. In this talk, I am explaining how to wrap an OOish game engine API based on a mutable scene graph into a purely functional API based on an immutable algebraic data type.</p>Mon, 04 Dec 2017 05:23:31 +0000Mark Jason Dominus: Slaughter electric needle injectortag:,2017:/tech/slaughter-electric-needle-injector
https://blog.plover.com/tech/slaughter-electric-needle-injector.html
<p><img src="https://pic.blog.plover.com/tech/slaughter-electric-needle-injector/cake-sm.jpg" align="right" /></p>
<p>[ This article appeared yesterday on <a href="https://shitpost.plover.com/"><code>Content-type:
text/shitpost</code></a> but I decided later
there was nothing wrong with it, so I have moved it here. Apologies
if you are reading it twice. ]</p>
<p>At the end of the game <em>Portal</em>, one of the AI cores you must destroy
starts reciting <a href="http://half-life.wikia.com/wiki/Cake">GLaDOS's cake
recipe</a>. Like GLaDOS herself,
it starts reasonably enough, and then goes wildly off the rails. One
of the more memorable ingredients from the end of the list is
“slaughter electric needle injector”.</p>
<p>I looked into this a bit and I learned that there really is a
slaughter electric needle injector. It is not nearly as ominous as it
sounds. The needles themselves are not electric, and it has nothing to
do with slaughter. Rather, it is a handheld electric-powered needle
injector tool that happens to be manufactured by the <a href="http://www.slaughtercoinc.com/">Slaughter
Instrument Company, Inc</a>, founded more
than a hundred years ago by Mr. George Slaughter.</p>
<p align="center"><a href="https://pic.blog.plover.com/tech/slaughter-electric-needle-injector/needle-injector.jpg"><img src="https://pic.blog.plover.com/tech/slaughter-electric-needle-injector/needle-injector-th.jpg" border="0" /></a></p>
<p>Slaughter Co. manufactures tools for morticians and embalmers
preparing bodies for burial. The <a href="http://www.mcssl.com/store/theslaughterinstrumentcompanyinc/needle-injectors/703500-electric-needle-injector">electric needle
injector</a>
is one such tool; they also manufacture a <a href="http://www.mcssl.com/store/theslaughterinstrumentcompanyinc/needle-injectors/703555-cordless-needle-injector">cordless electric needle
injector</a>,
mentioned later as part of the same cake recipe.
<br clear="all" /></p>
<p><img src="https://pic.blog.plover.com/tech/slaughter-electric-needle-injector/injector-needles.jpg" align="right" /></p>
<p>The needles themselves are quite benign. They are small, with
delicate six-inch brass wires attached, and cost about twenty-five
cents each. The needles and the injector are used for securing a
corpse's mouth so that it doesn't yawn open during the funeral. One
needle is injected into the upper jaw and one into the lower, and then
the wires are twisted together, holding the mouth shut. The mortician
clips off the excess wire and tucks the ends into the mouth. Only two
needles are needed per mouth.</p>
<p>There are a number of explanatory videos on YouTube, but I was not
able to find any actual demonstrations.</p>Fri, 01 Dec 2017 16:10:00 +0000mjd@plover.com (Mark Dominus)FP Complete: NAT Gateways in Amazon GovCloudhttps://www.fpcomplete.com/blog/nat-gateways-in-amazon-govcloud
https://www.fpcomplete.com/blog/nat-gateways-in-amazon-govcloud
<div class="hs-featured-image-wrapper">
<a href="https://www.fpcomplete.com/blog/nat-gateways-in-amazon-govcloud" class="hs-featured-image-link" title=""> <img src="https://www.fpcomplete.com/hubfs/Blog/govcloud.png?t=1513366076380" alt="govcloud.png" style="width: auto !important; float: left; margin: 0 15px 15px 0;" class="hs-featured-image" /> </a>
</div>
<h2><span style="font-weight: 400;">NAT Gateways in Amazon GovCloud</span></h2>
<p><span style="font-weight: 400;">So you’re deploying your government-sensitive data and services on <a href="http://feeds.feedburner.com/blog/intro-to-devops-on-govcloud">GovCloud</a>, or planning to, and you want your data protected against third-party access, so you configure your subnets as private resources, without internet access. In other AWS regions, you could then add a managed NAT Gateway, and instances would, once configured, have egress available for internet access. This lets them update their software and run smoothly, pulling in necessary external information.</span></p>
Thu, 30 Nov 2017 22:25:08 +0000yghor@fpcomplete.com (Yghor Kerscher)Douglas M. Auclair (geophf): November 2017 1HaskellADay problems and solutionstag:blogger.com,1999:blog-4650294074444534066.post-5982168440952577214
http://logicaltypes.blogspot.com/2017/11/november-2017-1haskelladay-problems-and.html
<ul><li>November 30th, 2017: For Thursday's #haskell problem we've published our articles to the PostgreSQL database, now let's <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D30/Exercise.hs">extract them as JSON</a>. Today's #haskell solution is a simple SQL query and applying some work from previous exercise <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D30/Solution.hs">to fetch and JSONify recommended articles to print</a>! </li><li>November 29th, 2017: For Wednesday's #haskell problem, now that we've selected the recommended articles, let's <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D29/Exercise.hs">save that for review</a> to the PostgreSQL database. For today's #haskell solution, we delete the old recommendations-to-be-published, we insert the new set. <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D29/Solution.hs">Voilà</a>! </li><li>November 28th, 2017: Monday we added articles from a PostgreSQL database using #haskell; Tuesday we <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D28/Exercise.hs">delete articles</a>. WHEEEEE! Today's #haskell solution shows that deleting article recommendations is very much like adding recommendations, ... in reverse. <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D28/Solution.hs">WHODATHUNK</a>! </li><li>November 27th, 2017: Monday's problem: "<a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D27/Exercise.hs">It's not rocket science!</a>" Adding articles to a recommendation set in PostgreSQL with #haskell. A little bit of #haskell; a little bit of #PHP and we have an <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D27/Solution.hs">article-adder-webservice ... thingie</a>! 
</li><li>November 24th, 2017: Friday's #haskell problem: PostgreSQL database, JSON, Haskell: <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D24/Exercise.hs">you've got yourself a webservice</a>. Friday's #haskell solution uses the <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D24/Solution.hs">Brief-structure, this time to show article recommendations</a>.</li><li>November 23rd, 2017: For Thursday, a fun little #haskell <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D23/Exercise.hs">JSON-y exercise</a> on Thanksgiving Day from the USA to you! Y'know, Nietzsche says: "<a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D23/Solution.hs">Out of chaos comes JSON</a>." ... no ... wait. </li><li>November 22nd, 2017: Wednesday's #haskell problem: given a NYT article index, <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D22/Exercise.hs">extract the article full text from PostgreSQL</a>. Simple, eh? ... 'maybe.' <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D22/Solution.hs">Full text of NYT articles archived in PostgreSQL as JSON</a>. </li><li>November 21st, 2017: Tuesday's #haskell problem we slim down the article JSON returned from yesterday by <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D21/Exercise.hs">providing article briefs</a>. Today's #haskell solution goes from a set of recommendations <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D21/Solution.hs">from NYT articles to briefs</a>. </li><li>November 20th, 2017: Monday's #haskell problem moves <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D20/Exercise.hs">keyword/key-phrase-article retrieval and matching</a> to the PostgreSQL database. 
Today's #haskell solution: we build our <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D20/Solution.hs">keyword-key-phrase dictionary from SQL</a> and filter articles indexing from those keywords. </li><li>November 17th, 2017: Thursday we linked keywords to key-phrases; for Friday's #haskell problem, we'll <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D17/Exercise.hs">upload that linkage information to a PostgreSQL data store</a>. Today's #haskell solution <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D17/Solution.hs">stores 26,000+ keywords</a>, with almost 240,000 cross-references into nearly 10,000 NYT articles. Whoa. </li><li>November 16th, 2017: Thursday's #haskell problem: <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D16/Exercise.hs">provide a mapping from unique keywords to key-phrases</a> containing them. <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D16/Solution.hs">Mapping keywords to key-phrases</a>, it's what we do for today's #haskell solution. </li><li>November 15th, 2017: Wednesday's #haskell problem is to <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D15/Exercise.hs">parse a CSV file of articles to upload to an article MongoDB</a>. Today's #haskell solution has <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D15/Solution.hs">a little bit of CSVing and a little bit of JSONification to upload article information into MongoDB</a>. </li><li>November 14th, 2017: Tuesday's #haskell exercise is a parsing exercise of a different sort: <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D14/Exercise.hs">parsing CSV, but with embedded quote within columns</a>! Ooh! 
We now can <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D14/Solution.hs">parse CSV files with embedded quotes</a>. YES! Adding that to the old CSV parsing library. </li><li>November 13th, 2017: Indexing articles by keyword and then <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D13/Exercise.hs">searching articles by keyword</a> for Monday's #haskell problem. For today's #haskell solution we <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D13/Solution.hs">intersect sets of articles to do fast keyword searches</a>. </li><li>November 10th, 2017: Friday's #haskell exercise is to <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D10/Exercise.hs">load the keywords and recommended articles into the PostgreSQL database</a>. Thanks to today's #haskell solution <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D10/Solution.hs">we have articles and keyphrases indexed by keyword</a>. </li><li>November 9th, 2017: Thursday's #haskell problem <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D09/Exercise.hs">ties NYT article data stored in PostgreSQL together with the recommended articles from JSON</a>. Today's #haskell solution is <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D09/Solution.hs">reading NYT article recommendations from a PostgreSQL database</a>. </li><li>November 8th, 2017: Wednesday's #haskell problem is to <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D08/Exercise.hs">combine article recommendations with their key-phrases</a> to output as JSON. Wednesday's #haskell solution: <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D08/Solution.hs">JSON: GET</a>. (p.s. 
I love Data.Aeson.Encode.Pretty)</li><li>November 7th, 2017: Tuesday's #haskell problem is to<a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D07/Exercise.hs"> parse JSON of NYT articles and their metadata</a>. Today's #haskell solution: we have <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D07/Solution.hs">articles stored as JSON, and, voilà! we materialize those articles!</a> </li><li>November 6th, 2017: Monday's #haskell exercise is to <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D06/Exercise.hs">parse a CSV file of recommended articles</a> from the NYT archive. Today's #haskell solution <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D06/Solution.hs">uses the reads-function to guide the parsing</a> of values from a CSV file.</li><li>November 3rd, 2017: Today is Friday! YAY! Do you know what that means? It's <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D03/Exercise.hs">PARSING DAY</a> is #haskell-land! YAY! ... no ... wait. What? TIL that <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D03/Solution.hs">parsing keywords and parsing LISTS of keywords</a> can be very different things for today's #haskell solution.</li><li>November 2nd, 2017: Today's #haskell problem: <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D02/Exercise.hs">'Hello, world' in Haskell</a>. Snark on the side FO' FREE! I don't recall Eliza being this snarky, but today's #haskell solution <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D02/Solution.hs">says differently</a>. 
</li><li>November 1st, 2017: Wednesday #haskell problem is <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D01/Exercise.hs">scan a new archive</a> and update the special character file or correct the new archive. The #haskell solution to replacing special characters... <a href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M11/D01/Solution.hs">WHAT special characters? They GONE!</a> #ETL</li></ul>Thu, 30 Nov 2017 21:58:28 +0000noreply@blogger.com (geophf)Tweag I/O: Making two garbage collectors be good neighbours <br/> (using linear types)http://www.tweag.io/posts/2017-11-29-linear-jvm.html
http://www.tweag.io/posts/2017-11-29-linear-jvm.html
<div>Facundo Domínguez and Mathieu Boespflug</div><p>Foreign function interfaces (FFI) allow fast interop between
languages. Unlike other approaches, like performing RPC calls between
different components written in different languages, using the FFI
allows for all manner of data to be shared between each language
runtime, in the same address space. This reduces memory consumption
and obviates marshalling costs. But when two garbage-collected
languages share references to the same values, each garbage collector
(GC) needs to be careful to not collect these values while the other
language has references to them. This is a problem we ran into when
building both <a href="https://www.stackage.org/package/inline-r">inline-r</a> and <a href="https://www.stackage.org/package/inline-java">inline-java</a>. In
this post, we'll survey this very generic problem in all fast language
interop, using Java interop as a case study.</p>
<p>Bonus: we'll show you how linear types can help solve the problem
safely.</p>
<h2>Unsafe bindings to Java</h2>
<p>The Java Virtual Machine (JVM) offers a foreign interface to
manipulate Java objects, known as the Java Native Interface (JNI).
This is a C interface, which we can readily bind in Haskell
using <a href="https://www.stackage.org/package/inline-c">inline-c</a> or similar. This is what the <a href="https://www.stackage.org/package/jni">jni</a>
package does.</p>
<p>The JNI is a low-level interface that is painful to use. No programmer
wants to invoke Java methods through the JNI using stringly typed
class names, method names and argument types. Doing so is very
error-prone and verbose. So we built higher-level abstractions on
top, <a href="https://www.stackage.org/package/jvm">jvm</a> and <a href="https://www.stackage.org/package/inline-java">inline-java</a>, that run every method
invocation through the Java type checker as well as the Haskell type
checker. Think of <code>inline-java</code> as a pretty good typo detector.</p>
<p>In fact, <code>inline-java</code> does even more than that. It checks that
Haskell types and Java types line up. It catches at compile time many
common bugs that could cause the program to crash or fail, but a few
remain. Notably,</p>
<ul>
<li>it is possible to use references to Java objects by mistake after
they have been collected, and</li>
<li>it is possible to accidentally retain large amounts of memory in the
Java heap with references that live in the memory managed by Haskell.</li>
</ul>
<p>Here's a case study: the conversion of Java <code>Iterator</code>s to Haskell
<code>Stream</code>s (as defined in the <a href="https://www.stackage.org/package/streaming">streaming</a> package).</p>
<pre><code class="language-haskell">import Foreign.JNI
import Language.Java as Java
import Language.Java.Inline as Inline
import Streaming

iteratorToStream
  :: Reify a
  => J ('Iface "java.util.Iterator")
  -> IO (Stream (Of a) IO ())
iteratorToStream it =
  return $ Streaming.untilRight $ do
    [Inline.java| $it.hasNext() |] >>= \case
      False -> return (Right ())
      True -> do
        obj <- [Inline.java| $it.next() |]
        Left <$> Java.reify obj
</code></pre>
<p>See <a href="http://www.tweag.io/posts/2017-09-15-inline-java-tutorial.html">previous posts</a> for an intro to
<code>inline-java</code>, but here's the gist. The input to this function is any
Java object that conforms to the <code>java.util.Iterator</code> interface. The
output is a <code>Stream</code> yielding values of some type <code>a</code>. The Java
objects are pulled from the iterator as the stream is consumed. The
constraint <code>Reify a</code> states that we know how to convert Java objects
to Haskell values of type <code>a</code>. We do this on the last line by calling
<code>reify</code>.</p>
<p>Like in Java, <code>it</code> and <code>obj</code> above are actually <em>references</em> to
objects. But it's a special type of reference provided by the JNI,
which can be used by foreign code (such as C or Haskell). These JNI
references need to be deleted explicitly once they are no longer
needed, otherwise JVM objects cannot be reclaimed by the JVM GC.</p>
<p>The above implementation of <code>iteratorToStream</code> is not deleting the
references to Java objects. That's a leak! Indeed, an object reference
acts as a root in the graph of all objects in the heap, as far as the
JVM garbage collector is concerned. Adding to the problem, the JVM
can't deal very well with large and unknown amounts of references. The
JNI expects native calls to use only a few references and expects the
programmer to say in advance how many references will be needed.
Failing to do so affects performance and can lead to failures.</p>
<p>A straightforward fix to this situation is to delete the reference
after the Haskell value has been obtained.</p>
<pre><code class="language-haskell">  ...
  bracket [Inline.java| $it.next() |]
          JNI.deleteLocalRef
          (\jNext -> Left <$> Java.reify jNext)
</code></pre>
<p>There are two problems with this approach:</p>
<ul>
<li>this puts the burden on the programmer to remember to delete the
reference and to be careful not to use it afterwards (or risk
a segfault). Moreover,</li>
<li>JNI references are usually <em>local</em>, meaning that they are only valid
on the thread that created them. So the programmer has to be careful
to not share them with other threads.</li>
</ul>
<p>Could we possibly ask the compiler to perform these checks?</p>
<h2>Garbage Collector Finalizers</h2>
<p>One way to avoid needing these checks in the first place is to just
let the Haskell GC delete Java references automatically when they
become unreachable. We attach to each reference a finalizer that
deletes it, which is going to be called by the Haskell GC. Such
references are no longer <em>local</em> references, but <em>global</em> references.
Unlike local references, a global reference can be used in any thread
and it is not destroyed when control returns to Java. Since the JNI
provides a facility to promote any local reference to a global one,
couldn't we just turn all local references into global ones and then
have them be managed by the GC? A global reference is more expensive
than a local one, so performance suffers. But it mostly works. Until
you run out of memory...</p>
<p>A major problem with letting the GC run the show completely is that,
counterintuitively, memory might never be reclaimed, even
when many objects are long dead. Suppose that the Java heap is
crowded, the Garbage Collector of the JVM is desperate to kick some
objects out of existence, and yet there is a good chunk of references
from Haskell-land to the Java Heap. The Haskell portion of the
application is already done with the references, but since there is
plenty of space in the Haskell heap, the Haskell garbage collector
is basking in the sun, with no pressure to run the finalizers that
would delete the unused references.</p>
<p>Sometimes, the application is lucky and the Haskell GC runs the
finalizers just in time, which lets the Java GC clean
the Java heap. Unfortunately, sometimes, the Haskell GC won't run and
the JVM will fail with an <code>OutOfMemory</code> exception.</p>
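<p>The failure mode is easy to model without any JNI at all. Here is a toy simulation (every name here is illustrative; none of this is the real <code>jni</code> API): creating a "global reference" bumps a counter of live Java-side objects and queues a finalizer, but the finalizers only run when the simulated Haskell GC decides to run.</p>
<pre><code class="language-haskell">import Data.IORef

-- 'javaLive' counts live global references on the simulated JVM side;
-- 'pending' holds finalizers the simulated Haskell GC has not yet run.
data World = World { javaLive :: IORef Int, pending :: IORef [IO ()] }

-- Creating a reference bumps the Java-side count and queues a
-- finalizer that will decrement it -- eventually.
newGlobalRef :: World -> IO ()
newGlobalRef w = do
  modifyIORef' (javaLive w) (+ 1)
  modifyIORef' (pending w) (modifyIORef' (javaLive w) (subtract 1) :)

-- The Haskell GC runs all pending finalizers, whenever it gets around to it.
haskellGC :: World -> IO ()
haskellGC w = do
  fs <- readIORef (pending w)
  writeIORef (pending w) []
  sequence_ fs

demo :: IO (Int, Int)
demo = do
  w <- World <$> newIORef 0 <*> newIORef []
  mapM_ (\_ -> newGlobalRef w) [1 .. 100 :: Int]
  before <- readIORef (javaLive w)  -- Haskell GC hasn't run: all 100 alive
  haskellGC w
  after <- readIORef (javaLive w)   -- finalizers ran: Java memory reclaimable
  return (before, after)
</code></pre>
<p>Until <code>haskellGC</code> happens to run, all 100 objects stay rooted, no matter how desperate the Java GC is.</p>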
<h2>Dynamic scopes</h2>
<p>Another solution is to define dynamic scopes. When a program's control
flow enters a scope, we open a new buffer. We keep track of all newly
created references in the buffer, until the control flow leaves the
scope, at which point we discard all recorded references all at once.
In general, scopes are not allowed to overlap arbitrarily, but they
can be nested.</p>
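<p>The idea can be sketched in a few lines of plain Haskell (a toy model using only base, not the real JNI frames): a scope is a mutable list of cleanup actions, and leaving the scope runs them all at once.</p>
<pre><code class="language-haskell">import Control.Exception (finally)
import Data.IORef

-- A scope accumulates cleanup actions; they all run, in reverse
-- registration order, when control leaves the scope -- even if an
-- exception is thrown. This mirrors pushLocalFrame/popLocalFrame.
type Scope = IORef [IO ()]

withScope :: (Scope -> IO a) -> IO a
withScope body = do
  cleanups <- newIORef []
  body cleanups `finally` (readIORef cleanups >>= sequence_)

-- Register a cleanup action with the enclosing scope.
track :: Scope -> IO () -> IO ()
track scope cleanup = modifyIORef' scope (cleanup :)

demo :: IO [String]
demo = do
  out <- newIORef []
  withScope $ \s -> do
    track s (modifyIORef' out ("freed A" :))
    track s (modifyIORef' out ("freed B" :))
    modifyIORef' out ("used A and B" :)
  reverse <$> readIORef out
</code></pre>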
<p>In Haskell,
the <a href="https://www.stackage.org/package/resourcet">resourcet</a> package
neatly encapsulates this idea. The JNI natively supports a similar
idea using
<a href="https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#push_local_frame"><code>pushLocalFrame</code></a> and
<a href="https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#pop_local_frame"><code>popLocalFrame</code></a>.
<code>pushLocalFrame (n :: Int)</code> creates a new scope in which at least <code>n</code>
local references can be created. Exceeding the given capacity might
cause performance issues or errors. <code>popLocalFrame j</code> copies the
reference <code>j</code> to the parent frame and deletes the current frame, which
causes all references of the frame to be deleted.</p>
<p>We still run the risk of accidentally using a local reference
after deletion, or of using it on a thread where it is invalid. But
programmers no longer need to remember to delete <em>individual</em> local
references. Still, in practice we found it difficult to find a hierarchy
of nested scopes that keeps the number of local references low.
It is
a problem that worsens with the size of the application. When building
a complex server application that made many invocations to Java, we
started with a scope per client request, and then a scope per test,
and then we added scopes within the scopes when we were creating more
local references than anticipated. Eventually, it did get very
difficult for multiple teams of programmers of varying experience
levels to be sure that the number of extant references stayed bounded
for all possible code paths and inputs.</p>
<h2>Linear Types</h2>
<p>We would really prefer to delete a reference exactly
when we know it to be no longer useful. In this way, memory becomes
reclaimable by the Java GC immediately. The problem is: it's easy to
forget to do so at all, leading to multiple leaks in an application.
The key invariant we want checked by the
compiler is that once we have a reference, it should be deleted
<em>exactly once</em>, and never referred to after that. That is, we want to
use references <em>linearly</em>.</p>
<p>What if we used the GHC proposal for
<a href="https://github.com/ghc-proposals/ghc-proposals/pull/91">linear types</a>
to treat our local references linearly? It would look something like this:</p>
<pre><code class="language-haskell">import Foreign.JNI as JNI
import Language.Java as Java
import Language.Java.Inline as Inline
import Streaming

iteratorToStream
  :: Reify a
  => J ('Iface "java.util.Iterator" <> [Interp a])
  ->. IOL (Stream (Of a) IOL ())
iteratorToStream it =
  return $ Streaming.untilRight $ do
    [Inline.java| $it.hasNext() |] >>= \case
      False -> return (Right ())
      True -> do
        obj0 <- [Inline.java| $it.next() |]
        (obj1, Unrestricted a) <- Java.reify obj0
        JNI.deleteLocalRef obj1
        return a

Java.reify :: J (Interp a) ->. IOL (J (Interp a), Unrestricted a)

-- | A linear value of type `Unrestricted a` holds a value of
-- type `a` which can be used non-linearly, i.e. without restriction.
data Unrestricted a where
  Unrestricted :: a -> Unrestricted a
</code></pre>
<p>We are assuming that we have a restricted form of the <code>IO</code> monad,
called <code>IOL</code>, with the following operations.</p>
<pre><code>return :: a ->. IOL a
(>>=)  :: IOL a ->. (a ->. IOL b) ->. IOL b
liftIO :: IO a -> IOL a

data IOL a where
  IOL :: IO a -> IOL a

runIOL :: IOL (Unrestricted a) -> IO a
runIOL (IOL io) = do
    Unrestricted a <-
      bracket_ (JNI.pushLocalFrame capacity)
               (JNI.popLocalFrame JNI.jnull)
               io
    return a
  where
    capacity = ...
</code></pre>
<p>Compared to dynamic scopes, the major feature of <code>IOL</code> is that
programmers can delete local references promptly, inside a single
global scope, when they are no longer needed. The programmer doesn't
have to be concerned with guessing a scope hierarchy anymore.</p>
<p><code>IOL</code> introduces local references as linear values. Operations that do
not delete the reference, like <code>reify</code>, now have to return a copy of
it, and the operations that delete the value, like <code>deleteLocalRef</code>,
produce no copy. This means both that references cannot be used after
they are deleted (since they can't be used more than once), and that
the compiler will require them to be deleted eventually (they must be
used at least once). Finally, local references cannot be allowed to
escape the scope of <code>runIOL</code>, as they become invalid before <code>runIOL</code>
returns. This is achieved by constraining its argument to yield an
unrestricted value <code>Unrestricted a</code>. Local references are released
promptly even if an exception arises, thanks to the <code>bracket</code> inside
<code>runIOL</code> and the fact that there is no way to catch exceptions in <code>IOL</code>.</p>
<p>Admittedly, if exceptions need to be caught, it has to be done by the
caller of <code>runIOL</code>. In our experience, many applications need
to catch exceptions in a few places only, so this is a modest price to
pay.</p>
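<p>For instance, a caller might look like this (pseudocode against the hypothetical <code>IOL</code> interface above; <code>JObject</code> and <code>lookupName</code> are made-up names for a Java reference type and a linear computation yielding an <code>Unrestricted</code> result):</p>
<pre><code>safeLookup :: JObject -> IO (Maybe Text)
safeLookup obj =
  (Just <$> runIOL (lookupName obj))
    `catch` \(_ :: SomeException) ->
      -- by the time we get here, the frame pushed by runIOL has been
      -- popped, so no local references can leak, even on failure
      return Nothing
</code></pre>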
<h2>Summary</h2>
<p>Each of the local and global references we create via the JNI is
effectively a GC root for the Java GC. The JNI was designed with the
assumption that programmers ensure that very few such roots are in
flight at any one time. The R native interface and others make similar
assumptions. In this post, we discussed the tension that arises
between releasing early and frequently, and doing so safely without
increasing the risk of use-after-free bugs. With linear types, we can
get both.</p>
<p>A competing approach that we haven't discussed is the lightweight
monadic regions of
<a href="http://okmij.org/ftp/Haskell/regions.html#light-weight">Kiselyov and Shan</a>.
This is an incarnation of dynamic scopes that, like linear types, has
the type checker guarantee that resources aren't used after release
and that they aren't used from other threads. However, it still demands
that the programmer insert neither too many nor too few scopes.</p>
<p>Some have suggested introducing affine types instead of linear types
in Haskell. But for the particular use case discussed in this post,
affine types would do no better than these monadic regions. That's
because affine types provide a weaker guarantee to the caller: we can
return to the caller having used the argument at most once, but also
never at all. We'd need nested scopes all over again to ensure that
references <em>do</em> get disposed of in a timely fashion.</p>
<p>In our discussion of linear types, we brought streams to a linear
monad without delving into the details of whether it is possible and how
it would work. This will be the topic for a future post.</p>Wed, 29 Nov 2017 00:00:00 +0000Gabriel Gonzalez: Compare Nix derivations using nix-difftag:blogger.com,1999:blog-1777990983847811806.post-6867092241397990691
http://www.haskellforall.com/2017/11/compare-nix-derivations-using-nix-diff.html
<head><meta charset="UTF-8"/></head><p>I'm announcing a small <code>nix-diff</code> utility I wrote for comparing Nix derivations. This post will walk through two use cases for how you might use this utility.</p><h2 id="background">Background</h2><p>This section provides some required background for understanding this post if you're new to Nix.</p><p>There are three stages to a Nix build:</p><ul><li>Nix source code (i.e. <code>*.nix</code> files) <ul><li>This corresponds to a source distribution in a typical package manager</li></ul></li><li>Nix derivations (i.e. <code>/nix/store/*.drv</code> files) <ul><li>This is the stage that caching works at</li></ul></li><li>Nix build products (i.e. <code>/nix/store/*</code> files that are not derivations) <ul><li>This corresponds to a binary distribution in a typical package manager</li></ul></li></ul><p>You can convert between these stages using the following command-line tools:</p><ul><li><code>nix-instantiate</code> converts Nix source code to Nix derivations <ul><li>i.e. <code>*.nix → /nix/store/*.drv</code></li></ul></li><li><code>nix-store --realise</code> converts Nix derivations to Nix build products <ul><li>i.e. <code>/nix/store/*.drv → /nix/store/*</code></li></ul></li><li><code>nix-build</code> is a convenience utility which combines the two preceding steps to go straight from source code to build products <ul><li>i.e. <code>*.nix → /nix/store/*</code></li></ul></li></ul><p>Nix supports caching binary build products so if you try to build the same derivation twice then the second build will reuse the result of the first build (i.e. a "cache hit"). If the derivation changes in any way, you get a "cache miss" and you need to build the derivation.</p><p>Carefully note that caching works at the level of Nix derivations and not at the level of Nix source code. 
For example, the following two Nix files differ at the source code level:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="fu">cat</span> example0.nix <br /><span class="bu">let</span><br /> <span class="ex">pkgs</span> = import <span class="op"><</span>nixpkgs<span class="op">></span> { };<br /><br /><span class="kw">in</span><br /> <span class="ex">pkgs.hello</span><br /><br />$ <span class="fu">cat</span> example1.nix <br /><span class="kw">(</span><span class="ex">import</span> <span class="op"><</span>nixpkgs<span class="op">></span> { }<span class="kw">)</span><span class="ex">.hello</span></code></pre></div><p>... but they produce the exact same derivation file:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">nix-instantiate</span> example0.nix <br /><span class="ex">/nix/store/ajypjz54a8rn1qxsnhyr8m87w6hd7ghp-hello-2.10.drv</span><br /><br />$ <span class="ex">nix-instantiate</span> example1.nix <br /><span class="ex">/nix/store/ajypjz54a8rn1qxsnhyr8m87w6hd7ghp-hello-2.10.drv</span></code></pre></div><p>... 
which means that if you try to build both <code>example0.nix</code> and <code>example1.nix</code> the build will only occur once since they share the same derivation.</p><p>You can think of the derivation file as a language-independent description of how to build something:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">fold</span> /nix/store/ajypjz54a8rn1qxsnhyr8m87w6hd7ghp-hello-2.10.drv <br /><span class="ex">Derive</span>([(<span class="st">"out"</span>,<span class="st">"/nix/store/1ijp0xy3s6idns5047lxky6nlj4lrcap-hello-2.10"</span>,<span class="st">""</span>,<span class="st">""</span>)],<br />[<span class="kw">(</span><span class="st">"/nix/store/3ma3q2qf60gvsqs4w0k2krcyikr1pvhf-bash-4.4-p12.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix</span><br /><span class="st">/store/8g65wh5ng9dc68mz07wlznzg4f2zqhlh-stdenv-darwin.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store</span><br /><span class="st">/gqwk1j05s2zfykfj4y9k15gs4zl0lynr-hello-2.10.tar.gz.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>],[<span class="st">"/nix/store/</span><br /><span class="st">9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"</span>,<span class="st">"/nix/store/z347hsajryw593h</span><br /><span class="st">802ggb63lbr3gpv2b-standard-sandbox.sb"</span>],<span class="st">"x86_64-darwin"</span>,<span class="st">"/nix/store/axikcsz4wh2q</span><br /><span class="st">pi5zmlfsmm4jx8wm8s1g-bash-4.4-p12/bin/bash"</span>,[<span class="st">"-e"</span>,<span class="st">"/nix/store/9krlzvny65gdc8s7kp</span><br /><span class="st">b6lkx8cd02c25b-default-builder.sh"</span>],[<span class="kw">(</span><span class="st">"__impureHostDeps"</span>,<span class="st">"/System/Library/Framew</span><br /><span class="st">orks/CoreFoundation.framework/CoreFoundation /dev/zero /dev/random 
/dev/urandom </span><br /><span class="st">/bin/sh"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"__propagatedImpureHostDeps"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"__propagatedSandboxProfile"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"</span><br /><span class="st">__sandboxProfile"</span>,<span class="st">"(allow file-read* (literal </span><span class="dt">\"</span><span class="st">/usr/lib/libncurses.5.4.dylib</span><span class="dt">\"</span><span class="st">)</span><br /><span class="st">)\n(import </span><span class="dt">\"</span><span class="st">/nix/store/z347hsajryw593h802ggb63lbr3gpv2b-standard-sandbox.sb</span><span class="dt">\"</span><span class="st">)\</span><br /><span class="st">n"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"buildInputs"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"builder"</span>,<span class="st">"/nix/store/axikcsz4wh2qpi5zmlfsmm4jx8wm8s1g-b</span><br /><span class="st">ash-4.4-p12/bin/bash"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"configureFlags"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"doCheck"</span>,<span class="st">"1"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"name"</span>,<span class="st">"hello-2.10</span><br /><span class="st">"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"nativeBuildInputs"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"out"</span>,<span class="st">"/nix/store/1ijp0xy3s6idns5047lxky6nlj4lrcap-</span><br /><span class="st">hello-2.10"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"propagatedBuildInputs"</span>,<span 
class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"propagatedNativeBuildInputs"</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"s</span><br /><span class="st">rc"</span>,<span class="st">"/nix/store/3x7dwzq014bblazs7kq20p9hyzz0qh8g-hello-2.10.tar.gz"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"stdenv"</span>,<span class="st">"</span><br /><span class="st">/nix/store/dl508ngmyfglplp338np4lnx98prwsbd-stdenv-darwin"</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"system"</span>,<span class="st">"x86_64-da</span><br /><span class="st">rwin"</span><span class="kw">)</span>])</code></pre></div><p>These <code>*.drv</code> files use the ATerm file format and are Nix-independent. Conceptually, Nix is just a domain-specific language for generating these <code>ATerm</code> files. That means, for example, that you could replace Nix with any front-end language or tool that can generate these ATerm files. In fact, this is how <a href="https://www.gnu.org/software/guix/">Guix</a> works, by replacing Nix with Guile Scheme as the front-end language.</p><p>Understanding how Nix derivations work is fundamental to understanding the Nix ecosystem. <code>nix-diff</code> is one tool that aids this learning process as the following sections will illustrate.</p><h1 id="cache-misses">Cache misses</h1><p><code>nix-diff</code> is a tool that I wish I had back when <a href="https://awakesecurity.com/">Awake Security</a> first adopted Nix. We frequently ran into cache misses when using Nix because of subtle differences in Nix derivations in different development environments.</p><p>We can understand why we got cache misses by referring back to the three stages of a Nix build:</p><ul><li>Nix source code (i.e. <code>*.nix</code> files)</li><li>Nix derivations (i.e. <code>/nix/store/*.drv</code> files)</li><li>Nix build products (i.e. 
<code>/nix/store/*</code> files that are not derivations)</li></ul><p>For production we prefer to distribute Nix build products (i.e. binary distributions), but internally for development we distribute Nix source code. We prefer Nix code internally because this gives developers complete control over all of their transitive dependencies. For example, a developer can easily patch the <code>systemd</code> executable used on the virtual machine that runs their integration tests.</p><p>However, this flexibility comes at a price: if you don't know what you are doing you can easily accidentally change the derivation. This is because Nix and Nixpkgs are customizable to a fault and they have all sorts of "impure" defaults that change depending on the development environment. If you trip over one of these pitfalls you end up with a cache miss, which is a poor user experience.</p><p>The most common pitfalls we ran into early on in our Nix adoption were:</p><ul><li>Not pinning <code>nixpkgs</code><ul><li>Note: We publicly shared our recipe for pinning <code>nixpkgs</code> <a href="https://nixos.wiki/wiki/How_to_fetch_Nixpkgs_with_an_empty_NIX_PATH">here</a></li></ul></li><li>Not pinning the <code>system</code> field for a derivation <ul><li>This field defaults to the impure <code>builtins.currentSystem</code> in many cases</li></ul></li><li>Impure surprises in <code>nixpkgs</code><ul><li>... such as <a href="https://github.com/NixOS/nixpkgs/blob/master/lib/trivial.nix#L62">this impure logic</a> to compute the <code>nixpkgs</code> version</li></ul></li></ul><p>Let's motivate this with a real example. Suppose that I have the following derivation to build the Glasgow Haskell compiler (<code>ghc</code>):</p><pre><code>$ cat example0.nix<br />let<br /> pkgs = import <nixpkgs> { };<br /><br />in<br /> pkgs.ghc</code></pre><p>This Nix expression is "impure" because the expression depends on the ambient <code>nixpkgs</code> channel that the user has installed. 
Compare this to the following expression which pins <code>nixpkgs</code> to a specific revision protected by a hash:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="fu">cat</span> example1.nix<br /><span class="bu">let</span><br /> <span class="co"># https://nixos.wiki/wiki/How_to_fetch_Nixpkgs_with_an_empty_NIX_PATH</span><br /> <span class="ex">fetchNixpkgs</span> = import ./fetchNixpkgs.nix<span class="kw">;</span><br /><br /> <span class="ex">nixpkgs</span> = fetchNixpkgs {<br /> <span class="fu">rev</span> = <span class="st">"76d649b59484607901f0c1b8f737d8376a904019"</span><span class="kw">;</span><br /> <span class="ex">sha256</span> = <span class="st">"01c2f4mj4ahir0sxk9kxbymg2pki1pc9a3y6r9x6ridry75fzb8h"</span><span class="kw">;</span><br /> };<br /><br /> <span class="ex">pkgs</span> = import nixpkgs { };<br /><br /><span class="kw">in</span><br /> <span class="ex">pkgs.ghc</span></code></pre></div><p>Let's instantiate the two expressions to compute their derivations:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">nix-instantiate</span> example0.nix <br /><span class="ex">/nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv</span><br />$ <span class="ex">nix-instantiate</span> example1.nix <br /><span class="ex">/nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv</span></code></pre></div><p>Note that you may get a different result for the first derivation depending on what version of the <code>nixpkgs</code> channel you have installed.</p><p>Visually comparing the two derivation files is tedious and time-consuming:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">fold</span> /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv <span class="kw">|</span> <span class="fu">head</span><br /><span class="ex">Derive</span>([(<span class="st">"doc"</span>,<span 
class="st">"/nix/store/x3hcyy01kb980yiirjjb3svzrdb0pqdy-ghc-8.0.2-doc"</span>,<span class="st">""</span>,<span class="st">""</span><br />),<span class="kw">(</span><span class="st">"man"</span>,<span class="st">"/nix/store/l1ws9nypjg4xh8jj47dapx71cmgfb97a-ghc-8.0.2-man"</span>,<span class="st">""</span>,<span class="st">""</span><span class="kw">)</span>,<span class="kw">(</span><span class="st">"ou</span><br /><span class="st">t"</span>,<span class="st">"/nix/store/76b5ryd9wsc0iimlfz6f4n8kgawf8cli-ghc-8.0.2"</span>,<span class="st">""</span>,<span class="st">""</span><span class="kw">)</span>],[<span class="kw">(</span><span class="st">"/nix/store</span><br /><span class="st">/1ncnhkd9r4k3wmlwbymccfhlqp3bk2cp-python2.7-Sphinx-1.6.5.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/st</span><br /><span class="st">ore/2zdlq3dj3mk91ccya7k9z6d5i7lag912-clang-wrapper-4.0.1.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/st</span><br /><span class="st">ore/3ma3q2qf60gvsqs4w0k2krcyikr1pvhf-bash-4.4-p12.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store/5mp</span><br /><span class="st">3qjkbzvmi4yvin1dbfdr1bkzgq9dl-perl-5.24.3.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store/8g65wh5ng9d</span><br /><span class="st">c68mz07wlznzg4f2zqhlh-stdenv-darwin.drv"</span>,[<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store/9z3ykw788f50yhi4f</span><br /><span class="st">nn3s1ldyyg5s99x-ncurses-5.9.drv"</span>,[<span class="st">"dev"</span>,<span class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store/hw59y7rf8w28s123b51</span><br /><span class="st">ac57kbd0azjvh-coreutils-8.28.drv"</span>,[<span 
class="st">"out"</span>]<span class="kw">)</span>,<span class="kw">(</span><span class="st">"/nix/store/km0zhgg5ykpnwnrczinggxs5</span><br /><br /><span class="st">$ fold /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv | head</span><br /><span class="st">Derive([("</span><span class="ex">doc</span><span class="st">","</span>/nix/store/qlg3a9923hbcb1vhhaka90c33vrfgbrv-ghc-8.0.2-doc<span class="st">","",""</span><br /><span class="st">),("</span>out<span class="st">","</span>/nix/store/69spfrh96hc6y3hcb7w4i0l6s25pslkd-ghc-8.0.2<span class="st">","","")],[("</span>/nix<br /><span class="ex">/store/0ci2jv8sygw63hyl48ac6caw7fn3jrd7-ncurses-5.9.drv</span><span class="st">",["</span><span class="ex">dev</span><span class="st">","</span>out<span class="st">"]),("</span>/nix/s<br /><span class="ex">tore/1ksvs625n8lwjhjxld446gn9ql23v5k8-bash-4.4-p5.drv</span><span class="st">",["</span><span class="ex">out</span><span class="st">"]),("</span>/nix/store/dqj<br /><span class="ex">rkys7d0c2z4ggny27a0vzpbzvz8y2-ghc-8.0.2-src.tar.xz.drv</span><span class="st">",["</span>out<span class="st">"]),("</span>/nix/store/dw<br /><span class="ex">srl4iqnc3ij79h2xfn8fl3xnnk2zrg-gmp-6.1.1.drv</span><span class="st">",["</span>dev<span class="st">","</span>out<span class="st">"]),("</span>/nix/store/gk2ng3<br /><span class="ex">j3ixx6diq5s4xmysj670k62lly-perl-5.22.3.drv</span><span class="st">",["</span>out<span class="st">"]),("</span>/nix/store/i00ja8b4y0yv9b<br /><span class="ex">aj7qd0caj6az0c8phj-ghc-7.10.3.drv</span><span class="st">",["</span>out<span class="st">"]),("</span>/nix/store/k82idwsbgby27nkjrwr9bhq<br /><span class="ex">64c95irgf-coreutils-8.26.drv</span><span class="st">",["</span>out<span class="st">"]),("</span>/nix/store/nmkqpzlahvmpsnn0s5knc6wspy6b<br /><span class="ex">305l-stdenv-darwin.drv</span><span class="st">",["</span>out<span class="st">"]),("</span>/nix/store/qv0cpl2g4bk5nn5l2hx5fyc2dw6xdjc9-c</code></pre></div><p>If we use <code>nix-diff</code>, 
then we can pull out the differences immediately:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">nix-diff</span> /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv <br /><span class="ex">-</span> /nix/store/fx0xn9djgvvw3h5jdmwybg0ga5qk844d-ghc-8.0.2.drv:<span class="dt">{out}</span><br /><span class="ex">+</span> /nix/store/9shbgc70h32f99nasdd6f8fd7cf9c645-ghc-8.0.2.drv:<span class="dt">{out}</span><br />• <span class="ex">The</span> set of outputs do not match:<br /> <span class="ex">+</span> <span class="dt">{man}</span><br />• <span class="ex">The</span> builders do not match<br /> <span class="ex">-</span> /nix/store/hsk82g493i7r496ghs0y61m6yvknxcml-bash-4.4-p5/bin/bash<br /> <span class="ex">+</span> /nix/store/axikcsz4wh2qpi5zmlfsmm4jx8wm8s1g-bash-4.4-p12/bin/bash<br />• <span class="ex">The</span> set of input names do not match:<br /> <span class="ex">-</span> bash-4.4-p5<br /> <span class="ex">-</span> clang-wrapper-3.7.1<br /> <span class="ex">-</span> coreutils-8.26<br /> <span class="ex">-</span> gmp-6.1.1<br /> <span class="ex">-</span> perl-5.22.3<br /> <span class="ex">-</span> python2.7-Sphinx-1.5.2<br /> <span class="ex">+</span> bash-4.4-p12<br /> <span class="ex">+</span> clang-wrapper-4.0.1<br /> <span class="ex">+</span> coreutils-8.28<br /> <span class="ex">+</span> gmp-6.1.2<br /> <span class="ex">+</span> perl-5.24.3<br /> <span class="ex">+</span> python2.7-Sphinx-1.6.5</code></pre></div><p>Now we can see at a glance that the versions of several dependencies changed and GHC has split out its <code>man</code> pages into a new <code>man</code> output for better granularity of the build graph.</p><p>Note that these are not the only differences between the two derivations. However, all of the other differences are downstream of the above differences. 
For example, the two derivations have different <code>out</code> paths, but we expect them to differ for any two derivations that are not identical so there's no point including that in the diff. <code>nix-diff</code> makes an effort to highlight the root cause of the difference.</p><h2 id="understanding-differences">Understanding differences</h2><p>Nix is more than just a package manager. You can use Nix to build and deploy an entire machine, which is how NixOS (the Nix operating system) works. The machine configuration is a Nix expression that you can instantiate and build like any other Nix expression.</p><p>This means that we can also use <code>nix-diff</code> to compare two machine configurations and understand how they differ. For example, when we change our production systems at <a href="https://awakesecurity.com/">Awake Security</a> we sometimes run the change through <code>nix-diff</code> during code review to ensure that reviewers understand every change being made to the system.</p><p>We can illustrate this with a small example comparing two NixOS system specifications. 
The first system specification is a mostly blank system:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="fu">cat</span> example0.nix<br /><span class="bu">let</span><br /> <span class="ex">nixos</span> = import <span class="op"><</span>nixpkgs/nixos<span class="op">></span> {<br /> <span class="ex">system</span> = <span class="st">"x86_64-linux"</span><span class="kw">;</span><br /><br /> <span class="ex">configuration</span> = {<br /> <span class="ex">boot.loader.grub.devices</span> = [ <span class="st">"/dev/sda"</span> ]<span class="kw">;</span><br /><br /> <span class="ex">fileSystems.</span><span class="st">"/"</span> = {<br /> <span class="ex">device</span> = <span class="st">"/dev/sda"</span><span class="kw">;</span><br /> };<br /> };<br /> };<br /><br /><span class="kw">in</span><br /> <span class="ex">nixos.system</span></code></pre></div><p>... and the second specification enables Kafka on the system:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="fu">cat</span> example1.nix<br /><span class="bu">let</span><br /> <span class="ex">nixos</span> = import <span class="op"><</span>nixpkgs/nixos<span class="op">></span> {<br /> <span class="ex">system</span> = <span class="st">"x86_64-linux"</span><span class="kw">;</span><br /><br /> <span class="ex">configuration</span> = {<br /> <span class="ex">boot.loader.grub.devices</span> = [ <span class="st">"/dev/sda"</span> ]<span class="kw">;</span><br /><br /> <span class="ex">fileSystems.</span><span class="st">"/"</span> = {<br /> <span class="ex">device</span> = <span class="st">"/dev/sda"</span><span class="kw">;</span><br /> };<br /><br /> <span class="ex">services.apache-kafka.enable</span> = true<span class="kw">;</span><br /> };<br /> };<br /><br /><span class="kw">in</span><br /> <span class="ex">nixos.system</span></code></pre></div><p>We can differentiate the two derivations in one step like 
this:</p><div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">nix-diff</span> <span class="va">$(</span><span class="ex">nix-instantiate</span> example0.nix<span class="va">)</span> <span class="va">$(</span><span class="ex">nix-instantiate</span> example1.nix<span class="va">)</span><br /><span class="ex">-</span> /nix/store/6z9nr5pzs4j1v9mld517dmlcz61zy78z-nixos-system-nixos-18.03pre119245.<br /><span class="ex">5cfd049a03.drv</span>:<span class="dt">{out}</span><br /><span class="ex">+</span> /nix/store/k05ibijg0kknvwrgfyb7dxwjrs8qrlbj-nixos-system-nixos-18.03pre119245.<br /><span class="ex">5cfd049a03.drv</span>:<span class="dt">{out}</span><br />• <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">etc</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/05c0v10pla0v8rfl44rs744m6wr729jy-etc.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/8waqvzjg7bazzfzr49m89q299kz972wv-etc.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">dbus-1</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/a16j2snzz25dhh96jriv3p6cgkc0vhxr-dbus-1.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/mliabzdkqaayya67xiwfhwkg4gs9k0cg-dbus-1.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">system-path</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/jcf6q7na01j8k9xcmqxykl62k4x6zwiv-system-path.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/kh4kgsms24d02bxlrxb062pgsbs3riws-system-path.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> set of input names do not match:<br /> <span class="ex">+</span> apache-kafka-2.12-0.10.2.0<br /> • <span class="ex">The</span> input named 
<span class="kw">`</span><span class="ex">system-path</span><span class="kw">`</span> differs<br /> • <span class="ex">These</span> two derivations have already been compared<br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">system-units</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/yqnqdajd4664rvycrnwxwaj0mxp7602c-system-units.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/2p5c4arwqphdz5wsvz6dbrgv0vhgf5qh-system-units.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> set of input names do not match:<br /> <span class="ex">+</span> unit-apache-kafka.service<br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">user-units</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/x34dqw5y34dq6fj5brj2b5qf0nvglql9-user-units.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/4iplnk260q2dpr8b8ajrjkrn44yk06aq-user-units.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">unit-dbus.service</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/fd6j972zn1hfvqslxc8c64xxaf1wg475-unit-dbus.service.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/s7rpgwbald9qx8rwlw4v276wj2x3ld8r-unit-dbus.service.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">dbus-1</span><span class="kw">`</span> differs<br /> • <span class="ex">These</span> two derivations have already been compared<br />• <span class="ex">The</span> input named <span class="kw">`</span><span class="ex">system-path</span><span class="kw">`</span> differs<br /> • <span class="ex">These</span> two derivations have already been compared<br />• <span class="ex">The</span> input named <span class="kw">`</span><span 
class="ex">users-groups.json</span><span class="kw">`</span> differs<br /> <span class="ex">-</span> /nix/store/x6c7pqx40wfdzwf96jfi1l0hzxjgypri-users-groups.json.drv:<span class="dt">{out}</span><br /> <span class="ex">+</span> /nix/store/gk5yyjw579hgyxgwbrh1kzb3hbdbzgbq-users-groups.json.drv:<span class="dt">{out}</span><br /> • <span class="ex">The</span> environments do not match:<br /> <span class="va">text=</span><span class="st">''</span><br /> <span class="dt">{"groups":[{"gid":55,"members":[],"name":"adm"}</span>,{<span class="st">"gid"</span>:<span class="ex">17</span>,<span class="st">"members"</span>:[]<br />,<span class="st">"name"</span>:<span class="st">"audio"</span>},<span class="dt">{"gid":24,"members":[],"name":"cdrom"}</span>,{<span class="st">"gid"</span>:<span class="ex">27</span>,<span class="st">"members"</span>:[],<span class="st">"</span><br /><span class="st">name"</span>:<span class="st">"dialout"</span>},<span class="dt">{"gid":6,"members":[],"name":"disk"}</span>,{<span class="st">"gid"</span>:<span class="ex">18</span>,<span class="st">"members"</span>:[],<span class="st">"na</span><br /><span class="st">me"</span>:<span class="st">"floppy"</span>},<span class="dt">{"gid":174,"members":[],"name":"input"}</span>,{<span class="st">"gid"</span>:<span class="ex">96</span>,<span class="st">"members"</span>:[],<span class="st">"na</span><br /><span class="st">me"</span>:<span class="st">"keys"</span>},<span class="dt">{"gid":2,"members":[],"name":"kmem"}</span>,{<span class="st">"gid"</span>:<span class="ex">20</span>,<span class="st">"members"</span>:[],<span class="st">"name"</span>:<span class="st">"</span><br /><span class="st">lp"</span>},<span class="dt">{"gid":4,"members":[],"name":"messagebus"}</span>,{<span class="st">"gid"</span>:<span class="ex">30000</span>,<span class="st">"members"</span>:[<span class="st">"nixbld1</span><br /><span class="st">"</span>,<span class="st">"nixbld10"</span>,<span class="st">"nixbld11"</span>,<span 
class="st">"nixbld12"</span>,<span class="st">"nixbld13"</span>,<span class="st">"nixbld14"</span>,<span class="st">"nixbld15"</span>,<span class="st">"nixbld16"</span>,<span class="st">"</span><br /><span class="st">nixbld17"</span>,<span class="st">"nixbld18"</span>,<span class="st">"nixbld19"</span>,<span class="st">"nixbld2"</span>,<span class="st">"nixbld20"</span>,<span class="st">"nixbld21"</span>,<span class="st">"nixbld22"</span>,<span class="st">"nixb</span><br /><span class="st">ld23"</span>,<span class="st">"nixbld24"</span>,<span class="st">"nixbld25"</span>,<span class="st">"nixbld26"</span>,<span class="st">"nixbld27"</span>,<span class="st">"nixbld28"</span>,<span class="st">"nixbld29"</span>,<span class="st">"nixbld3</span><br /><span class="st">"</span>,<span class="st">"nixbld30"</span>,<span class="st">"nixbld31"</span>,<span class="st">"nixbld32"</span>,<span class="st">"nixbld4"</span>,<span class="st">"nixbld5"</span>,<span class="st">"nixbld6"</span>,<span class="st">"nixbld7"</span>,<span class="st">"nixb</span><br /><span class="st">ld8"</span>,<span class="st">"nixbld9"</span>],<span class="st">"name"</span>:<span class="st">"nixbld"</span>},<span class="dt">{"gid":65534,"members":[],"name":"nogroup"}</span>,{<span class="st">"g</span><br /><span class="st">id"</span>:<span class="ex">0</span>,<span class="st">"members"</span>:[],<span class="st">"name"</span>:<span class="st">"root"</span>},{<span class="st">"gid"</span>:<span class="ex">62</span>,<span class="st">"members"</span>:[],<span class="st">"name"</span>:<span class="st">"systemd-journal</span><br /><span class="st">"</span>},<span class="dt">{"gid":110,"members":[],"name":"systemd-journal-gateway"}</span>,{<span class="st">"gid"</span>:<span class="ex">152</span>,<span class="st">"members</span><br /><span class="st">"</span>:[],<span class="st">"name"</span>:<span class="st">"systemd-network"</span>},<span class="dt">{"gid":153,"members":[],"name":"systemd-resolve"}</span><br 
/>,<span class="dt">{"gid":154,"members":[],"name":"systemd-timesync"}</span>,{<span class="st">"gid"</span>:<span class="ex">25</span>,<span class="st">"members"</span>:[],<span class="st">"name</span><br /><span class="st">"</span>:<span class="st">"tape"</span>},<span class="dt">{"gid":3,"members":[],"name":"tty"}</span>,{<span class="st">"gid"</span>:<span class="ex">100</span>,<span class="st">"members"</span>:[],<span class="st">"name"</span>:<span class="st">"us</span><br /><span class="st">ers"</span>},<span class="dt">{"gid":29,"members":[],"name":"utmp"}</span>,{<span class="st">"gid"</span>:<span class="ex">19</span>,<span class="st">"members"</span>:[],<span class="st">"name"</span>:<span class="st">"uucp"</span><br />},<span class="dt">{"gid":26,"members":[],"name":"video"},{"gid":1,"members":[],"name":"wheel"}</span>],<br /><span class="st">"mutableUsers"</span>:<span class="ex">true</span>,<span class="st">"users"</span>:[{<span class="st">"createHome"</span>:false,<span class="st">"description"</span>:<span class="st">"→Apache Kafka </span><br /><span class="st">daemon user"</span>,<span class="st">"group"</span>:<span class="st">"nogroup"</span>,<span class="st">"hashedPassword"</span>:null,<span class="st">"home"</span>:<span class="st">"/tmp/kafka-logs"</span>,<span class="st">"i</span><br /><span class="st">nitialHashedPassword"</span>:null,<span class="st">"initialPassword"</span>:null,<span class="st">"isSystemUser"</span>:false,<span class="st">"name"</span>:<span class="st">"a</span><br /><span class="st">pache-kafka"</span>,<span class="st">"password"</span>:null,<span class="st">"passwordFile"</span>:null,<span class="st">"shell"</span>:<span class="st">"/run/current-system/sw</span><br /><span class="st">/bin/nologin"</span>,<span class="st">"uid"</span>:169},{<span class="st">"createHome"</span>:<span class="ex">false</span>,<span class="st">"description"</span>:<span class="st">"→D-Bus system mess</span><br /><span 
class="st">...</span></code></pre></div><p>However, this doesn't do the diff justice because the output is actually colorized, like this:</p><p><img src="https://i.imgur.com/KUB4rXx.png" style="width: 500px;" /></p><p>From the diff we can see that:</p><ul><li>This change adds Kafka executables to the system <code>PATH</code></li><li>This change adds a new <code>apache-kafka</code> <code>systemd</code> service</li><li>This change adds a new <code>apache-kafka</code> user to the system</li></ul><p>Note how <code>nix-diff</code> does more than diffing the two root derivations. If the two derivations differ on a shared input then <code>nix-diff</code> will descend into that input and diff that and repeat the process until the root cause of the change is found. This works because Nix's dependency graph is complete and reachable from the root derivation.</p><h2 id="conclusion">Conclusion</h2><p>You can find the <code>nix-diff</code> utility on <a href="https://hackage.haskell.org/package/nix-diff">Hackage</a> or <a href="https://github.com/Gabriel439/nix-diff">GitHub</a> if you would like to use this in your own development workflow. Hopefully <code>nix-diff</code> will help you better understand how Nix works under the hood and also help you pin Nix derivations more robustly.</p>Mon, 27 Nov 2017 15:58:40 +0000noreply@blogger.com (Gabriel Gonzalez)Neil Mitchell: Haskell exceptions and FFI wrapperstag:blogger.com,1999:blog-7094652.post-6091389602868600279
http://neilmitchell.blogspot.com/2017/11/haskell-exceptions-and-ffi-wrappers.html
<p><em>Summary: If you create a C function pointer from a Haskell function with "wrapper", and it throws an exception, bad things happen.</em></p><p>The Haskell FFI is incredibly powerful, allowing you to convert Haskell functions into C function pointers. In this post I'll give a quick example, then go into what happens if the Haskell function throws an exception. First, let's define a C function (and put it in a file called <code>c.c</code>):</p><pre><code>int apply(int(*f)(int), int x)<br />{<br /> return f(x);<br />}<br /></code></pre><p>The piece <code>int(*f)(int)</code> says <code>f</code> is a function of type <code>Int -> Int</code>. The function <code>apply</code> is equivalent to <code>$</code>, restricted to <code>int</code> - it applies the first argument <code>f</code> to the second argument <code>x</code> and returns the result. We can call that in Haskell with:</p><pre><code>foreign import ccall apply :: FunPtr (CInt -> IO CInt) -> CInt -> IO CInt<br />foreign import ccall "wrapper" wrap :: (CInt -> IO CInt) -> IO (FunPtr (CInt -> IO CInt))<br /><br />main :: IO ()<br />main = do<br /> f <- wrap $ \x -> return $ x + 20<br /> res <- apply f 22<br /> print res<br /></code></pre><p>On the first line we wrap <code>apply</code> into a Haskell definition, turning a C function pointer into <code>FunPtr</code>. In the second we define a special <code>"wrapper"</code> FFI definition - the name <code>"wrapper"</code> is a specific string which is part of the FFI spec - it converts a Haskell function into a C function pointer. 
In <code>main</code> we put these pieces together, and other than the pervasive IO, it looks like the equivalent Haskell.</p><p><em>Note:</em> In real code you should always call <code>freeHaskellFunPtr</code> after you have finished using a <code>"wrapper"</code> function, usually using <code>bracket</code>.</p><h3 id="consequences-of-exceptions">Consequences of Exceptions</h3><p>What happens if the function we pass to <code>wrap</code> throws an exception? If you read the GHC manual, you'll find an incomplete link to the FFI spec, which stays silent on the subject. Thinking it through, Haskell has exceptions, but C does not - if the Haskell throws an exception it can't be passed back through C. Haskell can't provide a return value, so it can never resume the C code that called it. The GHC runtime can block indefinitely or kill the thread, both of which are fairly fatal for a program. As a consequence, I strongly recommend never throwing an exception from a function generated by <code>"wrapper"</code> - but what if we do?</p><p><em>Suggestion: most of the FFI addendum should probably be reproduced in the GHC manual with details around corner cases and exceptions.</em></p><h3 id="testing-exceptions">Testing Exceptions</h3><p>First, let's change our wrapped function to <code>wrap $ \x -> fail "finish"</code>. Running that prints out:</p><pre><code>bug.exe: user error (finish)<br /></code></pre><p>That seems like a standard exception. However, let's go further and put the entire program inside a <code>finally</code>, to show we have a normal Haskell exception:</p><pre><code>main = flip finally (print "done") $ do<br /> ...<br /></code></pre><p>The output doesn't change - we never print out <code>"done"</code>. It seems the exception thrown inside <code>wrap</code> aborts the program rather than bubbling up.</p><p><em>Suggestion: This error looks like a normal exception, but really isn't. 
It should say you have violated the wrapper invariant and your program has been violently aborted.</em></p><p>We've encountered bad behaviour, but can we do worse? Yes we can, by adding threads:</p><pre><code>main = do<br />  replicateM_ 100 $ do<br />    forkIO $ do<br />      ff <- wrap $ \_ -> fail "die"<br />      print =<< apply ff 12<br />  threadDelay 10000000<br /></code></pre><p>Here we spawn 100 threads, each of which does an <code>apply</code> with an exception, then we wait for 10 seconds. The output is:</p><pre><code>bug.exe: user error (die)<br />bug.exe: user error (die)<br />bug.exe: warning: too many hs_exit()s<br /></code></pre><p>It looks like there is a race condition with the exit path, causing two fatal wrapper exceptions to try and take down the runtime twice.</p><p><em>Suggestion: The <code>hs_exit</code> bug should be fixed.</em></p><h3 id="avoiding-exceptions">Avoiding Exceptions</h3><p>Now we know we need to avoid throwing exceptions inside <code>"wrapper"</code> functions, so the obvious approach is to wrap them in a <code>catch</code>, e.g.:</p><pre><code>wrap $ \x -> ... `catch` \(_ :: SomeException) -> return (-1)<br /></code></pre><p>Namely catch all exceptions, and replace them with <code>-1</code>. As usual with <code>catch</code>, it is important to force evaluation of the <code>...</code> inside the <code>catch</code> (e.g. using <code>catchDeep</code> from <a href="https://hackage.haskell.org/package/safe-exceptions"><code>safe-exceptions</code></a>). If you want to recover the original exception you can capture it in an <code>IORef</code> and throw it after leaving C:</p><pre><code>ref <- newIORef Nothing<br />f <- wrap $ \x -> ... `catch` \(e :: SomeException) -> do<br />  writeIORef ref $ Just e<br />  return (-1)<br />res <- apply f 22<br />whenJustM (readIORef ref) throwIO<br /></code></pre><p>However, what if there is an asynchronous exception after we leave the <code>catch</code> but before we return to C? 
From my experiments, this doesn't appear to be possible. Even though <code>getMaskingState</code> returns <code>Unmasked</code>, exceptions thrown to the function inside <code>wrapper</code> appear to be deferred until the C code returns.</p><p><em>Suggestion: The documentation should clarify whether my experiments are correct. Should <code>getMaskingState</code> return <code>MaskedUninterruptible</code>?</em></p>Sun, 26 Nov 2017 21:39:00 +0000noreply@blogger.com (Neil Mitchell)Joachim Breitner: Existence and Terminationhttp://www.joachim-breitner.de/blog/733-Existence_and_Termination
http://www.joachim-breitner.de/blog/733-Existence_and_Termination
<p>I recently had some intense discussions that revolved around issues of existence and termination of functions in Coq, about axioms and what certain proofs actually mean. We came across some interesting questions and thoughts that I’ll share with those of my blog readers with an interest in proofs and interactive theorem proving.</p>
<h3 id="tldr">tl;dr</h3>
<ul>
<li>It can be meaningful to assume the <em>existence</em> of a function in Coq, and under that assumption prove its <em>termination</em> and other properties.</li>
<li>Axioms and assumptions are logically equivalent.</li>
<li>Unsound axioms do not necessarily invalidate a theory development when additional meta-rules govern their use.</li>
</ul>
<h3 id="preparation">Preparation</h3>
<p>Our main running example is the infamous Collatz series. Starting at any natural number, the next is calculated as follows:</p>
<pre><code>Require Import Coq.Arith.Arith.
Definition next (n : nat) : nat :=
if Nat.even n then n / 2 else 3*n + 1.</code></pre>
<p>If you start with some positive number, you are going to end up reaching 1 eventually. Or are you? So far nobody has found a number where that does not happen, but we also do not have a proof that it never happens. It is one of the <a href="https://en.wikipedia.org/wiki/Collatz_conjecture">great mysteries of Mathematics</a>, and if you can solve it, you’ll be famous.</p>
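<p>For readers who want to experiment, the step function and the series it generates can be sketched in Haskell (a hypothetical translation of the Coq definition above, not part of the original post):</p>

```haskell
-- The Collatz step, mirroring the Coq definition of 'next'.
next :: Integer -> Integer
next n = if even n then n `div` 2 else 3 * n + 1

-- The series starting at n. Note: this only terminates if the
-- conjecture holds for n -- which is exactly the open question!
series :: Integer -> [Integer]
series n = takeWhile (> 1) (iterate next n) ++ [1]
```

<p>For example, <code>series 6</code> yields <code>[6,3,10,5,16,8,4,2,1]</code>.</p>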
<h3 id="a-failed-definition">A failed definition</h3>
<p>But assume we had an idea on how to prove that we are always going to reach 1, and tried to formalize this in Coq. One attempt might be to write</p>
<pre><code>Fixpoint good (n : nat) : bool :=
if n <=? 1
then true
else good (next n).
Theorem collatz: forall n, good n = true.
Proof. (* Insert genius idea here.*) Qed.</code></pre>
<p>Unfortunately, this does not work: Coq rejects this recursive definition of the function <code>good</code>, because it does not see how that is a terminating function, and Coq requires all such recursive function definitions to be obviously terminating – without this check there would be a risk of Coq’s type checking becoming incomplete or its logic being unsound.</p>
<p>The idiomatic way to avoid this problem is to state <code>good</code> as an inductive predicate... but let me explore another idea here.</p>
<h3 id="working-with-assumptions">Working with assumptions</h3>
<p>What happens if we just assume that the function <code>good</code>, described above, exists, and then perform our proof:</p>
<pre><code>Theorem collatz
(good : nat -> bool)
(good_eq : forall n,
good n = if n <=? 1 then true else good (next n))
: forall n, good n = true.
Proof. (* Insert genius idea here.*) Qed.</code></pre>
<p>Would we accept this as a proof of Collatz’ conjecture? Or did we just assume what we want to prove, in which case the theorem is vacuously true, but we just performed useless circular reasoning?</p>
<p>Upon close inspection, we find that the assumptions of the theorem (<code>good</code> and <code>good_eq</code>) are certainly satisfiable:</p>
<pre><code>Definition trivial (n: nat) : bool := true.
Lemma trivial_eq: forall n,
trivial n = if n <=? 1 then true else trivial (next n).
Proof. intro; case (n <=? 1); reflexivity. Qed.
Lemma collatz_trivial: forall n, trivial n = true.
Proof.
apply (collatz trivial trivial_eq).
Qed.</code></pre>
<p>So clearly there exists a function of type <code>nat -> bool</code> that satisfies the assumed equation. This is good, because it means that the <code>collatz</code> theorem is not simply assuming <code>False</code>!</p>
<p>Some (including me) might already be happy with this theorem and proof, as it clearly states: “Every function that follows the Collatz series eventually reaches 1”.</p>
<p>Others might still not be at ease with such a proof. Above we have seen that we cannot define the real Collatz series in Coq. How can the <code>collatz</code> theorem say something about a function that is not definable?</p>
<h3 id="classical-reasoning">Classical reasoning</h3>
<p>One possible way of getting some assurance is to define <code>good</code> as a classical function. The logic of Coq can be extended with the law of the excluded middle without making it inconsistent, and with that axiom, we can define a version of <code>good</code> that is pretty convincing (sorry for the slightly messy proof):</p>
<pre><code>Require Import Coq.Logic.ClassicalDescription.
Require Import Omega.
Definition classical_good (n:nat) : bool :=
if excluded_middle_informative (exists m, Nat.iter m next n <= 1)
then true else false.
Lemma iter_shift:
forall a f x (y:a), Nat.iter x f (f y) = f (Nat.iter x f y).
Proof.
intros. induction x. reflexivity. simpl. rewrite IHx. reflexivity. Qed.
Lemma classical_good_eq: forall n,
classical_good n = if n <=? 1 then true else classical_good (next n).
Proof.
intros.
unfold classical_good at 1.
destruct (Nat.leb_spec n 1).
* destruct (excluded_middle_informative _); try auto.
contradict n0. exists 0. simpl. assumption.
* unfold classical_good.
destruct (Nat.eqb_spec (next n) 0); try auto.
destruct (excluded_middle_informative _), (excluded_middle_informative _); auto.
- contradict n0.
destruct e0.
destruct x; simpl in *. omega.
exists x. rewrite iter_shift. assumption.
- contradict n0.
destruct e0.
exists (S x). simpl. rewrite iter_shift in H0. assumption.
Qed.
Lemma collatz_classical: forall n, classical_good n = true.
Proof. apply (collatz classical_good classical_good_eq). Qed.</code></pre>
<p>The point of this is not so much to use this particular definition of <code>good</code>, but merely to convince ourselves that the assumptions of the <code>collatz</code> theorem above encompass “the” Collatz series, and thus constitutes a proof of the Collatz conjecture.</p>
<p>The main take-away so far is that <strong>existence and termination of a function</strong> are two separate issues, and it is possible to assume the former, prove the latter, and not have done a vacuous proof.</p>
<h3 id="the-ice-gets-thinner">The ice gets thinner</h3>
<h4 id="sections">Sections</h4>
<p>Starting with the above <code>Theorem collatz</code>, there is another train of thought I invite you to follow along.</p>
<p>Probably the “genius idea” proof will be more than a few lines long, and we probably want to be able to declare helper lemmas and other things along the way. Doing all that in the body of the <code>collatz</code> proof is not very convenient, so instead of using assumptions, we might write</p>
<pre><code>Section collatz.
Variable good : nat -> bool.
Variable good_eq : forall n,
good n = if n <=? 1 then true else good (next n).
Theorem collatz2 : forall n, good n = true.
Proof. (* Insert genius idea here.*) Qed.
End collatz.</code></pre>
<p>So far so good: Clearly, I just refactored my code a bit, but did not make any significant change. The theorems <code>collatz2</code> and <code>collatz</code> are equivalent.</p>
<h4 id="sound-axioms">Sound axioms</h4>
<p>But note that we do not really intend to instantiate <code>collatz2</code>. We know that the assumptions are satisfiable (e.g. since we can define <code>trivial</code> or <code>classical_good</code>). So maybe we would rather avoid the <code>Section</code> mechanism and simply write</p>
<pre><code>Axiom good : nat -> bool.
Axiom good_eq : forall n,
good n = if n <=? 1 then true else good (next n).
Theorem collatz3 : forall n, good n = true.
Proof. (* Insert genius idea here.*) Qed.</code></pre>
<p>I assume this will make a few of my readers’ eyebrows go up: How can I dare to start with such Axioms? Do they not invalidate my whole development?</p>
<p>On the other hand, all that a Coq axiom is doing is saying “the following theorems are under the assumption that the axiom holds”. In that sense, <code>collatz3</code> and <code>collatz2</code> are essentially equivalent.</p>
<h4 id="unsound-axioms">Unsound axioms</h4>
<p>Let me take it one step further, and change that to:</p>
<pre><code>Axiom unsafeFix : forall a, (a -> a) -> a.
Axiom unsafeFix_eq : forall f, unsafeFix f = f (unsafeFix f).
Definition good : nat -> bool :=
unsafeFix (fun good n => if n <=? 1 then true else good (next n)).
Theorem collatz4 : forall n, good n = true.
Proof. (* Insert genius idea here.*) Qed.</code></pre>
<p>At this point, the majority of my readers <em>will</em> cringe. The axiom <code>unsafeFix</code> is so blatantly unsound (in Coq), how do I even dare to think of using it. But bear with me for a moment: I did not change the proof. So maybe the <code>collatz4</code> theorem is still worth something?</p>
<p>I want to argue that it is: Both <code>unsafeFix</code> and <code>unsafeFix_eq</code> are unsound in their full generality. But as long as I instantiate them only with functions <code>f</code> which have a fixpoint, then I cannot prove <code>False</code> this way. So while “Coq + <code>unsafeFix</code>” is unsound, “Coq + <code>unsafeFix</code> + <code>unsafeFix_eq</code> + metarule that these axioms are only called with permissible <code>f</code>” is not.</p>
<p>In that light, my <code>collatz4</code> proof carries the same meaning as the <code>collatz3</code> proof; it is just less convenient to check: to check the validity of <code>collatz3</code>, I may have to look for uses of <code>admit</code>, misleading uses of syntax, or other tricks and smells. To check the validity of <code>collatz4</code>, I additionally have to check the meta-rule -- tedious, but certainly possible (e.g. by inspecting the proof term).</p>
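<p>For comparison: Haskell has general recursion built in, so no axiom is needed there -- <code>unsafeFix</code> corresponds to the ordinary <code>fix</code> combinator from <code>Data.Function</code>. The following sketch (hypothetical, not from the original post) is the Haskell counterpart of the axiomatic definition of <code>good</code>:</p>

```haskell
import Data.Function (fix)

-- The Collatz step, as in the Coq development.
next :: Integer -> Integer
next n = if even n then n `div` 2 else 3 * n + 1

-- In Haskell, 'unsafeFix' is simply 'fix'; the price is that
-- 'good' diverges on any n whose series never reaches 1.
good :: Integer -> Bool
good = fix (\go n -> n <= 1 || go (next n))
```

<p>The meta-rule from above reappears here as the usual caveat of general recursion: the definition is only as good as the existence of a fixed point.</p>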
<h3 id="beyond-collatz">Beyond Collatz</h3>
<p>The questions discussed here did not come up in the context of the Collatz series (for which I unfortunately do not have a proof), but rather the verification of Haskell code in Coq using <a href="https://github.com/antalsz/hs-to-coq"><code>hs-to-coq</code></a>. I started with the idiomatic Haskell definition of “Quicksort”:</p>
<pre class="hs"><code>quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = quicksort lesser ++ [p] ++ quicksort greater
where (lesser, greater) = partition (<p) xs</code></pre>
<p>This function terminates, but not in a way that is obvious to the Coq type checker. Conveniently, <code>hs-to-coq</code> can optionally create the Coq code using the <code>unsafeFix</code> axiom above, producing (roughly):</p>
<pre><code>Definition quicksort {a} `{Ord a} : list a -> list a :=
unsafeFix (fun quicksort xs =>
match xs with
| nil => nil
| p :: xs => match partition (fun x => x <? p) xs with
| (lesser, greater) => quicksort lesser ++ [p] ++ quicksort greater
end
end).</code></pre>
<p>I <a href="https://github.com/antalsz/hs-to-coq/tree/a8cfb747cee2dbe7ce77b3a118958af99c090768/examples/ghc-base/quicksort">then proved</a> (roughly)</p>
<pre><code>Theorem quicksort_sorted:
forall a `(Ord a) (xs : list a), StronglySorted (quicksort xs).</code></pre>
<p>and</p>
<pre><code>Theorem quicksort_permutation:
forall a `(Ord a) (xs : list a), Permutation (quicksort xs) xs.</code></pre>
<p>These proofs proceed by well-founded induction on the length of the argument <code>xs</code>, and hence encompass a termination proof of <code>quicksort</code>. Note that with an only <em>partially</em> correct but non-terminating definition of <code>quicksort</code> (e.g. <code>quicksort := unsafeFix (fun quicksort xs => quicksort xs)</code>) I would not be able to conclude these proofs.</p>
<p>My (not undisputed) claim about the meaning of these theorems is therefore</p>
<blockquote>
<p>If the Haskell equations for <code>quicksort</code> actually have a fixed point, then the use of <code>unsafeFix</code> in its definition does not introduce any inconsistency. Under this assumption, we showed that <code>quicksort</code> always terminates and produces a sorted version of the input list.</p>
</blockquote>
<p>Do you agree?</p>Sat, 25 Nov 2017 20:54:57 +0000mail@joachim-breitner.de (Joachim Breitner)Functional Jobs: Software Engineer (Haskell, Full Stack) at Capital Match (Full-time)urn:uuid:5d1677df-15b7-b0f1-8023-97d7dede2f49
https://functionaljobs.com/jobs/9053-software-engineer-haskell-full-stack-at-capital-match
<p><strong>About Us:</strong>
Capital Match is a leading P2P lending platform based in Singapore, founded in 2014, backed by an alternative investment management firm with more than US$5bn AUM.
We are looking for experienced developers to lead our tech growth in the Fintech space, expand into surrounding countries and develop new products on the platform. </p>
<p><strong>Job Description:</strong>
We are inviting developers with a minimum of 5 years coding experience. The candidate should have functional programming experience as well as experience in developing server and web applications. An interest in all aspects of the creation, growth and operations of a secure web-based platform: front-to-back feature development, distributed deployment and automation in the cloud, build and test automation, is highly desirable. A background in fintech and especially the lending space would be an advantage (but not essential).</p>
<p><strong>Job Requirements:</strong>
Our platform is primarily developed in Haskell with a ClojureScript frontend. Candidates should ideally have production experience with Haskell, or strong experience with at least one other functional programming language.
(For example: OCaml/F#/Scala/Clojure/Lisp/Erlang)</p>
<p>We use Docker containers and standard cloud infrastructure systems to manage our production rollouts, so familiarity with Linux systems, command-line environments and cloud-based deployments is highly desirable. Exposure to and understanding of XP practices such as TDD, CI, Emergent Design, Refactoring, Peer Review and Continuous Improvement is highly desirable.</p>
<p><strong>Offer:</strong>
We offer a combination of salary and equity depending on experience and skills of the candidate. Most expats who relocate to Singapore do not have to pay their home country taxes and the local tax rate in Singapore is more or less 5%.
Visa sponsorship will be provided.
Singapore is a great place to live, a vibrant city rich with diverse cultures, a very strong financial sector and a central location in Southeast Asia.</p>
<p>www.capital-match.com</p>
<p>Get information on <a href="https://functionaljobs.com/jobs/9053-software-engineer-haskell-full-stack-at-capital-match">how to apply</a> for this position.</p>Fri, 24 Nov 2017 08:40:48 +0000Sandy Maguire: Gentle Theorems: Difference of Squareshttp://reasonablypolymorphic.com//blog/difference-of-squares
http://reasonablypolymorphic.com//blog/difference-of-squares
<div class="main">
<article>
<header>
<h1><a href="http://reasonablypolymorphic.com/blog/difference-of-squares">Gentle Theorems: Difference of Squares</a></h1>
</header>
<p class="meta">
<span class="prev">
<a href="http://reasonablypolymorphic.com/blog/type-directed-code-generation">←</a>
</span>
<time>November 24, 2017</time>
<span class="tags">
<a href="http://reasonablypolymorphic.com/tags/math.html">math</a>
</span>
</p>
<div class="content">
<p>I have a (not very controversial) feeling that people don’t feel as though algebra is actually a thing you can use for stuff. I fall into this trap myself often, despite being someone who does math for a living, and so I suspect this is a pretty wide-spread phenomenon. Let me explain.</p>
<p>For example, consider the equation:</p>
<p><span class="math display">\[
(x + y)(x - y) = x^2 - y^2
\]</span></p>
<p>This is known as the <em>difference of squares</em>. Let’s work through the derivation of it together:</p>
<p><span class="math display">\[
\begin{align*}
(x + y)(x - y) &= (x + y)(x - y) \\
&= x^2 + xy - xy - y^2 \\
&= x^2 + \cancel{xy - xy} - y^2 \\
&= x^2 - y^2
\end{align*}
\]</span></p>
<p>Recall that we can use the <a href="https://en.wikipedia.org/wiki/FOIL_method">FOIL method</a> to get from the first line to the second.</p>
<p style="text-align: center;">
<img src="http://reasonablypolymorphic.com/images/foil.png" height="65" />
</p>
<p>I implore you to read through this proof carefully, and convince yourself of its truthfulness – even if you don’t consider yourself a “math” person. Believe it or not, there’s a point I’m getting to.</p>
<p>Anyway – by all accounts, this difference of squares thing is a pretty humdrum theorem. Who really cares, right? Let’s switch gears for a bit and talk about something more interesting.</p>
<hr />
<p>Recall that <span class="math inline">\(20 \times 20 = 400\)</span>. As an interesting question, without actually computing it, let’s think about the product <span class="math inline">\(19 \times 21\)</span>. What does this equal? It seems like it <em>could</em> also be <span class="math inline">\(400\)</span> – after all, all we did was take one away from the left side of the times and move it to the right.</p>
<p>In fact, if you work it out, <span class="math inline">\(19 \times 21 = 399\)</span>. That’s kind of interesting: somehow we lost a <span class="math inline">\(1\)</span> by shuffling around the things we were multiplying.</p>
<p>This seems to not be an isolated incident:</p>
<p><span class="math display">\[
\begin{align*}
5 \times 5 &= 25 \\
\text{but,}\quad4 \times 6 &= 24
\end{align*}
\]</span></p>
<p><span class="math display">\[
\begin{align*}
10 \times 10 &= 100 \\
\text{but,}\quad9 \times 11 &= 99
\end{align*}
\]</span></p>
<p>An intriguing question to ask yourself is whether this is always true, or whether we’ve just gotten lucky with the examples we looked at.</p>
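<p>That first question, at least, is easy to probe mechanically (a throwaway Haskell check, not from the original post):</p>

```haskell
-- Moving 1 from one factor to the other: how much do we lose?
lossOfOne :: Integer -> Integer
lossOfOne n = n * n - (n - 1) * (n + 1)
```

<p>Evaluating <code>all ((== 1) . lossOfOne) [1 .. 1000]</code> gives <code>True</code>: for every <code>n</code> up to 1000, the product drops by exactly 1.</p>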
<p>But the more interesting question, in my opinion, is what happens if we go from <span class="math inline">\(19 \times 21 = 399\)</span> to <span class="math inline">\(18\times22\)</span>. Will we lose another <span class="math inline">\(1\)</span> when we fiddle with it? Or will something else happen? Form an opinion on what the answer will be before continuing.</p>
<p><span class="math display">\[
\begin{align*}
20 \times 20 &= 400 \\
\text{but,}\quad 21 \times 19 &= 399 \\
\text{but,}\quad 22 \times 18 &= 396
\end{align*}
\]</span></p>
<p>Weird – somehow we lost <span class="math inline">\(3\)</span> that time. What’s happened here?</p>
<p>If you’re confused (and I was, when I first saw this), don’t despair. As it happens, you already know the answer!</p>
<hr />
<p>So, what’s going on here? Well, we’ve actually just been dealing with differences of squares the whole time – probably without even realizing it!</p>
<p>Most people, I think, fail to connect the algebraic fact that <span class="math inline">\((x+y)(x-y)=x^2-y^2\)</span> to the fact that <span class="math inline">\(22\times18=396\)</span>. If you still don’t see why, we can explicitly fill in our variables:</p>
<p><span class="math display">\[
\begin{align*}
22\times18&=(20+2)(20-2)\\
&=20^2-2^2 \\
&= 400 - 4 \\
&= 396
\end{align*}
\]</span></p>
<p>Neat, right? Even if you carefully read through the proof of the difference of squares earlier, you might not have noticed that we’ve been playing with them the entire time! I blame western math education for this; too often are equations presented only to be <em>solved</em>, and never to be <em>thought about</em>. It’s a disservice we’ve done to ourselves.</p>
<p>The takeaway of all of this, in my opinion, is that we should spend some time thinking about the notion of equality, about the <span class="math inline">\(=\)</span> symbol. Ever since looking at this difference of squares thing, I’ve started viewing <span class="math inline">\(=\)</span> not as the symbol which separates the left side of an equation from the right, but as a <em>transformation</em>. The <span class="math inline">\(=\)</span> sign transforms something we can experience into something we can manipulate, and back again.</p>
<p>What I mean by that is that it’s a lot easier to conceptualize <span class="math inline">\(22\times18\)</span> than it is to think about <span class="math inline">\((x+y)(x-y)\)</span>. The numeric representation is better suited for human minds to experience, while the algebraic expression is better at twiddling. We know how to twiddle algebra, but twiddling numbers themselves is rather meaningless.</p>
<p>In terms of everyday usefulness, this isn’t particularly helpful, except that it’s often easier to compute a difference of squares than it is to do the multiplication naively. If you can recognize one, you could probably impress someone with your mental arithmetic – but, again, it’s not going to revolutionize your life in any way.</p>
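<p>That mental-arithmetic trick can be captured in a one-line helper (a hypothetical name, not from the original post):</p>

```haskell
-- Multiply (n + k) by (n - k) via the difference of squares.
timesViaSquares :: Integer -> Integer -> Integer
timesViaSquares n k = n ^ 2 - k ^ 2
```

<p>Here <code>timesViaSquares 20 2</code> computes 22 × 18 as 400 - 4 = 396, and <code>timesViaSquares 20 1</code> gives 399 for 21 × 19.</p>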
<p>All of this is to say that math is neat. Even if you don’t see any practical value in this stuff, hopefully you’ll agree that there might be interesting puzzles to be found here. And, as it turns out, algebra can be a satisfying tool for solving these puzzles.</p>
<hr />
<p>Thanks to <a href="http://parsonsmatt.org">Matt Parsons</a> for proof-reading an early version of this post.</p>
</div>
<p class="meta">
<span class="prev">
<a href="http://reasonablypolymorphic.com/blog/type-directed-code-generation">←</a>
</span>
</p>
</article>
</div>Fri, 24 Nov 2017 00:00:00 +0000Robert Harper: Sequentiality as the Essence of Parallelismhttp://existentialtype.wordpress.com/?p=1704
https://existentialtype.wordpress.com/2017/11/04/sequentiality-is-the-essence-of-parallelism/
<p>I recently thought of a nice way to structure a language for parallel programming around the concept of sequential composition. Think of parallelism as the default—evaluate everything in parallel unless the semantics of the situation precludes it: sums are posterior to summands, but the summands can be evaluated simultaneously. You need a way to express the necessary dependencies without introducing any spurious ones.</p>
<p>There’s a tool for that, called <em>lax logic</em>, introduced by <a href="http://www.sciencedirect.com/science/article/pii/S0890540197926274" target="_blank" rel="noopener">Fairtlough and Mendler</a> and elaborated by <a href="https://www.cambridge.org/core/journals/mathematical-structures-in-computer-science/article/a-judgmental-reconstruction-of-modal-logic/975027BB7F07B59619913EAD4CEE52F4" target="_blank" rel="noopener">Davies and Pfenning</a>, which I use extensively in <em><a href="http://www.cs.cmu.edu/~rwh/pfpl" target="_blank" rel="noopener">PFPL</a></em>. The imperative language Modernized Algol is formulated in the lax style, distinguishing two <em>modes</em>, or <em>levels</em>, of syntax, the (pure) <em>expressions</em> and the (impure) <em>commands.</em> The lax modality, which links the two layers, behaves roughly like a monad, but, all the hype notwithstanding, it is not the central player. It’s the modes, not the modality, that matter. (See the <a href="http://www.cs.cmu.edu/~rwh/pfpl/commentary.pdf" target="_blank" rel="noopener">Commentary on PFPL</a> for more.)</p>
<p>The lax modality is just the ticket for expressing parallelism. Rather than separate expressions from commands, here we distinguish between <em>values </em>and <em>computations</em>. The names are important, to avoid possible confusion. Values are fully evaluated; they are not a source of parallelism. (If values were called “pure”, it would be irresistible to think otherwise.) Computations have yet to be evaluated; they engender parallelism by sequential composition. <em>What?</em> No, you didn’t nod off! Let me explain.</p>
<p>Parallelism is all about the join points. If parallel execution is the default, then the job of the programmer is not to <em>induce</em> parallelism, but to <em>harness</em> it. And you do that by saying, “this computation depends on these others.” Absent that, there is nothing else to say, just go for it. No sub-languages. No program analysis. No escaping the monad. Just express the necessary dependencies, and you’re good to go.</p>
<p>So, what are the join points? They are the <em>elimination forms</em> for two <em>parallel modalities.</em> They generalize the sequential case to allow for statically and dynamically determined parallelism. A value of <em>parallel product type</em> is a tuple of unevaluated computations, a kind of “lazy” tuple (but not <em>that</em> kind of laziness, here I just mean unevaluated components). The elimination form evaluates <em>all</em> of the component computations in parallel, creates a value tuple from their values, and passes it to the body of the form. Similarly, a value of <em>parallel sequence type</em> is a generator consisting of two values, a natural number <em>n</em> indicating its size, and a function determining the <em>i</em>th component computation for each <em>1≤i<n.</em> The elimination form activates all <em>n</em> component computations, binds their values to a value sequence, and passes it to the body of the form.</p>
<p><em>The join point effects a change of type</em>, from encapsulated computations to evaluated values, neatly generalizing sequential composition from a unary to a multiway join. If you’d like, the parallel products and parallel sequences are “generalized monads” that encapsulate not just one, but many, unevaluated computations. But they are no more monads than they are in any other functional language: the categorial equational laws need not hold in the presence of, say, divergence, or exceptions.</p>
<p>The dynamics assigns costs to computations, not to values, whose cost of creation has already been paid. The computation that just returns a value has unit work and span. Primitive operations take unit work and span. The sequential composition of a parallel product with <em>n</em> components induces span one more than the maximum span of the constituents, and induces work one more than the sum of their work. The dynamics of sequential composition for parallel sequences is similar, with the “arity” being determined dynamically rather than statically.</p>
<p>Programming in this style means making the join points explicit. If you don’t like that, you can easily define derived forms—and derived costs—for constructs that do it for you. For example, a pair of computations might be rendered as activating a parallel pair of its components, then returning the resulting value pair. And so on and so forth. It’s no big deal.</p>
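<p>In Haskell terms, such a derived form for a pair of computations might be loosely sketched with <code>par</code> and <code>pseq</code> (a rough analogue only: it does not model the cost semantics of the modal calculus, and <code>joinPair</code> is a name invented here):</p>

```haskell
import GHC.Conc (par, pseq)

-- A loose analogue of the parallel-pair join point: spark the first
-- component, evaluate the second, then continue with the value pair.
joinPair :: a -> b -> ((a, b) -> c) -> c
joinPair x y k = x `par` (y `pseq` k (x, y))

-- Example: both component computations may run in parallel; the
-- continuation runs only once both values are available.
example :: Integer
example = joinPair (sum [1 .. 100]) (product [1 .. 5]) (\(a, b) -> a + b)
```

<p>The continuation <code>k</code> is the explicit join point: it names the dependency on both components without imposing an order between them.</p>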
<p><em>En passant</em> the modal formulation of parallelism solves a nasty technical problem in a substitution-based cost semantics that does not make the modal distinction. The issue is, how to distinguish between the creation of a value, and the many re-uses of it arising from substitution? It’s not correct to charge again and again for creating the value each time you see it (this cost can be asymptotically significant), but you do have to charge for creating it somewhere (it’s not free, and it can matter). And, anyway, how is one to account for the cost of assessing whether an expression is, in fact, a value? The usual move is to use an environment semantics to manage sharing. But you don’t have to: the modal framework solves the problem by distinguishing between a value <em>per se</em>; the computation that returns it fully created; and the computation that incrementally constructs it from its constituent parts. It’s the old <em>cons-vs-dotted pair</em> issue, neatly resolved.</p>
<p>Please see Section 10 of the <a href="http://www.cs.cmu.edu/~rwh/pfpl/commentary.pdf" target="_blank" rel="noopener">Commentary on PFPL</a> for a fuller account. The main idea is to generalize a type of single unevaluated computations, which arises in lax logic, to types of statically- and dynamically many unevaluated computations. The bind operation becomes a join operation for these computations, turning a “lazy” tuple or sequence into eager tuples or sequences.</p>
<p><em>Updates</em>: word-smithing, added cite to Davies-Pfenning, replaced cite of course notes with reference to commentary.</p><br />Filed under: <a href="https://existentialtype.wordpress.com/category/programming/">Programming</a>, <a href="https://existentialtype.wordpress.com/category/research/">Research</a>, <a href="https://existentialtype.wordpress.com/category/teaching-2/">Teaching</a> Tagged: <a href="https://existentialtype.wordpress.com/tag/functional-programming/">functional programming</a>, <a href="https://existentialtype.wordpress.com/tag/parallelism/">parallelism</a>, <a href="https://existentialtype.wordpress.com/tag/programming-languages/">programming languages</a>, <a href="https://existentialtype.wordpress.com/tag/semantics/">semantics</a>Wed, 22 Nov 2017 18:18:24 +0000FP Complete: Lambda Conference and Haskell Surveyhttps://www.fpcomplete.com/blog/lambda-conference-and-haskell-survey
https://www.fpcomplete.com/blog/lambda-conference-and-haskell-survey
<div class="hs-featured-image-wrapper">
<a href="https://www.fpcomplete.com/blog/lambda-conference-and-haskell-survey" class="hs-featured-image-link" title=""> <img src="https://www.fpcomplete.com/hubfs/Blog/DSC03182.jpg?t=1513366076380" alt="Lambda Conference and Haskell Survey" style="width: auto !important; float: left; margin: 0 15px 15px 0;" class="hs-featured-image" /> </a>
</div>
<h2>LAMBDA WORLD Conference</h2>
<p>Functional programmers are a unique breed of software development professionals. They have decided that the traditional methods to solving problems are not good enough. In their quest to find the most efficient way to find solutions they eventually stumble upon functional programming. Functional programmers also know they are a minority among their programming peers and don't enjoy the cornucopia of resources available to imperative language developers. That's why <a href="http://www.lambda.world/">Lambda World</a> is such an important conference. <a href="http://feeds.feedburner.com/leadership">Michael Snoyman</a>, our VP of Engineering, spoke to his functional programming peers when he discussed <span style="color: #3574e3;"><strong>"<a href="https://youtu.be/KZIN9f9rI34">Everything you didn’t want to know about Monad transformer state</a>". </strong></span></p>
Wed, 22 Nov 2017 18:16:09 +0000robert@fpcomplete.com (Robert Bobbett)The GHC Team: GHC 8.2.2 is availablehttp://ghc.haskell.org/trac/ghc/blog/ghc-8.2.2-released
http://ghc.haskell.org/trac/ghc/blog/ghc-8.2.2-released
<p>
The GHC Team is pleased to announce a new minor release of GHC. This release
builds on the performance and stability improvements of 8.2.1, fixing a variety
of correctness bugs, improving error messages, and making the compiler more
portable.
</p>
<p>
Notable bug-fixes include
</p>
<ul><li>A correctness issue resulting in segmentation faults in some
programs using the FFI (<a href="http://ghc.haskell.org/trac/ghc/ticket/13707" class="closed ticket" title="#13707: bug: xmobar crashes with segmentation faults? (closed: fixed)">#13707</a>, <a href="http://ghc.haskell.org/trac/ghc/ticket/14346" class="closed ticket" title="#14346: bug: 8.2.1 regression: heap corruption after safe foreign calls (closed: fixed)">#14346</a>)
</li><li>A correctness issue resulting in undefined behavior in some programs
using STM (<a href="http://ghc.haskell.org/trac/ghc/ticket/14171" class="closed ticket" title="#14171: bug: STM causes program to suddenly exit (closed: fixed)">#14171</a>)
</li><li>A bug which may have manifested in segmentation faults under
out-of-memory conditions (<a href="http://ghc.haskell.org/trac/ghc/ticket/14329" class="closed ticket" title="#14329: bug: GHC 8.2.1 segfaults while bootstrapping master (closed: fixed)">#14329</a>)
</li><li><code>clearBit</code> of <code>Natural</code> no longer bottoms (<a href="http://ghc.haskell.org/trac/ghc/ticket/13203" class="closed ticket" title="#13203: bug: Implement Bits Natural clearBit (closed: fixed)">#13203</a>)
</li><li>A specialisation bug resulting in exponential blowup of compilation
time in some specialisation-intensive programs (<a href="http://ghc.haskell.org/trac/ghc/ticket/14379" class="closed ticket" title="#14379: bug: Regression - GHC 8.2.1 Consumes All Memory On Build (closed: fixed)">#14379</a>)
</li><li>ghc-pkg now works even in environments with misconfigured NFS mounts
(<a href="http://ghc.haskell.org/trac/ghc/ticket/13945" class="closed ticket" title="#13945: bug: 'ghc-pkg update' fails due to bad file descriptor error (closed: fixed)">#13945</a>)
</li><li>GHC again supports production of position-independent executables
(<a href="http://ghc.haskell.org/trac/ghc/ticket/13702" class="closed ticket" title="#13702: bug: GHC can't produce position independent executables (closed: fixed)">#13702</a>)
</li><li>Better error messages around kind mismatches (<a href="http://ghc.haskell.org/trac/ghc/ticket/11198" class="new ticket" title="#11198: bug: TypeInType error message regressions (new)">#11198</a>, <a href="http://ghc.haskell.org/trac/ghc/ticket/12373" class="closed ticket" title="#12373: bug: Type error but types match (closed: fixed)">#12373</a>, <a href="http://ghc.haskell.org/trac/ghc/ticket/13530" class="closed ticket" title="#13530: bug: Horrible error message due to TypeInType (closed: fixed)">#13530</a>,
<a href="http://ghc.haskell.org/trac/ghc/ticket/13610" class="closed ticket" title="#13610: bug: Unhelpful error messages about lifted and unlifted types (closed: fixed)">#13610</a>)
</li></ul><p>
A thorough list of the changes in the release can be found in the release
notes:
</p>
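As a quick sanity check of my own (not part of the announcement), the <code>Natural</code> <code>clearBit</code> fix is easy to exercise; on affected earlier versions the call below bottomed instead of returning a value:

```haskell
import Data.Bits (clearBit)
import Numeric.Natural (Natural)

main :: IO ()
main =
  -- 5 is 0b101; clearing bit 0 yields 0b100 = 4.
  -- On GHC 8.2.2 this prints 4 rather than erroring out.
  print (clearBit (5 :: Natural) 0)
```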
<blockquote>
<p>
<a href="https://haskell.org/ghc/docs/8.2.2/html/users_guide/8.2.2-notes.html" class="ext-link"><span class="icon"></span>https://haskell.org/ghc/docs/8.2.2/html/users_guide/8.2.2-notes.html</a>
</p>
</blockquote>
<h2 id="Howtogetit">How to get it</h2>
<p>
This release can be downloaded from
</p>
<blockquote>
<p>
<a href="https://www.haskell.org/ghc/download_ghc_8_2_2.html" class="ext-link"><span class="icon"></span>https://www.haskell.org/ghc/download_ghc_8_2_2.html</a>
</p>
</blockquote>
<p>
For older versions see
</p>
<blockquote>
<p>
<a href="https://www.haskell.org/ghc/" class="ext-link"><span class="icon"></span>https://www.haskell.org/ghc/</a>
</p>
</blockquote>
<p>
We supply binary builds in the native package format for many platforms, and the
source distribution is available from the same place.
</p>
<h2 id="Background">Background</h2>
<p>
Haskell is a standard lazy functional programming language.
</p>
<p>
GHC is a state-of-the-art programming suite for Haskell. Included is
an optimising compiler generating efficient code for a variety of
platforms, together with an interactive system for convenient, quick
development. The distribution includes space and time profiling
facilities, a large collection of libraries, and support for various
language extensions, including concurrency, exceptions, and foreign
language interfaces. GHC is distributed under a BSD-style open source license.
</p>
<p>
A wide variety of Haskell related resources (tutorials, libraries,
specifications, documentation, compilers, interpreters, references,
contact information, links to research groups) are available from the
Haskell home page (see below).
</p>
<p>
On-line GHC-related resources
</p>
<p>
Relevant URLs on the World-Wide Web:
</p>
<ul><li><a href="https://www.haskell.org/ghc/" class="ext-link"><span class="icon"></span>GHC home page</a>
</li><li><a href="https://ghc.haskell.org/trac/ghc/" class="ext-link"><span class="icon"></span>GHC developers' home page</a>
</li><li><a href="https://www.haskell.org/" class="ext-link"><span class="icon"></span>Haskell home page</a>
</li></ul><h2 id="SupportedPlatforms">Supported Platforms</h2>
<p>
The list of platforms we support, and the people responsible for them,
is <a href="https://ghc.haskell.org/trac/ghc/wiki/Contributors" class="ext-link"><span class="icon"></span>here</a>
</p>
<p>
Ports to other platforms are possible with varying degrees of
difficulty. The <a href="http://ghc.haskell.org/trac/ghc/wiki/Building" class="ext-link"><span class="icon"></span>Building Guide</a> describes how to go about porting to a
new platform.
</p>
<h2 id="Developers">Developers</h2>
<p>
We welcome new contributors. Instructions on accessing our source
code repository, and getting started with hacking on GHC, are
available from the GHC developers' site, run on <a href="http://ghc.haskell.org/trac/ghc/" class="ext-link"><span class="icon"></span>Trac</a>.
</p>
<h2 id="CommunityResources">Community Resources</h2>
<p>
There are mailing lists for GHC users and developers, and for monitoring bug tracker
activity; to subscribe, use the Mailman
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo" class="ext-link"><span class="icon"></span>web interface</a>.
</p>
<p>
There are several other Haskell and GHC-related mailing lists on
<a href="http://www.haskell.org" class="ext-link"><span class="icon"></span>haskell.org</a>; for the full list, see the
<a href="https://mail.haskell.org/cgi-bin/mailman/listinfo" class="ext-link"><span class="icon"></span>lists page</a>.
</p>
<p>
Some GHC developers hang out on the <code>#ghc</code> and <code>#haskell</code> channels of the Freenode IRC
network, too. See the <a href="http://www.haskell.org/haskellwiki/IRC_channel" class="ext-link"><span class="icon"></span>Haskell wiki</a> for details.
</p>
<p>
Please report bugs using our bug tracking system. Instructions on reporting bugs
can be found <a href="http://www.haskell.org/ghc/reportabug">here</a>.
</p>Tue, 21 Nov 2017 22:06:57 +0000Yesod Web Framework: mega-sdist: the mega repo helperhttps://www.yesodweb.com/blog/2017/11/mega-sdist
https://www.yesodweb.com/blog/2017/11/mega-sdist
<p>Many years ago, I wrote a utility called <code>mega-sdist</code> to help me with managing
mega repos (more on that below). I've been using it myself ever since, making
some minor improvements over the years. But I realized recently that I never
really announced it to others, and especially not to the people whom it would
help the most: other Yesod contributors and maintainers. Consider this the
(massively belated) announcement.</p><p>You can find the most up-to-date information in <a href="https://github.com/snoyberg/mega-sdist#readme">the project README.md on
Github</a>. Below is the current
content of that file, to help save you a click.</p><hr /><p>This is a utility written to address the specific needs in maintaining
Haskell "mega-repos," or Git repositories containing multiple Cabal
projects. It is intended to ease the process of deciding which
packages need to be released and tagging those releases appropriately.</p><p>It provides the following functionality:</p><ul><li>Detect when local code has changed from what's on Hackage<ul><li>Note that, due to Hackage revisions, sometimes this logic isn't
perfect</li></ul></li><li>Detect when a version number needs to be updated</li><li>Dump the difference between the Hackage version of your package and
the local version</li></ul><p>To install it... well, listen. This tool is intended for people
authoring Haskell packages. Odds are, you already know how to do
this. And if you don't know, this probably isn't a tool that will help
you. Anyway, in order to install it, first
<a href="https://haskell-lang.org/get-started">install Stack</a> and then run
<code>stack install mega-sdist</code>, or just <code>stack install</code> inside this
repository.</p><h2>Opinionated tool</h2><p>This utility is highly opinionated in some ways, e.g.:</p><ul><li>It only supports one style of Git tag name:
<code>packagename/version</code>. This may look weird in non-mega-repos, where
<code>v1.2.3</code> looks better than <code>foo/1.2.3</code>, but for mega-repos the
former doesn't make sense.</li><li>It depends on Stack for both discovering all of your local packages,
and for uploading to Hackage.</li></ul><p>If you're OK with these opinions, keep reading for usage.</p><h2>Have I changed anything?</h2><p>Let's say I'm working on the
<a href="https://github.com/fpco/monad-unlift">monad-unlift megarepo</a> (chosen
as an example of a relatively small repo). I've merged some PRs
recently, or at least think I have. But I don't remember which of the
individual packages within the repo this affected. Instead of looking
at the commit history like some caveman, I'll typically do:</p><pre><code>$ git pull # make sure I have all latest changes
$ mega-sdist</code></pre><p>The <code>mega-sdist</code> command will:</p><ul><li>Build tarballs for all local packages</li><li>Check what the latest versions of my packages on Hackage are</li><li>Do a full <code>diff</code> on these two things and see if anything's changed</li></ul><p>At the time of writing, here's the output from this repo:</p><pre><code>The following packages from Hackage have not changed:
monad-unlift-0.2.0
The following packages require a version bump:
monad-unlift-ref-0.2.1</code></pre><p>What this means is:</p><ul><li>The <code>monad-unlift</code> package I have locally is at version <code>0.2.0</code>. And
it perfectly matches that version on Hackage. No actions necessary.</li><li>The <code>monad-unlift-ref</code> package I have locally is at version
<code>0.2.1</code>. And it doesn't match the code on Hackage. Therefore, if I
wanted to run <code>stack upload monad-unlift-ref</code> successfully, I'd need
to bump the version number.</li></ul><h2>What did I change?</h2><p>Well, again, if I wanted to see what changed, I could run (again, like
a caveman):</p><pre><code>$ git diff monad-unlift-ref/0.2.1 -- monad-unlift-ref</code></pre><p>But that's long! <code>mega-sdist</code>'s got your back. Just run:</p><pre><code>$ mega-sdist monad-unlift-ref --get-diffs</code></pre><p>This will print out the difference between the tarball uploaded to
Hackage and what you have locally. Besides my tongue-in-cheek comment
above, this is also useful if, for some reason, you either don't have
or don't trust the tags in your Git repo.</p><p>One other thing: this diff is currently based on the pristine tarball
from Hackage, ignoring cabal file revisions. So the difference may be
slightly different from what you'd get from <code>stack unpack
monad-unlift-ref-0.2.1</code>. But <code>¯\_(ツ)_/¯</code> that's revisions for you.</p><p>The default behavior of <code>mega-sdist</code> is to look at all packages
specified in your <code>stack.yaml</code>. Targets can be any directory. And
<code>mega-sdist</code> will automatically look at packages in any
<i>subdirectory</i>, so that <code>mega-sdist .</code> is the same as <code>mega-sdist</code> at
the root of your repo*.</p><p>* Assuming all of your packages are actually <i>in</i> your repo, but only
crazy people would do otherwise.</p><h2>Preparing a new release</h2><p>OK, now I continue working on my project, and I've:</p><ul><li>Made some changes to <code>monad-unlift</code></li><li>Updated the cabal file's version number<ul><li>And of course I also updated the <code>ChangeLog.md</code>, I'm not some
monster</li></ul></li></ul><p>From the root of my repo, I run:</p><pre><code>$ mega-sdist monad-unlift</code></pre><p>Or, equivalently, from inside the <code>monad-unlift</code> subdirectory I run:</p><pre><code>$ mega-sdist .</code></pre><p>Either way, I get:</p><pre><code>The following new packages exist locally:
monad-unlift-0.2.1
No version bumps required, good to go!</code></pre><p>This tells me that my package has local changes, <i>and</i> the version
number has been updated, so that <code>stack upload monad-unlift</code> will
work. Neato! Now, you <i>could</i> just run <code>stack upload ...</code>, but here's
what I usually do. First, I'll review the changes I'm about to upload
and make sure there are no surprises:</p><pre><code>$ mega-sdist --get-diffs .
The following new packages exist locally:
monad-unlift-0.2.1
diff -r old/monad-unlift-0.2.0/ChangeLog.md new/monad-unlift-0.2.1/ChangeLog.md
0a1,4
> ## 0.2.1
>
> * Silly changes
>
diff -r old/monad-unlift-0.2.0/Control/Monad/Trans/Unlift.hs new/monad-unlift-0.2.1/Control/Monad/Trans/Unlift.hs
51a52,54
>
> -- I just need some space
>
diff -r old/monad-unlift-0.2.0/monad-unlift.cabal new/monad-unlift-0.2.1/monad-unlift.cabal
2c2
< version: 0.2.0
---
> version: 0.2.1
No version bumps required, good to go!</code></pre><p>OK, that's what I wanted. Time to release. Next, I'm going to use
<code>mega-sdist</code> to tag the release:</p><pre><code>$ mega-sdist --gittag .</code></pre><p>From the root of my repo, this would notice that <code>monad-unlift-ref</code>
still requires a version bump, and refuse to proceed. But inside the
<code>monad-unlift</code> directory, it notices that all necessary version bumps
are done, and happily tags:</p><pre><code>$ mega-sdist --gittag .
The following new packages exist locally:
monad-unlift-0.2.1
No version bumps required, good to go!
Raw command: git tag monad-unlift/0.2.1</code></pre><p>And suddenly I notice something new:</p><pre><code>$ ls tarballs/
monad-unlift-0.2.1.tar.gz</code></pre><p>Neat, <code>mega-sdist</code> left behind tarballs I can upload! To do so, I run:</p><pre><code>$ stack upload tarballs/*</code></pre><p>Note that this will work whether I'm trying to upload just one
package, or all of the updated packages in my repo. Finally, I need to
push the new tags to Github (or wherever):</p><pre><code>$ git push --tags</code></pre><p>And in fact, this upload sequence is so common that I have a shell
alias set up:</p><pre><code>$ alias upload
alias upload='mega-sdist --gittag . && stack upload tarballs/* && git push --tags'</code></pre><p>So there you have it: convenient little utility to help manage repos
with lots of packages in them.</p>Tue, 21 Nov 2017 15:15:00 +0000Philip Wadler: Pay what you want for Java Generics and Collectionstag:blogger.com,1999:blog-9757377.post-3222815697912190851
http://wadler.blogspot.com/2017/11/pay-what-you-want-for-java-generics-and.html
<div style="clear: both; text-align: center;" class="separator"><a style="margin-left: 1em; margin-right: 1em;" href="https://3.bp.blogspot.com/-gnMAvuMNG-Y/WhQX6imBC3I/AAAAAAAAlnM/aoQIQCmlb-8DL2C6348EAdxRprp-gKqAQCLcBGAs/s1600/java-generics.jpg"><img src="https://3.bp.blogspot.com/-gnMAvuMNG-Y/WhQX6imBC3I/AAAAAAAAlnM/aoQIQCmlb-8DL2C6348EAdxRprp-gKqAQCLcBGAs/s640/java-generics.jpg" height="640" border="0" width="484" /></a></div>Humble Book Bundle is <a href="https://www.humblebundle.com/books/java-books?linkID=&mcID=102:5a0fb9d5f0c3608b55a06397:ot:56c3d392733462ca893dbe19:1&utm_source=Humble+Bundle+Newsletter&utm_medium=email&utm_campaign=2017_11_20_javaoreilly_bookbundle&linkID=&utm_content=hero_image">selling off</a> a passle of Java books, including Java Generics and Collection by Naftalin and Wadler, on a pay-what-you-want basis (USD $1 minimum), DRM-free. You choose what proportion of the profits go to Humble and what goes to the charity Code for America. A great deal!Tue, 21 Nov 2017 12:16:23 +0000noreply@blogger.com (Philip Wadler)Sandy Maguire: Type-Directed Code Generationhttp://reasonablypolymorphic.com//blog/type-directed-code-generation
http://reasonablypolymorphic.com//blog/type-directed-code-generation
<div class="main">
<article>
<header>
<h1><a href="http://reasonablypolymorphic.com/blog/type-directed-code-generation">Type-Directed Code Generation</a></h1>
</header>
<p class="meta">
<time>November 18, 2017</time>
</p>
<div class="content">
<blockquote>
<p>aka “Type-Level Icing Sugar”</p>
</blockquote>
<h2 id="context">Context</h2>
<p>At work recently I’ve been working on a library to get idiomatic gRPC support in our Haskell project. I’m quite proud of how it’s come out, and thought it’d make a good topic for a blog post. The approach demonstrates several type-level techniques that in my opinion are under-documented and exceptionally useful in using the type-system to enforce external contracts.</p>
<p>Thankfully the networking side of the library had already been done for me by <a href="https://github.com/awakesecurity/gRPC-haskell">Awake Security</a>, but the interface feels like a thin wrapper on top of C bindings. I’m <em>very, very</em> grateful that it exists, but I wouldn’t expect myself to be able to use it in anger without causing an uncaught type error somewhere along the line. I’m sure I’m probably just using it wrong, but the library’s higher-level bindings all seemed to be targeted at Awake’s implementation of protobuffers.</p>
<p>We wanted a version that would play nicely with <a href="https://github.com/google/proto-lens">proto-lens</a>, which, at time of writing, has no official support for describing RPC services via protobuffers. If you’re not familiar with proto-lens, it generates Haskell modules containing idiomatic types and lenses for protobuffers, and can be used directly in the build chain.</p>
<p>So the task was to add support to proto-lens for generating interfaces to RPC services defined in protobuffers.</p>
<p>My first approach was to generate the dumbest possible thing that could work – the idea was to generate records containing fields of the shape <code>Request -> IO Response</code>. Of course, with a network involved there is a non-negligible chance of things going wrong, so this interface should expose some means of dealing with errors. However, the protobuffer spec is agnostic about the actual RPC backend used, and so it wasn’t clear how to continue without assuming anything about the particulars behind errors.</p>
<p>More worrisome, however, was that RPCs can be marked as streaming – on the side of the client, server, or both. This means, for example, that a method marked as server-streaming has a different interface on either side of the network:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">serverSide ::</span> <span class="dt">Request</span> <span class="ot">-></span> (<span class="dt">Response</span> <span class="ot">-></span> <span class="dt">IO</span> ()) <span class="ot">-></span> <span class="dt">IO</span> ()
<span class="ot">clientSide ::</span> <span class="dt">Request</span> <span class="ot">-></span> (<span class="dt">IO</span> (<span class="dt">Maybe</span> <span class="dt">Response</span>) <span class="ot">-></span> <span class="dt">IO</span> r) <span class="ot">-></span> <span class="dt">IO</span> r</code></pre></div>
<p>This is problematic. Should we generate different records corresponding to which side of the network we’re dealing with? An early approach I had was to parameterize the same record based on which side of the network, and use a type family to get the correct signature:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">{-# LANGUAGE DataKinds #-}</span>
<span class="kw">data</span> <span class="dt">NetworkSide</span> <span class="fu">=</span> <span class="dt">Client</span> <span class="fu">|</span> <span class="dt">Server</span>
<span class="kw">data</span> <span class="dt">MyService</span> side <span class="fu">=</span> <span class="dt">MyService</span>
{<span class="ot"> runServerStreaming ::</span> <span class="dt">ServerStreamingType</span> side <span class="dt">Request</span> <span class="dt">Response</span>
}
<span class="kw">type</span> family <span class="dt">ServerStreamingType</span> (<span class="ot">side ::</span> <span class="dt">NetworkSide</span>) input output <span class="kw">where</span>
<span class="dt">ServerStreamingType</span> <span class="dt">Server</span> input output <span class="fu">=</span>
input <span class="ot">-></span> (output <span class="ot">-></span> <span class="dt">IO</span> ()) <span class="ot">-></span> <span class="dt">IO</span> ()
<span class="dt">ServerStreamingType</span> <span class="dt">Client</span> input output <span class="fu">=</span>
forall r<span class="fu">.</span> input <span class="ot">-></span> (<span class="dt">IO</span> (<span class="dt">Maybe</span> output) <span class="ot">-></span> <span class="dt">IO</span> r) <span class="ot">-></span> <span class="dt">IO</span> r</code></pre></div>
<p>This seems like it would work, but in fact the existence of the <code>forall</code> on the client-side is “illegally polymorphic” in GHC’s eyes, and it will refuse to compile such a thing. Giving it up would mean we wouldn’t be able to return arbitrarily-computed values on the client-side while streaming data from the server. Users of the library might be able to get around it by invoking <code>IORef</code>s or something, but it would be ugly and non-idiomatic.</p>
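<p>As an aside (a sketch of mine, not code from the post): the usual way to smuggle such a <code>forall</code> past GHC is to hide it behind a newtype, which is perfectly legal — though it adds wrapping noise at every call site, and it still does nothing for the backend-agnosticism problem that sank this approach anyway:</p>

```haskell
{-# LANGUAGE RankNTypes #-}

-- A newtype field may be polymorphic even though a type family
-- may not reduce to a polymorphic type directly.
newtype ClientStreaming i o = ClientStreaming
  { runClientStreaming :: forall r. i -> (IO (Maybe o) -> IO r) -> IO r }
```

<p>The <code>Client</code> case of the type family could then reduce to <code>ClientStreaming input output</code>, at the cost of every caller going through <code>runClientStreaming</code>.</p>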
<p>So that, along with wanting to be backend-agnostic, made this approach a no-go. Luckily, my brilliant coworker <a href="https://github.com/judah">Judah Jacobson</a> (who is coincidentally also the author of proto-lens), suggested we instead generate metadata for RPC services in proto-lens, and let backend library code figure it out from there.</p>
<p>With all of that context out of the way, we’re ready to get into the actual meat of the post. Finally.</p>
<h2 id="generating-metadata">Generating Metadata</h2>
<p>According to the <a href="https://developers.google.com/protocol-buffers/docs/reference/proto3-spec">spec</a>, a protobuffer service may contain zero or more RPC methods. Each method has a request and response type, either of which might be marked as streaming.</p>
<p>While we could represent this metadata at the term-level, that won’t do us any favors in terms of getting type-safe bindings to this stuff. And so, we instead turn to <code>TypeFamilies</code>, <code>DataKinds</code> and <code>GHC.TypeLits</code>.</p>
<p>For reasons that will become clear later, we chose to represent RPC services via types, and methods in those services as symbols (type-level strings). The relevant typeclasses look like this:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Service</span> s <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServiceName</span><span class="ot"> s ::</span> <span class="dt">Symbol</span>
<span class="kw">class</span> <span class="dt">HasMethod</span> s (<span class="ot">m ::</span> <span class="dt">Symbol</span>) <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodInput</span> s<span class="ot"> m ::</span> <span class="fu">*</span>
<span class="kw">type</span> <span class="dt">MethodOutput</span> s<span class="ot"> m ::</span> <span class="fu">*</span>
<span class="kw">type</span> <span class="dt">IsClientStreaming</span> s<span class="ot"> m ::</span> <span class="dt">Bool</span>
<span class="kw">type</span> <span class="dt">IsServerStreaming</span> s<span class="ot"> m ::</span> <span class="dt">Bool</span></code></pre></div>
<p>For example, the instances generated for the RPC service:</p>
<pre><code>service MyService {
rpc BiDiStreaming(stream Request) returns(stream Response);
}</code></pre>
<p>would look like this:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">MyService</span> <span class="fu">=</span> <span class="dt">MyService</span>
<span class="kw">instance</span> <span class="dt">Service</span> <span class="dt">MyService</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServiceName</span> <span class="dt">MyService</span> <span class="fu">=</span> <span class="st">"myService"</span>
<span class="kw">instance</span> <span class="dt">HasMethod</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodInput</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="dt">Request</span>
<span class="kw">type</span> <span class="dt">MethodOutput</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="dt">Response</span>
<span class="kw">type</span> <span class="dt">IsClientStreaming</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="ch">'True</span>
<span class="kw">type</span> <span class="dt">IsServerStreaming</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="ch">'True</span></code></pre></div>
<p>You’ll notice that these typeclasses perfectly encode all of the information we had in the protobuffer definition. The idea is that with all of this metadata available to them, specific backends can generate type-safe interfaces to these RPCs. We’ll walk through the implementation of the gRPC bindings together.</p>
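<p>One concrete payoff of encoding methods as <code>Symbol</code>s (a sketch of mine, assuming nothing beyond <code>GHC.TypeLits</code>): a backend can reflect the type-level method name back into a runtime <code>String</code>, for instance when constructing the path of the underlying RPC call:</p>

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE AllowAmbiguousTypes #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Recover a type-level method name at runtime.
methodName :: forall (m :: Symbol). KnownSymbol m => String
methodName = symbolVal (Proxy @m)

main :: IO ()
main = putStrLn (methodName @"biDiStreaming")  -- prints "biDiStreaming"
```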
<h2 id="the-client-side">The Client Side</h2>
<p>The client side of things is relatively easy. We can use the <code>HasMethod</code> instance directly:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runNonStreamingClient
<span class="ot"> ::</span> <span class="dt">HasMethod</span> s m
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">Proxy</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-></span> <span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">MethodOutput</span> s m))
runNonStreamingClient <span class="fu">=</span> <span class="co">-- call the underlying gRPC code</span>
runServerStreamingClient
<span class="ot"> ::</span> <span class="dt">HasMethod</span> s m
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">Proxy</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-&gt;</span> (<span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">Maybe</span> (<span class="dt">MethodOutput</span> s m))) <span class="ot">-&gt;</span> <span class="dt">IO</span> r)
<span class="ot">-></span> <span class="dt">IO</span> r
runServerStreamingClient <span class="fu">=</span> <span class="co">-- call the underlying gRPC code</span>
<span class="co">-- etc</span></code></pre></div>
<p>This is a great start! We’ve got the interface we wanted for the server-streaming code, and our functions are smart enough to require the correct request and response types.</p>
<p>However, there’s already some type-unsafety here; namely that nothing stops us from calling <code>runNonStreamingClient</code> on a streaming method, or other such silly things.</p>
<p>Thankfully the fix is quite easy – we can use type-level equality to force callers to be attentive to the streaming-ness of the method:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runNonStreamingClient
<span class="ot"> ::</span> ( <span class="dt">HasMethod</span> s m
, <span class="dt">IsClientStreaming</span> s m <span class="fu">~</span> <span class="ch">'False</span>
, <span class="dt">IsServerStreaming</span> s m <span class="fu">~</span> <span class="ch">'False</span>
)
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">Proxy</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-></span> <span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">MethodOutput</span> s m))
runServerStreamingClient
<span class="ot"> ::</span> ( <span class="dt">HasMethod</span> s m
, <span class="dt">IsClientStreaming</span> s m <span class="fu">~</span> <span class="ch">'False</span>
, <span class="dt">IsServerStreaming</span> s m <span class="fu">~</span> <span class="ch">'True</span>
)
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">Proxy</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-&gt;</span> (<span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">Maybe</span> (<span class="dt">MethodOutput</span> s m))) <span class="ot">-&gt;</span> <span class="dt">IO</span> r)
<span class="ot">-></span> <span class="dt">IO</span> r
<span class="co">-- et al.</span></code></pre></div>
<p>Would-be callers attempting to use the wrong function for their method will now be warded off by the type-system, due to the equality constraints being unable to be discharged. Success!</p>
<p>The actual usability of this code leaves much to be desired (it requires being passed a proxy, and the type errors are absolutely <em>disgusting</em>), but we’ll circle back on improving it later. As it stands, this code is type-safe, and that’s good enough for us for the time being.</p>
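<p>For the curious, here is one direction such an improvement could take (a sketch of mine with toy stand-in classes, not necessarily what the post ends up doing): with <code>AllowAmbiguousTypes</code> and <code>TypeApplications</code>, callers can name the method as a type application instead of threading a <code>Proxy</code> argument through:</p>

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}

import GHC.TypeLits (Symbol)

-- Toy stand-ins for the post's classes:
class HasMethod s (m :: Symbol) where
  type MethodInput  s m :: *
  type MethodOutput s m :: *

data EchoService = EchoService
instance HasMethod EchoService "echo" where
  type MethodInput  EchoService "echo" = String
  type MethodOutput EchoService "echo" = String

-- `m` appears only under type families, so it is ambiguous; callers
-- pin it down with a type application rather than a Proxy argument.
runClient
  :: forall m s. HasMethod s m
  => s -> MethodInput s m -> IO (MethodOutput s m)
runClient _ _input = pure (error "network call elided")

-- usage: runClient @"echo" EchoService "hello"
```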
<h2 id="the-server-side">The Server Side</h2>
<h3 id="method-discovery">Method Discovery</h3>
<p>Prepare yourself (but don’t panic!): the server side of things is significantly more involved.</p>
<p>In order to run a server, we’re going to need to be able to handle any sort of request that can be thrown at us. That means we’ll need an arbitrary number of handlers, depending on the service in question. An obvious thought would be to generate a record we could consume that would contain handlers for every method, but there’s no obvious place to generate such a thing. Recall: proto-lens can’t, since such a type would be backend-specific, and so our only other strategy down this path would be Template Haskell. Yuck.</p>
<p>Instead, recall that we have an instance of <code>HasMethod</code> for every method on <code>Service s</code> – maybe we could exploit that information somehow? Unfortunately, without Template Haskell, there’s no way to discover typeclass instances.</p>
<p>But that doesn’t mean we’re stumped. Remember that we control the code generation, and so if the representation we have isn’t powerful enough, we can change it. And indeed, the representation we have isn’t quite enough. We can go from a <code>HasMethod s m</code> to its <code>Service s</code>, but not the other way. So let’s change that.</p>
<p>We change the <code>Service</code> class slightly:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Service</span> s <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServiceName</span><span class="ot"> s ::</span> <span class="dt">Symbol</span>
<span class="kw">type</span> <span class="dt">ServiceMethods</span><span class="ot"> s ::</span> [<span class="dt">Symbol</span>]</code></pre></div>
<p>If we ensure that the <code>ServiceMethods s</code> type family always contains an element for every instance of <code>HasMethod</code>, we’ll be able to use that info to discover our instances. For example, our previous <code>MyService</code> will now get generated thusly:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">MyService</span> <span class="fu">=</span> <span class="dt">MyService</span>
<span class="kw">instance</span> <span class="dt">Service</span> <span class="dt">MyService</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServiceName</span> <span class="dt">MyService</span> <span class="fu">=</span> <span class="st">"myService"</span>
<span class="kw">type</span> <span class="dt">ServiceMethods</span> <span class="dt">MyService</span> <span class="fu">=</span> <span class="ch">'["biDiStreaming"]</span>
<span class="kw">instance</span> <span class="dt">HasMethod</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodInput</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="dt">Request</span>
<span class="kw">type</span> <span class="dt">MethodOutput</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="dt">Response</span>
<span class="kw">type</span> <span class="dt">IsClientStreaming</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="ch">'True</span>
<span class="kw">type</span> <span class="dt">IsServerStreaming</span> <span class="dt">MyService</span> <span class="st">"biDiStreaming"</span> <span class="fu">=</span> <span class="ch">'True</span></code></pre></div>
<p>and we would likewise add the <code>m</code> for any other <code>HasMethod MyService m</code> instances if they existed.</p>
<p>It seems like we can now use <code>ServiceMethods s</code> to get a list of methods, and then somehow type-level <code>map</code> over them to get the <code>HasMethod s m</code> constraints we want.</p>
<p>And we almost can, except that we haven’t told the type-system that <code>ServiceMethods s</code> relates to <code>HasMethod s m</code> instances in this way. We can add a superclass constraint to <code>Service</code> to do this:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">HasAllMethods</span> s (<span class="dt">ServiceMethods</span> s) <span class="ot">=></span> <span class="dt">Service</span> s <span class="kw">where</span>
<span class="co">-- as before</span></code></pre></div>
<p>But what is this <code>HasAllMethods</code> thing? It’s a specialized type-level <code>map</code> which turns our list of methods into a bunch of constraints proving we have <code>HasMethod s m</code> for every <code>m</code> in that promoted list.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">HasAllMethods</span> s (<span class="ot">xs ::</span> [<span class="dt">Symbol</span>])
<span class="kw">instance</span> <span class="dt">HasAllMethods</span> s <span class="ch">'[]</span>
<span class="kw">instance</span> (<span class="dt">HasMethod</span> s x, <span class="dt">HasAllMethods</span> s xs) <span class="ot">=></span> <span class="dt">HasAllMethods</span> s (x <span class="ch">': xs)</span></code></pre></div>
<p>We can think of <code>xs</code> here as the list of constraints we want. Obviously if we don’t want any constraints (the <code>'[]</code> case), we trivially have all of them. The other case is induction: if we have a non-empty list of constraints we’re looking for, that’s the same as looking for the tail of the list, and having the constraint for the head of it.</p>
<p>Read through these instances a few times; make sure you understand the approach before continuing, because we’re going to keep using this technique in scarier and scarier ways.</p>
<p>With this <code>HasAllMethods</code> superclass constraint, we can now convince ourselves (and, more importantly, GHC), that we can go from a <code>Service s</code> constraint to all of its <code>HasMethod s m</code> constraints. Cool!</p>
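<p>To see this induction pattern in isolation, here is a minimal, self-contained sketch (the names are illustrative, not from the library). It has the same base-case/induction-case shape as <code>HasAllMethods</code>, but with a class method we can actually run: collecting the <code>KnownSymbol</code> evidence for every element of a promoted list.</p>

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- The same shape as 'HasAllMethods', but with a method we can run:
-- collect the string for every symbol in the promoted list.
class AllKnown (xs :: [Symbol]) where
  symbolVals :: Proxy xs -> [String]

-- Base case: an empty list of constraints holds trivially.
instance AllKnown '[] where
  symbolVals _ = []

-- Induction: evidence for the head, plus evidence for the tail.
instance (KnownSymbol x, AllKnown xs) => AllKnown (x ': xs) where
  symbolVals _ = symbolVal (Proxy :: Proxy x) : symbolVals (Proxy :: Proxy xs)

main :: IO ()
main = print (symbolVals (Proxy :: Proxy '["biDiStreaming", "anotherMethod"]))
-- prints ["biDiStreaming","anotherMethod"]
```

<p>The <code>'[]</code> instance plays the role of <code>HasAllMethods s '[]</code>, and the <code>(x ': xs)</code> instance peels one constraint off the head of the list exactly as above.</p>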
<h2 id="typing-the-server">Typing the Server</h2>
<p>We return to thinking about how to actually run a server. As we’ve discussed, such a function will need to be able to handle every possible method, and, unfortunately, we can’t pack them into a convenient data structure.</p>
<p>Our actual implementation of such a thing might take a list of handlers. But recall that each handler has different input and output types, as well as different shapes depending on which bits of it are streaming. We can make this approach work by <a href="http://reasonablypolymorphic.com/existentialization/">existentializing</a> away all of the details.</p>
<p>While this works as far as the actual implementation of the underlying gRPC goes, we’re left with a great sense of uneasiness. We have no guarantees that we’ve provided a handler for every method, and the very nature of existentialization means we have absolutely no guarantees that any of these things are the right type.</p>
<p>Our only recourse is to somehow use our <code>Service s</code> constraint to put a prettier facade in front of this ugly-if-necessary implementation detail.</p>
<p>The actual interface we’ll eventually provide will, for example, for a service with two methods, look like this:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">runServer ::</span> <span class="dt">HandlerForMethod1</span> <span class="ot">-></span> <span class="dt">HandlerForMethod2</span> <span class="ot">-></span> <span class="dt">IO</span> ()</code></pre></div>
<p>Of course, we can’t know a priori how many methods there will be (or what type their handlers should have, for that matter). We’ll somehow need to extract this information from <code>Service s</code> – which is why we previously spent so much effort on making the methods discoverable.</p>
<p>The technique we’ll use is the same one you’ll find yourself using again and again when you’re programming at the type-level. We’ll make a typeclass with an associated type family, and then provide a base case and an induction case.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">HasServer</span> s (<span class="ot">xs ::</span> [<span class="dt">Symbol</span>]) <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s<span class="ot"> xs ::</span> <span class="fu">*</span></code></pre></div>
<p>We need to make the methods <code>xs</code> explicit as parameters in the typeclass, so that we can reduce them. The base case is simple – a server with no more handlers is just an IO action:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">HasServer</span> s <span class="ch">'[] where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s <span class="ch">'[] = IO ()</span></code></pre></div>
<p>The induction case, however, is much more interesting:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> ( <span class="dt">HasMethod</span> s x
, <span class="dt">HasMethodHandler</span> s x
, <span class="dt">HasServer</span> s xs
) <span class="ot">=></span> <span class="dt">HasServer</span> s (x <span class="ch">': xs) where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s (x <span class="ch">': xs) = MethodHandler s x -> ServerType s xs</span></code></pre></div>
<p>The idea is that as we pull methods <code>x</code> off our list of methods to handle, we build a function type that takes a value of the correct type to handle method <code>x</code>, which will take another method off the list until we’re out of methods to handle. This is exactly a type-level fold over a list.</p>
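<p>A toy version of this type-level fold, runnable on its own (illustrative names, not from the library): fold a promoted list of argument types into the function type <code>x1 -> x2 -> ... -> [String]</code>, collecting each argument’s <code>show</code> along the way.</p>

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Kind (Type)
import Data.Proxy (Proxy (..))

-- A type-level fold over a list of types, just like 'ServerType':
-- each element of 'xs' contributes one more argument to the function type.
class Collect (xs :: [Type]) where
  type CollectType xs :: Type
  collect :: Proxy xs -> [String] -> CollectType xs

-- Base case: no arguments left; return what we accumulated.
instance Collect '[] where
  type CollectType '[] = [String]
  collect _ acc = reverse acc

-- Induction case: take one argument of type 'x', then recurse on the tail.
instance (Show x, Collect xs) => Collect (x ': xs) where
  type CollectType (x ': xs) = x -> CollectType xs
  collect _ acc x = collect (Proxy :: Proxy xs) (show x : acc)

main :: IO ()
main = print (collect (Proxy :: Proxy '[Int, Bool]) [] 5 True)
-- prints ["5","True"]
```

<p>Here the explicit <code>Proxy xs</code> parameter (and the <code>[String]</code> accumulator) foreshadows the accumulator trick <code>runServerImpl</code> will need below.</p>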
<p>The only remaining question is “what is this <code>MethodHandler</code> thing?” It’s going to have to be a type family that will give us back the correct type for the handler under consideration. Such a type will need to dispatch on the streaming variety as well as the request and response, so we’ll define it as follows, and go back and fix <code>HasServer</code> later.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">HasMethodHandler</span> input output cs ss <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodHandler</span> input output cs<span class="ot"> ss ::</span> <span class="fu">*</span></code></pre></div>
<p><code>cs</code> and <code>ss</code> refer to whether we’re looking for client-streaming and/or server-streaming types, respectively.</p>
<p>Such a thing could be a type family, but isn’t because we’ll need its class-ness later in order to actually provide an implementation of all of this stuff. We provide the following instances:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="co">-- non-streaming</span>
<span class="kw">instance</span> <span class="dt">HasMethodHandler</span> input output <span class="ch">'False '</span><span class="dt">False</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodHandler</span> input output <span class="ch">'False '</span><span class="dt">False</span> <span class="fu">=</span>
input <span class="ot">-></span> <span class="dt">IO</span> output
<span class="co">-- server-streaming</span>
<span class="kw">instance</span> <span class="dt">HasMethodHandler</span> input output <span class="ch">'False '</span><span class="dt">True</span> <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">MethodHandler</span> input output <span class="ch">'False '</span><span class="dt">True</span> <span class="fu">=</span>
input <span class="ot">-></span> (output <span class="ot">-></span> <span class="dt">IO</span> ()) <span class="ot">-></span> <span class="dt">IO</span> ()
<span class="co">-- etc for client and bidi streaming</span></code></pre></div>
<p>With <code>MethodHandler</code> now powerful enough to give us the types we want for handlers, we can go back and fix <code>HasServer</code> so it will compile again:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> ( <span class="dt">HasMethod</span> s x
, <span class="dt">HasMethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
, <span class="dt">HasServer</span> s xs
) <span class="ot">=></span> <span class="dt">HasServer</span> s (x <span class="ch">': xs) where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s (x <span class="ch">': xs)</span>
<span class="fu">=</span> <span class="dt">MethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
<span class="ot">-></span> <span class="dt">ServerType</span> s xs</code></pre></div>
<p>It’s not pretty, but it works! We can convince ourselves of this by asking ghci:</p>
<pre><code>ghci> :kind! ServerType MyService (ServiceMethods MyService)
(Request -> (Response -> IO ()) -> IO ()) -> IO () :: *</code></pre>
<p>and, if we had other methods defined for <code>MyService</code>, they’d show up here with the correct handler type, in the order they were listed in <code>ServiceMethods MyService</code>.</p>
<h2 id="implementing-the-server">Implementing the Server</h2>
<p>Our <code>ServerType</code> family now expands to a function type which takes a handler value (of the correct type) for every method on our service. That turns out to be more than half the battle – all we need to do now is to provide a value of this type.</p>
<p>The generation of such a value is going to need to proceed in perfect lockstep with the generation of its type, so we add to the definition of <code>HasServer</code>:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">HasServer</span> s (<span class="ot">xs ::</span> [<span class="dt">Symbol</span>]) <span class="kw">where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s<span class="ot"> xs ::</span> <span class="fu">*</span>
<span class="ot"> runServerImpl ::</span> [<span class="dt">AnyHandler</span>] <span class="ot">-></span> <span class="dt">ServerType</span> s xs</code></pre></div>
<p>What is this <code>[AnyHandler]</code> thing, you might ask. It’s an explicit accumulator for existentialized handlers we’ve collected during the fold over <code>xs</code>. It’ll make sense when we look at the induction case.</p>
<p>For now, however, the base case is trivial as always:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">HasServer</span> s <span class="ch">'[] where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s <span class="ch">'[] = IO ()</span>
runServerImpl handlers <span class="fu">=</span> runGRPCServer handlers</code></pre></div>
<p>where <code>runGRPCServer</code> is the underlying server provided by Awake’s library.</p>
<p>We move to the induction case:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> ( <span class="dt">HasMethod</span> s x
, <span class="dt">HasMethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
, <span class="dt">HasServer</span> s xs
) <span class="ot">=></span> <span class="dt">HasServer</span> s (x <span class="ch">': xs) where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s (x <span class="ch">': xs)</span>
<span class="fu">=</span> <span class="dt">MethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
<span class="ot">-></span> <span class="dt">ServerType</span> s xs
runServerImpl handlers f <span class="fu">=</span> runServerImpl (existentialize f <span class="fu">:</span> handlers)</code></pre></div>
<p>where <code>existentialize</code> is a new class method we add to <code>HasMethodHandler</code>. We will elide it here because it is just a function <code>MethodHandler i o cs ss -> AnyHandler</code> and is not particularly interesting if you’re familiar with existentialization.</p>
<p>It’s evident here what I meant by <code>handlers</code> being an explicit accumulator – our recursion adds the parameters it receives into this list so that it can pass them eventually to the base case.</p>
<p>There’s a problem here, however. Reading through this implementation of <code>runServerImpl</code>, you and I both know what the right-hand side means; unfortunately, GHC isn’t as clever as we are. If you try to compile it right now, GHC will complain about the non-injectivity of <code>ServerType</code> as implied by the call to <code>runServerImpl</code> (and also about <code>HasMethodHandler</code> and <code>existentialize</code>, but for the exact same reason).</p>
<p>The problem is that there’s nothing constraining the type variables <code>s</code> and <code>xs</code> on <code>runServerImpl</code>. I always find this error confusing (and I suspect everyone does), because in my mind it’s perfectly clear from the <code>HasServer s xs</code> in the instance constraint. However, because <code>ServerType</code> is a type family without an injectivity annotation, we can’t learn <code>s</code> and <code>xs</code> from <code>ServerType s xs</code>.</p>
<p>Let’s see why. For a very simple example, let’s look at the following type family:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">NotInjective</span> a <span class="kw">where</span>
<span class="dt">NotInjective</span> <span class="dt">Int</span> <span class="fu">=</span> ()
<span class="dt">NotInjective</span> <span class="dt">Bool</span> <span class="fu">=</span> ()</code></pre></div>
<p>Here we have <code>NotInjective Int ~ ()</code> and <code>NotInjective Bool ~ ()</code>, which means even if we know <code>NotInjective a ~ ()</code> it doesn’t mean that we know what <code>a</code> is – it could be either <code>Int</code> or <code>Bool</code>.</p>
<p>This is the exact problem we have with <code>runServerImpl</code>: even though we know what type <code>runServerImpl</code> has (it must be <code>ServerType s xs</code>, so that the type on the left-hand side of the equality is the same as on the right), that doesn’t mean we know what <code>s</code> and <code>xs</code> are! The solution is to explicitly tell GHC via a type signature or type application:</p>
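<p>The same ambiguity can be boiled down to a tiny runnable example (illustrative, not from the library): nothing about <code>typeName</code>’s arguments or result mentions <code>a</code>, so GHC can never infer it, and only a type application at the call site can pin it down.</p>

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
module Main where

import Data.Typeable (Proxy (..), Typeable, typeRep)

-- Ambiguous in the same way 'runServerImpl' is: 'a' appears only in the
-- constraint, never in an argument or the result.
typeName :: forall a. Typeable a => String
typeName = show (typeRep (Proxy :: Proxy a))

main :: IO ()
main = do
  putStrLn (typeName @Bool)   -- prints "Bool"
  putStrLn (typeName @[Int])  -- prints "[Int]"
```

<p>Without the <code>@Bool</code> application, GHC rejects the call with exactly the sort of ambiguous-type-variable error described above.</p>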
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">instance</span> ( <span class="dt">HasMethod</span> s x
, <span class="dt">HasMethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
, <span class="dt">HasServer</span> s xs
) <span class="ot">=></span> <span class="dt">HasServer</span> s (x <span class="ch">': xs) where</span>
<span class="kw">type</span> <span class="dt">ServerType</span> s (x <span class="ch">': xs)</span>
<span class="fu">=</span> <span class="dt">MethodHandler</span> (<span class="dt">MethodInput</span> s x)
(<span class="dt">MethodOutput</span> s x)
(<span class="dt">IsClientStreaming</span> s x)
(<span class="dt">IsServerStreaming</span> s x)
<span class="ot">-></span> <span class="dt">ServerType</span> s xs
runServerImpl handlers f <span class="fu">=</span> runServerImpl <span class="fu">@</span>s <span class="fu">@</span>xs (existentialize f <span class="fu">:</span> handlers)</code></pre></div>
<p>(For those of you playing along at home, you’ll need to type-apply the monstrous <code>MethodInput</code> and friends to the <code>existentialize</code> as well.)</p>
<p>And finally, we’re done! We can slap a prettier interface in front of this <code>runServerImpl</code> to fill in some of the implementation details for us:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runServer
<span class="ot"> ::</span> forall s
<span class="fu">.</span> ( <span class="dt">Service</span> s
, <span class="dt">HasServer</span> s (<span class="dt">ServiceMethods</span> s)
)
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">ServerType</span> s (<span class="dt">ServiceMethods</span> s)
runServer _ <span class="fu">=</span> runServerImpl <span class="fu">@</span>s <span class="fu">@</span>(<span class="dt">ServiceMethods</span> s) []</code></pre></div>
<p>Sweet and typesafe! Yes!</p>
<h2 id="client-side-usability">Client-side Usability</h2>
<p>Sweet and typesafe all of this might be, but the user-friendliness on the client-side leaves a lot to be desired. As promised, we’ll address that now.</p>
<h3 id="removing-proxies">Removing Proxies</h3>
<p>Recall that the <code>runNonStreamingClient</code> function and its friends require a <code>Proxy m</code> parameter in order to specify the method you want to call. However, <code>m</code> has kind <code>Symbol</code>, and thankfully we have some new extensions in GHC for turning <code>Symbol</code>s into values.</p>
<p>We can define a new type, isomorphic to <code>Proxy</code>, but which packs the fact that it is a <code>KnownSymbol</code> (something we can turn into a <code>String</code> at runtime):</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">WrappedMethod</span> (<span class="ot">sym ::</span> <span class="dt">Symbol</span>) <span class="kw">where</span>
<span class="dt">WrappedMethod</span><span class="ot"> ::</span> <span class="dt">KnownSymbol</span> sym <span class="ot">=></span> <span class="dt">WrappedMethod</span> sym</code></pre></div>
<p>We change our <code>run*Client</code> friends to take this <code>WrappedMethod m</code> instead of the <code>Proxy m</code> they used to:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runNonStreamingClient
<span class="ot"> ::</span> ( <span class="dt">HasMethod</span> s m
, <span class="dt">IsClientStreaming</span> s m <span class="fu">~</span> <span class="ch">'False</span>
, <span class="dt">IsServerStreaming</span> s m <span class="fu">~</span> <span class="ch">'False</span>
)
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">WrappedMethod</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-></span> <span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">MethodOutput</span> s m))</code></pre></div>
<p>and, with this change in place, we’re ready for the magic syntax I promised earlier.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">import </span><span class="dt">GHC.OverloadedLabel</span>
<span class="kw">instance</span> ( <span class="dt">KnownSymbol</span> sym
, sym <span class="fu">~</span> sym'
) <span class="ot">=></span> <span class="dt">IsLabel</span> sym (<span class="dt">WrappedMethod</span> sym') <span class="kw">where</span>
fromLabel _ <span class="fu">=</span> <span class="dt">WrappedMethod</span></code></pre></div>
<p>This <code>sym ~ sym'</code> thing is known as the <a href="http://chrisdone.com/posts/haskell-constraint-trick">constraint trick for instances</a>, and is necessary here to convince GHC that this can be the only possible instance of <code>IsLabel</code> that will give you back <code>WrappedMethod</code>s.</p>
<p>Now turning on the <code>{-# LANGUAGE OverloadedLabels #-}</code> pragma, we’ve changed the syntax to call these client functions from the ugly:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runBiDiStreamingClient <span class="dt">MyService</span> (<span class="dt">Proxy</span> <span class="fu">@</span><span class="st">"biDiStreaming"</span>)</code></pre></div>
<p>into the much nicer:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runBiDiStreamingClient <span class="dt">MyService</span> <span class="fu">#</span>biDiStreaming</code></pre></div>
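<p>For reference, the whole labels mechanism fits in a few self-contained lines. This sketch targets GHC 8.2 or later, where <code>fromLabel</code> takes no <code>Proxy#</code> argument (on GHC 8.0 you would write <code>fromLabel _ = WrappedMethod</code> instead); <code>methodName</code> is an illustrative helper, not part of the library.</p>

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.OverloadedLabels (IsLabel (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Isomorphic to Proxy, but packing the KnownSymbol dictionary.
data WrappedMethod (sym :: Symbol) where
  WrappedMethod :: KnownSymbol sym => WrappedMethod sym

-- The constraint trick: match any label, then force the symbols equal.
instance (KnownSymbol sym, sym ~ sym') => IsLabel sym (WrappedMethod sym') where
  fromLabel = WrappedMethod

-- Pattern matching on the GADT brings 'KnownSymbol sym' back into scope.
methodName :: forall sym. WrappedMethod sym -> String
methodName WrappedMethod = symbolVal (Proxy :: Proxy sym)

main :: IO ()
main = putStrLn (methodName #biDiStreaming)  -- prints "biDiStreaming"
```

<p>Because the runtime <code>String</code> is recoverable via <code>symbolVal</code>, this is a drop-in replacement for the <code>Proxy</code>-based API.</p>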
<h3 id="better-wrong-streaming-variety-errors">Better “Wrong Streaming Variety” Errors</h3>
<p>The next step in our journey to delightful usability is remembering that the users of our library are only human, and at some point they are going to call the wrong <code>run*Client</code> function on their method with a different variety of streaming semantics.</p>
<p>At the moment, the errors they’re going to get when they try that will be a few stanzas long, the most informative of which will be something along the lines of <code>unable to match 'False with 'True</code>. Yes, it’s technically correct, but it’s entirely useless.</p>
<p>Instead, we can use the <code>TypeError</code> machinery from <code>GHC.TypeLits</code> to make these error messages actually helpful to our users. If you aren’t familiar with it: whenever GHC encounters a <code>TypeError</code> constraint, it will die with an error message of your choosing.</p>
<p>We will introduce the following type family:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">RunNonStreamingClient</span> (<span class="ot">cs ::</span> <span class="dt">Bool</span>) (<span class="ot">ss ::</span> <span class="dt">Bool</span>)<span class="ot"> ::</span> <span class="dt">Constraint</span> <span class="kw">where</span>
<span class="dt">RunNonStreamingClient</span> <span class="ch">'False '</span><span class="dt">False</span> <span class="fu">=</span> ()
<span class="dt">RunNonStreamingClient</span> <span class="ch">'False '</span><span class="dt">True</span> <span class="fu">=</span> <span class="dt">TypeError</span>
( <span class="dt">Text</span> <span class="st">"Called 'runNonStreamingClient' on a server-streaming method."</span>
<span class="fu">:$$:</span> <span class="dt">Text</span> <span class="st">"Perhaps you meant 'runServerStreamingClient'."</span>
)
<span class="dt">RunNonStreamingClient</span> <span class="ch">'True '</span><span class="dt">False</span> <span class="fu">=</span> <span class="dt">TypeError</span>
( <span class="dt">Text</span> <span class="st">"Called 'runNonStreamingClient' on a client-streaming method."</span>
<span class="fu">:$$:</span> <span class="dt">Text</span> <span class="st">"Perhaps you meant 'runClientStreamingClient'."</span>
)
<span class="dt">RunNonStreamingClient</span> <span class="ch">'True '</span><span class="dt">True</span> <span class="fu">=</span> <span class="dt">TypeError</span>
( <span class="dt">Text</span> <span class="st">"Called 'runNonStreamingClient' on a bidi-streaming method."</span>
<span class="fu">:$$:</span> <span class="dt">Text</span> <span class="st">"Perhaps you meant 'runBiDiStreamingClient'."</span>
)</code></pre></div>
<p>The <code>:$$:</code> type operator stacks messages vertically, while <code>:<>:</code> concatenates them horizontally.</p>
<p>We can change the constraints on <code>runNonStreamingClient</code>:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runNonStreamingClient
<span class="ot"> ::</span> ( <span class="dt">HasMethod</span> s m
, <span class="dt">RunNonStreamingClient</span> (<span class="dt">IsClientStreaming</span> s m)
(<span class="dt">IsServerStreaming</span> s m)
)
<span class="ot">=></span> s
<span class="ot">-></span> <span class="dt">WrappedMethod</span> m
<span class="ot">-></span> <span class="dt">MethodInput</span> s m
<span class="ot">-></span> <span class="dt">IO</span> (<span class="dt">Either</span> <span class="dt">GRPCError</span> (<span class="dt">MethodOutput</span> s m))</code></pre></div>
<p>and similarly for our other client functions. Reduction of the resulting boilerplate is left as an exercise to the reader.</p>
<p>With all of this work out of the way, we can test it:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">runNonStreamingClient <span class="dt">MyService</span> <span class="fu">#</span>biDiStreaming</code></pre></div>
<pre><code>Main.hs:45:13: error:
• Called 'runNonStreamingClient' on a bidi-streaming method.
Perhaps you meant 'runBiDiStreamingClient'.
    • In the expression: runNonStreamingClient MyService #biDiStreaming</code></pre>
<p>Amazing!</p>
<h3 id="better-wrong-method-errors">Better “Wrong Method” Errors</h3>
<p>The other class of errors we expect our users to make is to attempt to call a method that doesn’t exist – either because they made a typo, or are forgetful of which methods exist on the service in question.</p>
<p>As it stands, users are likely to get about six stanzas of error messages, from <code>No instance for (HasMethod s m)</code> to <code>Ambiguous type variable 'm0'</code>, and other terrible things that leak our implementation details. Our first thought might be to somehow emit a <code>TypeError</code> constraint if we <em>don’t</em> have a <code>HasMethod s m</code> instance, but I’m not convinced such a thing is possible.</p>
<p>But luckily, we can actually do better than any error messages we could produce in that way. Since our service is driven by a value (in our example, the data constructor <code>MyService</code>), by the time things go wrong we <em>do</em> have a <code>Service s</code> instance in scope. Which means we can look up our <code>ServiceMethods s</code> and give some helpful suggestions about what the user probably meant.</p>
<p>The first step is to implement a <code>ListContains</code> type family so we can determine if the method we’re looking for is actually a real method.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">ListContains</span> (<span class="ot">n ::</span> k) (<span class="ot">hs ::</span> [k])<span class="ot"> ::</span> <span class="dt">Bool</span> <span class="kw">where</span>
<span class="dt">ListContains</span> n <span class="ch">'[] = '</span><span class="dt">False</span>
<span class="dt">ListContains</span> n (n <span class="ch">': hs) = '</span><span class="dt">True</span>
<span class="dt">ListContains</span> n (x <span class="ch">': hs) = ListContains n hs</span></code></pre></div>
<p>In the base case, we have no list to look through, so our needle is trivially not in the haystack. If the head of the list is the thing we’re looking for, then it must be in the list. Otherwise, take off the head of the list and continue looking. Simple really, right?</p>
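<p>Because <code>ListContains</code> reduces entirely at compile time, we can check it without running anything interesting: a program that states the expected reductions as equality constraints will only compile if they hold. A self-contained sketch of that idea:</p>

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Main where

-- The needle-in-haystack family from the post.
type family ListContains (n :: k) (hs :: [k]) :: Bool where
  ListContains n '[]       = 'False
  ListContains n (n ': hs) = 'True
  ListContains n (x ': hs) = ListContains n hs

-- These signatures only typecheck if the family reduces as claimed,
-- so getting this module to compile is itself the test.
found :: ListContains "b" '["a", "b"] ~ 'True => ()
found = ()

missing :: ListContains "c" '["a", "b"] ~ 'False => ()
missing = ()

main :: IO ()
main = print (found, missing)
```

<p>The same trick (asserting type equalities in a signature) is a cheap way to unit-test any closed type family.</p>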
<p>We can now use this thing to generate an error message in the case that the method we’re looking for is not in our list of methods:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">RequireHasMethod</span> s (<span class="ot">m ::</span> <span class="dt">Symbol</span>) (<span class="ot">found ::</span> <span class="dt">Bool</span>)<span class="ot"> ::</span> <span class="dt">Constraint</span> <span class="kw">where</span>
<span class="dt">RequireHasMethod</span> s m <span class="ch">'False = TypeError</span>
( <span class="dt">Text</span> <span class="st">"No method "</span>
<span class="fu">:<>:</span> <span class="dt">ShowType</span> m
<span class="fu">:<>:</span> <span class="dt">Text</span> <span class="st">" available for service '"</span>
<span class="fu">:<>:</span> <span class="dt">ShowType</span> s
<span class="fu">:<>:</span> <span class="dt">Text</span> <span class="st">"'."</span>
<span class="fu">:$$:</span> <span class="dt">Text</span> <span class="st">"Available methods are: "</span>
<span class="fu">:<>:</span> <span class="dt">ShowType</span> (<span class="dt">ServiceMethods</span> s)
)
<span class="dt">RequireHasMethod</span> s m <span class="ch">'True = ()</span></code></pre></div>
<p>If <code>found ~ 'False</code>, then the method <code>m</code> we’re looking for is not part of the service <code>s</code>. We produce a nice error message informing the user about this (using <code>ShowType</code> to expand the type variables).</p>
<p>We will provide a type alias to perform this lookup:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">HasMethod'</span> s m <span class="fu">=</span>
( <span class="dt">RequireHasMethod</span> s m (<span class="dt">ListContains</span> m (<span class="dt">ServiceMethods</span> s))
, <span class="dt">HasMethod</span> s m
)</code></pre></div>
<p>Our new <code>HasMethod' s m</code> has the same shape as <code>HasMethod</code>, but will expand to our custom type error if we’re missing the method under scrutiny.</p>
<p>Replacing all of our old <code>HasMethod</code> constraints with <code>HasMethod'</code> works fantastically:</p>
<pre><code>Main.hs:54:15: error:
• No method "missing" available for service 'MyService'.
Available methods are: '["biDiStreaming"]</code></pre>
<p>Damn near perfect! That list of methods is kind of ugly, though, so we can write a quick pretty printer for showing promoted lists:</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">ShowList</span> (<span class="ot">ls ::</span> [k])<span class="ot"> ::</span> <span class="dt">ErrorMessage</span> <span class="kw">where</span>
<span class="dt">ShowList</span> <span class="ch">'[] = Text ""</span>
<span class="dt">ShowList</span> <span class="ch">'[x] = ShowType x</span>
<span class="dt">ShowList</span> (x <span class="ch">': xs) = ShowType x :<>: Text ", " :<>: ShowList xs</span></code></pre></div>
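<p>The family is a type-level rendition of an ordinary comma-joining function. At the value level, the same three cases read as follows (an illustrative sketch, not part of the library):</p>

```haskell
-- Value-level analogue of the ShowList type family: empty list,
-- singleton, and cons each get their own case.
showItems :: [String] -> String
showItems []     = ""
showItems [x]    = x
showItems (x:xs) = x ++ ", " ++ showItems xs
```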
<p>Replacing our final <code>ShowType</code> with <code>ShowList</code> in <code>RequireHasMethod</code> now gives us error messages of the following:</p>
<pre><code>Main.hs:54:15: error:
• No method "missing" available for service 'MyService'.
Available methods are: "biDiStreaming"</code></pre>
<p>Absolutely gorgeous.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This is where we stop. We’ve used type-level metadata to generate client- and server-side bindings to an underlying library. Everything we’ve made is entirely typesafe, and provides gorgeous, helpful error messages if the user does anything wrong. We’ve found a practical use for many of these seemingly-obscure type-level features, and learned a few things in the process.</p>
<p>In the words of my coworker <a href="https://ren.zone/articles/opaleye-sot">Renzo Carbonara</a><a href="http://reasonablypolymorphic.com/atom.xml#fn1" id="fnref1" class="footnoteRef"><sup>1</sup></a>:</p>
<p>“It is up to us, as people who understand a problem at hand, to try and teach the type system as much as we can about that problem. And when we don’t understand the problem, talking to the type system about it will help us understand. Remember, the type system is not magic, it is a logical reasoning tool.”</p>
<p>This resounds so strongly in my soul, and maybe it will in yours too. If so, I encourage you to go forth and find uses for these techniques to improve the experience and safety of your own libraries.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Whose article “Opaleye’s sugar on top” was a strong inspiration on me, and subsequently on this post.<a href="http://reasonablypolymorphic.com/atom.xml#fnref1">↩</a></p></li>
</ol>
</div>
</div>
<p class="meta">
<span class="prev">
<a href="http://reasonablypolymorphic.com/blog/recursion-schemes">←</a>
</span>
<span class="next">
<a href="http://reasonablypolymorphic.com/blog/difference-of-squares">→</a>
</span>
</p>
</article>
</div>Sat, 18 Nov 2017 00:00:00 +0000Tweag I/O: Parallelising your array codehttp://www.tweag.io/posts/2017-11-16-repa.html
http://www.tweag.io/posts/2017-11-16-repa.html
<div>Manuel M T Chakravarty</div><p><em>This is the fifth post in a series about array programming in Haskell — you might be interested in the <a href="http://www.tweag.io/posts/2017-08-09-array-programming-in-haskell.html">first</a>, <a href="http://www.tweag.io/posts/2017-08-31-hmatrix.html">second</a>, <a href="http://www.tweag.io/posts/2017-09-27-array-package.html">third</a>, and <a href="http://www.tweag.io/posts/2017-10-12-vector-package.html">fourth</a>, too.</em></p>
<p>A recurring theme in array programming is performance. After all, many algorithms in numerical computing and data science are computationally intensive. Once the sequential implementation of an array program has been fully optimised, the natural next step is to use one or multiple forms of parallelism to achieve further performance improvements. This can be parallelism within one computational core (SIMD parallelism), multicore parallelism, or distributed multi-machine parallelism. Unfortunately, at this point matters become much more complicated, because parallel programming comes with its own set of serious challenges.</p>
<p>In this post, we will focus on multicore parallelism for computations operating on multi-dimensional arrays. In other words, in relation to the <code>vector</code> package, which we discussed <a href="http://www.tweag.io/posts/2017-10-12-vector-package.html">in the last post</a>, we have two new ingredients. Firstly, instead of <em>one-dimensional</em> Int-indexed arrays, we have <em>multi-dimensional</em> Int-indexed arrays. Secondly, the collective operations provided on these arrays come with parallel implementations. In fact, the library API is designed to favour collective operations that have good parallel implementations. Similarly, the move to explicitly multi-dimensional arrays is motivated by being able to provide parallel implementations that take the array shape into account, wherever that is an advantage.</p>
<p>To make matters concrete, we will discuss the <a href="https://hackage.haskell.org/package/repa"><code>Repa</code></a> library. Internally it uses many of the same techniques as <code>vector</code>, including <em>strictness</em>, <em>unboxing</em>, and a <em>two-phase</em> initialisation strategy. However, it uses a second array fusion strategy in addition to <code>vector</code>’s <em>stream fusion</em>. More precisely, <code>Repa</code> internally uses <code>vector</code> to represent plain boxed and unboxed arrays and to execute sequential computations on those, which still benefit from stream fusion. However, <code>Repa</code> introduces additional array representations, such as <em>delayed arrays</em>, to also achieve fusion across parallel computations.</p>
<p>This additional complication is necessary as stream fusion, by itself, tends to turn parallel into sequential code. In other words, one of the challenges of high-performance parallel array implementations that are built on collective operations is <em>to apply fusion while preserving parallelism</em>. To really get good performance, we need to optimise along two orthogonal dimensions: get more done simultaneously, by parallelising, but also make each sequential unit of work run faster.</p>
<p>A second consequence of targeting a parallelisation-friendly API is a very limited use of mutable arrays. Mutable structures generally interact badly with concurrency and parallelism, opening the door to a whole range of hard to diagnose faults. In fact, the focus on immutable arrays for parallel programming is <em>one of the most compelling conceptual improvements of functional over imperative parallel array programming</em>. (To be precise, <code>Repa</code>’s API does provide access to the mutable array structures used to implement two-phase initialisation, but it is usually not necessary to use them directly.)</p>
<h2>Multiple dimensions</h2>
<p>The obvious structure for indexing multi-dimensional Int-indexed arrays are tuples of <code>Int</code>s. However, they come with two severe drawbacks: (1) they force us to fix the dimensionality of all functions over arrays and (2) they are not sufficient to characterise operations on lower-dimensional subarrays of an array (e.g., a two-dimensional plane within a three-dimensional cube).</p>
<p>As an example of the first drawback, consider a fold function that, given a three-dimensional cube, reduces it along, say, the x-axis to a two-dimensional plane of sums. The only difference between that operation and a fold that sums a two-dimensional plane across one axis to a one-dimensional vector is the number of dimensions that we do not reduce along. Now, we could have a family of fold functions (<code>fold1</code>, <code>fold2</code>, and so on), one for each possible dimensionality of the argument array. But that is hardly satisfactory.</p>
<p>Instead, Repa uses a custom datatype for indexing. Index types are built from the infix constructor <code>(:.)</code> and the constant <code>Z</code>, representing a zero-dimensional array (the special case of a singleton array). For example, the type of two-dimensional indices is <code>Z :. Int :. Int</code> and one of its values is <code>Z :. 3 :. 5</code>. By using a type variable instead of <code>Z</code>, we can denote indices with a particular minimum dimensionality. For instance, <code>sh :. Int</code> has at least one dimension, but it might have more, depending on how the type variable <code>sh</code> is instantiated — in any case, instances of <code>sh</code> need to be drawn from the class <code>Shape</code>. On the basis of this index representation, we can capture the entire family of multi-dimensional fold functions in a single type:</p>
<pre><code>foldS :: (Shape sh, Source r a, Unbox a)
=> (a -> a -> a) -> a -> Array r (sh :. Int) a -> Array U sh a
</code></pre>
<p>The function <code>foldS</code> implements a sequential, multi-dimensional reduction; hence, the <code>S</code> suffix. It gets three arguments:</p>
<ol>
<li><code>a -> a -> a</code> is the type of the binary reduction function, which needs to be associative,</li>
<li><code>a</code> is the reduction function’s neutral element (i.e., together they form a monoid), and</li>
<li><code>Array r (sh :. Int) a</code> is an at least one-dimensional array of elements of type <code>a</code>, which the type constraint <code>Unbox a</code> requires to be a type that has an associated unboxed representation.</li>
</ol>
<p>Finally, the result of type <code>Array U sh a</code> has one dimension less than the argument array, but contains elements of the same type <code>a</code>. This leaves us wondering about the meaning of the first type argument of <code>Array</code> — <code>r</code> and <code>U</code>, respectively — as well as the type constraint <code>Source r a</code>.</p>
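<p>To make the snoc-list indices concrete, here is a tiny standalone sketch of the <code>Z</code> and <code>(:.)</code> encoding. The names mirror Repa's, but the code below is a self-contained toy, not the repa package itself:</p>

```haskell
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}

-- Snoc-list indices in the style of Repa: Z is the zero-dimensional
-- index, and (:.) adds one dimension on the right.
data Z = Z deriving (Show, Eq)
data tail :. head = tail :. head deriving (Show, Eq)
infixl 3 :.

-- The number of elements a shape covers: the product of its extents.
class Shape sh where
  size :: sh -> Int

instance Shape Z where
  size Z = 1

instance Shape sh => Shape (sh :. Int) where
  size (sh :. n) = size sh * n

main :: IO ()
main = print (size (Z :. (3 :: Int) :. (5 :: Int)))
```

<p>Here <code>size (Z :. (3 :: Int) :. (5 :: Int))</code> evaluates to <code>15</code>, the number of elements of a 3×5 array, and the same <code>size</code> works for any dimensionality, which is exactly the shape polymorphism the post describes.</p>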
<h2>Indexed arrays</h2>
<p>The first type argument of <code>Array</code> determines the array <em>representation</em>. The available representations include boxed (<code>V</code>) and unboxed (<code>U</code>) representations, but also <em>delayed</em> (<code>D</code>) and <em>cursored</em> (<code>C</code>) representations. The latter are guaranteed to be removed by fusion, but can lead to the superfluous recomputation of array elements that are used more than once. Repa makes the choice of representation explicit to place it under programmer control — experience shows that compiler heuristics for automatic representation selection tend to be fragile and unreliable.</p>
<p>A consequence of a representation that is fused away, such as delayed <code>D</code> and cursored <code>C</code>, is that it can only be a data <code>Source</code> of a computation. Hence, the type class of the same name provides elementary array access functions for arrays. The opposite, a <code>Target</code>, provides the functionality to fill an array as part of two-phase initialisation and is only available to <em>manifest</em> representations, such as the boxed <code>V</code> and unboxed <code>U</code> representation. A manifest representation is one which, in contrast to a fused-away delayed representation, is actually stored in memory.</p>
<p>In addition to concrete representations, Repa representation tags can also include meta information, such as the <em>interleaving</em> hint <code>I</code>. An array tagged <code>I U</code> uses an unboxed interleaved representation, which improves parallel load balancing in parallel computations where the amount of work strongly varies between different regions in the parallel array. A standard example is computing a Mandelbrot set, where black pixels are significantly more expensive than others.</p>
<h2>Parallelism</h2>
<p>As we saw above with <code>foldS</code>, Repa follows the convention of adding an <code>S</code> to sequential array operations. Similarly, it uses a <code>P</code> as a suffix for parallel functions. For example, we have</p>
<pre><code>foldP :: (Shape sh, Source r a, Unbox a, Monad m)
=> (a -> a -> a) -> a -> Array r (sh :. Int) a -> m (Array U sh a)
</code></pre>
<p>for the parallel version of fold. The distinction between sequential and parallel functions is an important one, since Repa does not support nested parallelism. That is, a parallel function (e.g., <code>foldP</code>) cannot use another parallel function as an argument (e.g., as the combination function).</p>
<p>In addition to the suffix, the parallel fold distinguishes itself from the sequential one by the use of an otherwise unspecified monad. The purpose of this monad is to ensure the one-by-one execution of pipelines of parallel computations. This is important to prevent inadvertent nesting of parallel computations: as Haskell is a lazy language, we might otherwise feed a suspended (i.e., not yet evaluated) parallel computation into another parallel computation.</p>
<h2>Parallel matrix multiplication</h2>
<p>As a simple example of a parallel computation, consider the multiplication of two matrices <code>arr</code> and <code>brr</code> of type <code>Array U DIM2 Double</code> (two-dimensional, unboxed arrays), where <code>type DIM2 = Z :. Int :. Int</code>:</p>
<pre><code>mmultP :: Monad m
=> Array U DIM2 Double
-> Array U DIM2 Double
-> m (Array U DIM2 Double)
mmultP arr brr
= do trr <- transpose2P brr
computeP (fromFunction (Z :. h1 :. w2) dotp)
where
(Z :. h1 :. _) = extent arr
(Z :. _ :. w2) = extent brr
dotp ix = sumAllS $
zipWith (*)
(slice arr (Any :. (row ix) :. All))
(slice trr (Any :. (col ix) :. All))
</code></pre>
<p>We assume the existence of a helper function <code>transpose2P</code>, which transposes a matrix in parallel — for example, by using Repa’s <code>backpermute</code> function. Then, we generate the manifest result array by computing all elements of <code>fromFunction (Z :. h1 :. w2) dotp</code> in parallel with <code>computeP</code>. The shape (i.e., the size of the dimensions) of the result is <code>h1</code> times <code>w2</code>, and <code>fromFunction</code> turns a function, which takes an array index to the corresponding array element, into a delayed array:</p>
<pre><code>fromFunction :: sh -> (sh -> a) -> Array D sh a
</code></pre>
<p>At each index <code>ix</code> of the resulting array, we evaluate <code>dotp</code>, which only involves a sequential computation. Its sequential nature is important for two reasons. Firstly, as mentioned, Repa does not support nested parallelism, so the computations on each result array index triggered by <code>computeP</code> in parallel may themselves not be parallel. Secondly, the work complexity of matrix multiplication is <em>n</em><sup>3</sup> — that is, the number of scalar multiplications that need to be performed. Performing them all in parallel would lead to (a) too much and (b) too fine-grained parallelism. Both too much parallelism and parallel workloads that each comprise too little work lead to bad performance, as they result in too much administrative overhead.</p>
<p>In contrast, the sequential computation performed by <code>dotp</code> obtains a row of the matrix <code>arr</code> and a column of <code>brr</code> (actually, a row of the transposed <code>brr</code>, which is <code>trr</code>) with <code>slice</code>, which extracts an entire subarray from an array. Then, it multiplies the row and column pointwise with <code>zipWith (*)</code> and sums up the products with <code>sumAllS</code>, where</p>
<pre><code>zipWith :: (Shape sh, Source r1 a, Source r2 b)
=> (a -> b -> c) -> Array r1 sh a -> Array r2 sh b -> Array D sh c
sumAllS :: (Shape sh, Source r a, Num a) => Array r sh a -> a
</code></pre>
<p>This example highlights how reasoning about the decomposition of an algorithm into parallel and sequential components is crucial for good parallel performance. This is assisted by Repa’s clear distinction between sequential and parallel operations.</p>
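<p>The decomposition is easy to see in a plain list-of-lists sketch of the same algorithm (ordinary Haskell lists rather than repa arrays, purely for illustration):</p>

```haskell
import Data.List (transpose)

-- Mirrors the structure of mmultP: transpose the right matrix once,
-- then every result cell is an independent, sequential dot product
-- of a row of the left matrix and a row of the transposed right one.
mmult :: Num a => [[a]] -> [[a]] -> [[a]]
mmult arr brr = [ [ sum (zipWith (*) row col) | col <- trr ] | row <- arr ]
  where trr = transpose brr
```

<p>In the parallel version, the outer list comprehension corresponds to the work <code>computeP</code> distributes across cores, while each inner <code>sum (zipWith (*) …)</code> corresponds to the sequential <code>dotp</code>.</p>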
<h2>Further reading</h2>
<p>Repa went through three major iterations before arriving at the current interface. The underlying concepts are described and supported by benchmarks in the papers <a href="http://benl.ouroborus.net/papers/2010-rarrays/repa-icfp2010.pdf">Regular, shape-polymorphic, parallel arrays in Haskell</a>, <a href="http://benl.ouroborus.net/papers/2011-stencil/stencil-haskell2011.pdf">Efficient Parallel Stencil Convolution in Haskell</a>, and <a href="http://benl.ouroborus.net/papers/2012-guiding/guiding-Haskell2012.pdf">Guiding Parallel Array Fusion with Indexed Types</a>, respectively. In addition, <a href="http://benl.ouroborus.net/papers/2013-series/flow-Haskell2013-rev1.pdf">Data Flow Fusion with Series Expressions in Haskell</a> proposes a further improvement to the fusion system. However, this has not been integrated into the main package.</p>Thu, 16 Nov 2017 00:00:00 +0000Jeremy Gibbons: The Digits of Pihttp://patternsinfp.wordpress.com/?p=316
https://patternsinfp.wordpress.com/2017/11/09/the-digits-of-pi/
<p>
In <a href="https://patternsinfp.wordpress.com/2017/10/04/metamorphisms/">the previous post</a> we were introduced to <em>metamorphisms</em>, which consist of an unfold after a fold—typically on lists, and the fold part typically a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />. A canonical example is the conversion of a fraction from one base to another. For simplicity, let’s consider here only infinite fractions, so we don’t have to deal with the end of the input and flushing the state: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%7D%7B%5Cmathit%7Bstream%7D+%3A%3A+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5Calpha+%5Crightarrow+%5Cbeta%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Calpha%5D+%5Crightarrow+%5B%5Cgamma%5D%7D+%5C%5C+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3Bb%5C%3Bx+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bg%5C%3Bb%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%26%5Crightarrow%26+c+%3A+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3Bb%27%5C%3Bx+%5C%5C+%5Cmathit%7BNothing%7D+%26%5Crightarrow%26+%5Cmathbf%7Bcase%7D%5C%3Bx%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+a%3Ax%27+%26%5Crightarrow%26+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3B%28f%5C%3Bb%5C%3Ba%29%5C%3Bx%27+%5Cend%7Barray%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \end{array} \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ 
\mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \end{array} \end{array} \end{array} " />
</p></blockquote>
<p> So for example, we can convert an infinite fraction in base 3 to one in base 7 with </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bstream%7D%5C%3B%5Cmathit%7Bnext%7D%5C%3B%5Cmathit%7Bstep%7D%5C%3B%280%2C1%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{stream}\;\mathit{next}\;\mathit{step}\;(0,1) " class="latex" title="\displaystyle \mathit{stream}\;\mathit{next}\;\mathit{step}\;(0,1) " />
</p></blockquote>
<p> where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bnext%7D%5C%3B%28u%2Cv%29+%26%3D%26+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%5Cmathbf%7Blet%7D%5C%3By+%3D+%5Clfloor%7B7+%5Ctimes+u+%5Ctimes+v%7D%5Crfloor%5C%3B%5Cmathbf%7Bin%7D+%5C%5C+%5Cmathbf%7Bif%7D%5C%3B%5Clfloor%7By%7D%5Crfloor+%3D+%5Clfloor%7B7+%5Ctimes+%28u%2B1%29+%5Ctimes+v%7D%5Crfloor%5C%3B%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%40%7B%5C%3B%7Dl%7D%5Cmathbf%7Bthen%7D%26%5Cmathit%7BJust%7D%5C%3B%28y%2C%28u+-+y%2F%28v+%5Ctimes+7%29%2C+v+%5Ctimes+7%29%29%5C%5C%5Cmathbf%7Belse%7D%26%5Cmathit%7BNothing%7D+%5C%5C+%5Cend%7Barray%7D+%5Cend%7Barray%7D+%5C%5C+%5Cmathit%7Bstepl%7D%5C%3B%28u%2Cv%29%5C%3Bd+%26%3D%26+%28u+%5Ctimes+3+%2B+d%2C+v+%2F+3%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{next}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathbf{if}\;\lfloor{y}\rfloor = \lfloor{7 \times (u+1) \times v}\rfloor\;\begin{array}[t]{@{}l@{\;}l}\mathbf{then}&\mathit{Just}\;(y,(u - y/(v \times 7), v \times 7))\\\mathbf{else}&\mathit{Nothing} \\ \end{array} \end{array} \\ \mathit{stepl}\;(u,v)\;d &=& (u \times 3 + d, v / 3) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{next}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathbf{if}\;\lfloor{y}\rfloor = \lfloor{7 \times (u+1) \times v}\rfloor\;\begin{array}[t]{@{}l@{\;}l}\mathbf{then}&\mathit{Just}\;(y,(u - y/(v \times 7), v \times 7))\\\mathbf{else}&\mathit{Nothing} \\ \end{array} \end{array} \\ \mathit{stepl}\;(u,v)\;d &=& (u \times 3 + d, v / 3) \end{array} " />
</p></blockquote>
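<p>For readers who want to run this, here is a plain-Haskell transcription of the definitions displayed above, using <code>Rational</code> for the state to avoid rounding error. (The call above says <code>step</code> where the definition says <code>stepl</code>; we use <code>stepl</code> throughout, and add a clause for finite input that the infinite-fraction setting does not need.)</p>

```haskell
-- stream: an unfold (via next) after a foldl (via stepl), interleaved.
stream :: (b -> Maybe (c, b)) -> (b -> a -> b) -> b -> [a] -> [c]
stream g f b xs = case g b of
  Just (c, b') -> c : stream g f b' xs    -- produce output while we safely can
  Nothing      -> case xs of
    a : xs' -> stream g f (f b a) xs'     -- otherwise consume an input digit
    []      -> []                         -- finite input (not needed in the post)

type State = (Rational, Rational)         -- (u, v) represents (v*) . (u+)

-- Base-3 fraction in, base-7 fraction out.
next :: State -> Maybe (Integer, State)
next (u, v) =
  let y = floor (7 * u * v)
  in if y == floor (7 * (u + 1) * v)      -- next output digit is determined
       then Just (y, (u - fromInteger y / (v * 7), v * 7))
       else Nothing

stepl :: State -> Integer -> State
stepl (u, v) d = (u * 3 + fromInteger d, v / 3)

main :: IO ()
main = print (take 5 (stream next stepl (0, 1) (1 : repeat 0)))
```

<p>For example, <code>take 5 (stream next stepl (0, 1) (1 : repeat 0))</code> converts the base-3 fraction 0.1000… (that is, 1/3) and yields <code>[2,2,2,2,2]</code>, the beginning of 0.2222… in base 7.</p>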
<p> In this post, we’ll see another number conversion problem, which will deliver the digits of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cpi%7D&bg=ffffff&fg=000000&s=0" alt="{\pi}" class="latex" title="{\pi}" />. For more details, see <a href="https://www.cs.ox.ac.uk/publications/publication1674-abstract.html">my paper</a>—although the presentation here is simpler now.</p>
<h2>Series for pi</h2>
<p>
Leibniz showed that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cdisplaystyle+%5Cfrac%7B%5Cpi%7D%7B4%7D+%3D+%5Csum_%7Bi%3D0%7D%5E%7B%5Cinfty%7D+%5Cfrac%7B%28-1%29%5Ei%7D%7B2i%2B1%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \displaystyle \frac{\pi}{4} = \sum_{i=0}^{\infty} \frac{(-1)^i}{2i+1} " class="latex" title="\displaystyle \displaystyle \frac{\pi}{4} = \sum_{i=0}^{\infty} \frac{(-1)^i}{2i+1} " />
</p></blockquote>
<p> From this, using Euler’s convergence-accelerating transformation, one may derive </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cpi+%3D+%5Csum_%7Bi%3D0%7D%5E%7B%5Cinfty%7D+%5Cfrac%7B%28i%21%29%5E2%5C%2C2%5E%7Bi%2B1%7D%7D%7B%282i%2B1%29%21%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \pi = \sum_{i=0}^{\infty} \frac{(i!)^2\,2^{i+1}}{(2i+1)!} " class="latex" title="\displaystyle \pi = \sum_{i=0}^{\infty} \frac{(i!)^2\,2^{i+1}}{(2i+1)!} " />
</p></blockquote>
<p> or equivalently </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cpi+%3D+2+%2B+%5Cfrac%7B1%7D%7B3%7D+%5Ctimes+%5Cbiggl%282+%2B+%5Cfrac%7B2%7D%7B5%7D%5Ctimes+%5Cbiggl%282+%2B+%5Cfrac%7B3%7D%7B7%7D%5Ctimes+%5Cbiggl%28+%5Ccdots+%5Cbiggl%282+%2B+%5Cfrac%7Bi%7D%7B2i%2B1%7D%5Ctimes+%5Cbiggl%28%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr) " class="latex" title="\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> This can be seen as the number <img src="https://s0.wp.com/latex.php?latex=%7B%282%3B2%2C2%2C2...%29%7D&bg=ffffff&fg=000000&s=0" alt="{(2;2,2,2...)}" class="latex" title="{(2;2,2,2...)}" /> in a funny mixed-radix base <img src="https://s0.wp.com/latex.php?latex=%7B%28%5Cfrac%7B1%7D%7B3%7D%2C+%5Cfrac%7B2%7D%7B5%7D%2C+%5Cfrac%7B3%7D%7B7%7D...%29%7D&bg=ffffff&fg=000000&s=0" alt="{(\frac{1}{3}, \frac{2}{5}, \frac{3}{7}...)}" class="latex" title="{(\frac{1}{3}, \frac{2}{5}, \frac{3}{7}...)}" />, just as the usual decimal expansion </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cpi+%3D+3+%2B+%5Cfrac%7B1%7D%7B10%7D+%5Ctimes+%5Cbiggl%281+%2B+%5Cfrac%7B1%7D%7B10%7D%5Ctimes+%5Cbiggl%284+%2B+%5Cfrac%7B1%7D%7B10%7D%5Ctimes+%5Cbiggl%28+%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \pi = 3 + \frac{1}{10} \times \biggl(1 + \frac{1}{10}\times \biggl(4 + \frac{1}{10}\times \biggl( \cdots\biggr)\biggr)\biggr) " class="latex" title="\displaystyle \pi = 3 + \frac{1}{10} \times \biggl(1 + \frac{1}{10}\times \biggl(4 + \frac{1}{10}\times \biggl( \cdots\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> is represented by the number <img src="https://s0.wp.com/latex.php?latex=%7B%283%3B1%2C4%2C1...%29%7D&bg=ffffff&fg=000000&s=0" alt="{(3;1,4,1...)}" class="latex" title="{(3;1,4,1...)}" /> in the fixed-radix base <img src="https://s0.wp.com/latex.php?latex=%7B%28%5Cfrac%7B1%7D%7B10%7D%2C%5Cfrac%7B1%7D%7B10%7D%2C%5Cfrac%7B1%7D%7B10%7D...%29%7D&bg=ffffff&fg=000000&s=0" alt="{(\frac{1}{10},\frac{1}{10},\frac{1}{10}...)}" class="latex" title="{(\frac{1}{10},\frac{1}{10},\frac{1}{10}...)}" />. Computing the decimal digits of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cpi%7D&bg=ffffff&fg=000000&s=0" alt="{\pi}" class="latex" title="{\pi}" /> is then a matter of conversion from the mixed-radix base to the fixed-radix base.</p>
<h2>Conversion from a fixed base</h2>
<p>
Let’s remind ourselves of how it should work, using a simpler example: conversion from one fixed base to another. We are given an infinite-precision fraction in the unit interval </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++x+%3D+%5Cfrac%7B1%7D%7Bm%7D+%5Ctimes+%5Cbiggl%28x_0+%2B+%5Cfrac%7B1%7D%7Bm%7D%5Ctimes+%5Cbiggl%28x_1+%2B+%5Cfrac%7B1%7D%7Bm%7D%5Ctimes+%5Cbiggl%28+%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle x = \frac{1}{m} \times \biggl(x_0 + \frac{1}{m}\times \biggl(x_1 + \frac{1}{m}\times \biggl( \cdots\biggr)\biggr)\biggr) " class="latex" title="\displaystyle x = \frac{1}{m} \times \biggl(x_0 + \frac{1}{m}\times \biggl(x_1 + \frac{1}{m}\times \biggl( \cdots\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> in base <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" />, in which <img src="https://s0.wp.com/latex.php?latex=%7B0+%5Cle+x_i+%3C+m%7D&bg=ffffff&fg=000000&s=0" alt="{0 \le x_i < m}" class="latex" title="{0 \le x_i < m}" /> for each digit <img src="https://s0.wp.com/latex.php?latex=%7Bx_i%7D&bg=ffffff&fg=000000&s=0" alt="{x_i}" class="latex" title="{x_i}" />. We are to convert it to a similar representation </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++x+%3D+%5Cfrac%7B1%7D%7Bn%7D+%5Ctimes+%5Cbiggl%28y_0+%2B+%5Cfrac%7B1%7D%7Bn%7D%5Ctimes+%5Cbiggl%28y_1+%2B+%5Cfrac%7B1%7D%7Bn%7D%5Ctimes+%5Cbiggl%28+%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \frac{1}{n}\times \biggl(y_1 + \frac{1}{n}\times \biggl( \cdots\biggr)\biggr)\biggr) " class="latex" title="\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \frac{1}{n}\times \biggl(y_1 + \frac{1}{n}\times \biggl( \cdots\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> in base <img src="https://s0.wp.com/latex.php?latex=%7Bn%7D&bg=ffffff&fg=000000&s=0" alt="{n}" class="latex" title="{n}" />, in which <img src="https://s0.wp.com/latex.php?latex=%7B0+%5Cle+y_j+%3C+n%7D&bg=ffffff&fg=000000&s=0" alt="{0 \le y_j < n}" class="latex" title="{0 \le y_j < n}" /> for each output digit <img src="https://s0.wp.com/latex.php?latex=%7By_j%7D&bg=ffffff&fg=000000&s=0" alt="{y_j}" class="latex" title="{y_j}" />. The streaming process maintains a state <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" />, a pair of rationals; the invariant is that after consuming <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> input digits and producing <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> output digits, we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++x+%3D+%5Cfrac%7B1%7D%7Bn%7D+%5Ctimes+%5Cbiggl%28y_0+%2B+%5Ccdots+%2B+%5Cfrac%7B1%7D%7Bn%7D%5Ctimes+%5Cbiggl%28y_%7Bj-1%7D+%2B+v+%5Ctimes+%28u+%2B+%5Cfrac%7B1%7D%7Bm%7D+%5Ctimes+%5Cbiggl%28+x_i+%2B+%5Cfrac%7B1%7D%7Bm%7D+%5Ctimes+%5Cbiggl%28x_%7Bi%2B1%7D+%2B+%5Ccdots+%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \cdots + \frac{1}{n}\times \biggl(y_{j-1} + v \times (u + \frac{1}{m} \times \biggl( x_i + \frac{1}{m} \times \biggl(x_{i+1} + \cdots \biggr)\biggr)\biggr)\biggr)\biggr) " class="latex" title="\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \cdots + \frac{1}{n}\times \biggl(y_{j-1} + v \times (u + \frac{1}{m} \times \biggl( x_i + \frac{1}{m} \times \biggl(x_{i+1} + \cdots \biggr)\biggr)\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> so that <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> represents a linear function <img src="https://s0.wp.com/latex.php?latex=%7B%28v%5Ctimes%29+%5Ccdot+%28u%2B%29%7D&bg=ffffff&fg=000000&s=0" alt="{(v\times) \cdot (u+)}" class="latex" title="{(v\times) \cdot (u+)}" /> that should be applied to the value represented by the remaining input.</p>
<p>
We can initialize the process with <img src="https://s0.wp.com/latex.php?latex=%7Bi%3D0%2C+j%3D0%2C+u%3D0%2C+v%3D1%7D&bg=ffffff&fg=000000&s=0" alt="{i=0, j=0, u=0, v=1}" class="latex" title="{i=0, j=0, u=0, v=1}" />. At each step, we first try to produce another output digit. The remaining input digits <img src="https://s0.wp.com/latex.php?latex=%7Bx_i%2C+x_%7Bi%2B1%7D%2C...%7D&bg=ffffff&fg=000000&s=0" alt="{x_i, x_{i+1},...}" class="latex" title="{x_i, x_{i+1},...}" /> represent a value in the unit interval; so if <img src="https://s0.wp.com/latex.php?latex=%7Bn+%5Ctimes+v+%5Ctimes+%28u%2B0%29%7D&bg=ffffff&fg=000000&s=0" alt="{n \times v \times (u+0)}" class="latex" title="{n \times v \times (u+0)}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bn+%5Ctimes+v+%5Ctimes+%28u%2B1%29%7D&bg=ffffff&fg=000000&s=0" alt="{n \times v \times (u+1)}" class="latex" title="{n \times v \times (u+1)}" /> have the same integer part, then that must be the next output digit, whatever the remaining input digits are. Let <img src="https://s0.wp.com/latex.php?latex=%7By_j+%3D+%5Clfloor+n+%5Ctimes+v+%5Ctimes+u+%5Crfloor%7D&bg=ffffff&fg=000000&s=0" alt="{y_j = \lfloor n \times v \times u \rfloor}" class="latex" title="{y_j = \lfloor n \times v \times u \rfloor}" /> be that integer. Now we need to find <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> such that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cfrac%7B1%7D%7Bn%7D+%5Ctimes+%5Cbiggl%28y_j+%2B+v%27+%5Ctimes+%28u%27+%2B+r%29%5Cbiggr%29+%3D+v+%5Ctimes+%28u+%2B+r%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \frac{1}{n} \times \biggl(y_j + v' \times (u' + r)\biggr) = v \times (u + r) " class="latex" title="\displaystyle \frac{1}{n} \times \biggl(y_j + v' \times (u' + r)\biggr) = v \times (u + r) " />
</p></blockquote>
<p> for any remainder <img src="https://s0.wp.com/latex.php?latex=%7Br%7D&bg=ffffff&fg=000000&s=0" alt="{r}" class="latex" title="{r}" />; then we can increment <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> and set <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> to <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> and the invariant is maintained. A little algebra shows that we should take <img src="https://s0.wp.com/latex.php?latex=%7Bv%27+%3D+n+%5Ctimes+v%7D&bg=ffffff&fg=000000&s=0" alt="{v' = n \times v}" class="latex" title="{v' = n \times v}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bu%27+%3D+u+-+y_j%2Fv%27%7D&bg=ffffff&fg=000000&s=0" alt="{u' = u - y_j/v'}" class="latex" title="{u' = u - y_j/v'}" />.</p>
<p>
If <img src="https://s0.wp.com/latex.php?latex=%7Bv+%5Ctimes+u%7D&bg=ffffff&fg=000000&s=0" alt="{v \times u}" class="latex" title="{v \times u}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bv+%5Ctimes+%28u%2B1%29%7D&bg=ffffff&fg=000000&s=0" alt="{v \times (u+1)}" class="latex" title="{v \times (u+1)}" /> have different integer parts, we cannot yet tell what the next output digit should be, so we must consume the next input digit instead. Now we need to find <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> such that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++v+%5Ctimes+%5Cbiggl%28u+%2B+%5Cfrac%7B1%7D%7Bm%7D+%5Ctimes+%5Cbiggl%28x_i+%2B+r%5Cbiggr%29%5Cbiggr%29+%3D+v%27+%5Ctimes+%28u%27+%2B+r%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle v \times \biggl(u + \frac{1}{m} \times \biggl(x_i + r\biggr)\biggr) = v' \times (u' + r) " class="latex" title="\displaystyle v \times \biggl(u + \frac{1}{m} \times \biggl(x_i + r\biggr)\biggr) = v' \times (u' + r) " />
</p></blockquote>
<p> for any remainder <img src="https://s0.wp.com/latex.php?latex=%7Br%7D&bg=ffffff&fg=000000&s=0" alt="{r}" class="latex" title="{r}" />; then we can increment <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> and set <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> to <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> and the invariant is again maintained. Again, algebraic manipulation leads us to <img src="https://s0.wp.com/latex.php?latex=%7Bv%27+%3D+v%2Fm%7D&bg=ffffff&fg=000000&s=0" alt="{v' = v/m}" class="latex" title="{v' = v/m}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bu%27+%3D+m+%5Ctimes+u+%2B+x_i%7D&bg=ffffff&fg=000000&s=0" alt="{u' = m \times u + x_i}" class="latex" title="{u' = m \times u + x_i}" />.</p>
<p>
For example, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B4%7D+%3D+0.020202..._3+%3D+0.151515..._7%7D&bg=ffffff&fg=000000&s=0" alt="{\frac{1}{4} = 0.020202..._3 = 0.151515..._7}" class="latex" title="{\frac{1}{4} = 0.020202..._3 = 0.151515..._7}" />, and the conversion starts as follows: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bc%7Cc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dcc%7D+x_i+%26+%26+0+%26%26+2+%26%26+0+%26%26+%26%26+2+%26%26+%26%26+0+%26%26+2+%5C%5C+%5Chline+%28u%2Cv%29+%26+%5Cbigl%28%5Cfrac%7B0%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B1%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B0%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B3%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B2%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B9%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B6%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B27%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B15%7D%7B7%7D%2C%5Cfrac%7B7%7D%7B27%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B59%7D%7B7%7D%2C%5Cfrac%7B7%7D%7B81%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B8%7D%7B49%7D%2C%5Cfrac%7B49%7D%7B81%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B24%7D%7B49%7D%2C%5Cfrac%7B49%7D%7B243%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B170%7D%7B49%7D%2C%5Cfrac%7B49%7D%7B729%7D%5Cbigr%29+%26+%5Ccdots+%5Cvrule+height+2.5ex+depth+1.5ex+width+0pt+%5C%5C+%5Chline+y_j+%26+%26+%26%26+%26%26+%26%26+1+%26%26+%26%26+5+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 0 && 2 && 0 && && 2 && && 0 && 2 \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{0}{1},\frac{1}{3}\bigr) && \bigl(\frac{2}{1},\frac{1}{9}\bigr) && \bigl(\frac{6}{1},\frac{1}{27}\bigr) && \bigl(\frac{15}{7},\frac{7}{27}\bigr) && \bigl(\frac{59}{7},\frac{7}{81}\bigr) && \bigl(\frac{8}{49},\frac{49}{81}\bigr) && \bigl(\frac{24}{49},\frac{49}{243}\bigr) && \bigl(\frac{170}{49},\frac{49}{729}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && && && 1 && && 5 \end{array} " class="latex" title="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 0 && 2 && 0 && && 2 && && 0 && 2 \\ \hline 
(u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{0}{1},\frac{1}{3}\bigr) && \bigl(\frac{2}{1},\frac{1}{9}\bigr) && \bigl(\frac{6}{1},\frac{1}{27}\bigr) && \bigl(\frac{15}{7},\frac{7}{27}\bigr) && \bigl(\frac{59}{7},\frac{7}{81}\bigr) && \bigl(\frac{8}{49},\frac{49}{81}\bigr) && \bigl(\frac{24}{49},\frac{49}{243}\bigr) && \bigl(\frac{170}{49},\frac{49}{729}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && && && 1 && && 5 \end{array} " />
</p></blockquote>
<p> That is, the initial state is <img src="https://s0.wp.com/latex.php?latex=%7Bu_0%3D%5Cfrac%7B0%7D%7B1%7D%2C+v_0%3D%5Cfrac%7B1%7D%7B1%7D%7D&bg=ffffff&fg=000000&s=0" alt="{u_0=\frac{0}{1}, v_0=\frac{1}{1}}" class="latex" title="{u_0=\frac{0}{1}, v_0=\frac{1}{1}}" />. This state does not yet determine the first output digit, so we consume the first input digit 0 to yield the next state <img src="https://s0.wp.com/latex.php?latex=%7Bu_1+%3D+%5Cfrac%7B0%7D%7B1%7D%2C+v_1+%3D+%5Cfrac%7B1%7D%7B3%7D%7D&bg=ffffff&fg=000000&s=0" alt="{u_1 = \frac{0}{1}, v_1 = \frac{1}{3}}" class="latex" title="{u_1 = \frac{0}{1}, v_1 = \frac{1}{3}}" />. This state still does not determine the first output, and nor will the next; so we consume the next two input digits 2 and 0, yielding state <img src="https://s0.wp.com/latex.php?latex=%7Bu_3+%3D+%5Cfrac%7B6%7D%7B1%7D%2C+v_3+%3D+%5Cfrac%7B1%7D%7B27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{u_3 = \frac{6}{1}, v_3 = \frac{1}{27}}" class="latex" title="{u_3 = \frac{6}{1}, v_3 = \frac{1}{27}}" />. This state does determine the next digit: <img src="https://s0.wp.com/latex.php?latex=%7Bv_3+%5Ctimes+u_3+%3D+0.020_3+%3D+0.136..._7%7D&bg=ffffff&fg=000000&s=0" alt="{v_3 \times u_3 = 0.020_3 = 0.136..._7}" class="latex" title="{v_3 \times u_3 = 0.020_3 = 0.136..._7}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bv_3+%5Ctimes+%28u_3%2B1%29+%3D+0.021_3+%3D+0.154..._7%7D&bg=ffffff&fg=000000&s=0" alt="{v_3 \times (u_3+1) = 0.021_3 = 0.154..._7}" class="latex" title="{v_3 \times (u_3+1) = 0.021_3 = 0.154..._7}" /> both start with a 1 in base 7. So we can produce a 1 as the first output digit, yielding state <img src="https://s0.wp.com/latex.php?latex=%7Bu_4+%3D+%5Cfrac%7B15%7D%7B7%7D%2C+v_4+%3D+%5Cfrac%7B7%7D%7B27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{u_4 = \frac{15}{7}, v_4 = \frac{7}{27}}" class="latex" title="{u_4 = \frac{15}{7}, v_4 = \frac{7}{27}}" />. And so on.</p>
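<p>The steps just traced can be packaged into a small self-contained Haskell function. This is my own hedged sketch, not the <code>stream</code>-based definition from the post: the state (u,v) is held as a pair of <code>Rational</code>s, an output digit is produced whenever the window determines one, and an input digit is consumed otherwise.</p>

```haskell
import Data.Ratio ((%))

-- Convert a (possibly infinite) stream of base-m digits, representing a
-- value in the unit interval, into base-n digits.  The state (u,v)
-- represents the linear function (v*) . (u+) to be applied to the value
-- represented by the remaining input.
convert :: Integer -> Integer -> [Integer] -> [Integer]
convert m n = go (0 % 1) (1 % 1)
  where
    go :: Rational -> Rational -> [Integer] -> [Integer]
    go u v xs
      | ylo == yhi =
          -- produce: both ends of the window have integer part ylo,
          -- so that is the next output digit; v' = n*v, u' = u - ylo/v'
          ylo : go (u - fromIntegral ylo / v') v' xs
      | otherwise =
          -- consume: v' = v/m, u' = m*u + x
          case xs of
            x : xs' -> go (fromIntegral m * u + fromIntegral x)
                          (v / fromIntegral m) xs'
            []      -> []  -- input exhausted; stop (no flushing here)
      where
        ylo, yhi :: Integer
        ylo = floor (fromIntegral n * v * u)
        yhi = floor (fromIntegral n * v * (u + 1))
        v'  = fromIntegral n * v
```

<p>For instance, <code>take 6 (convert 3 7 (cycle [0,2]))</code> reproduces the worked example above, converting 0.020202…₃ to 0.151515…₇.</p>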
<p>
The process tends to converge. Each production step widens the non-empty window <img src="https://s0.wp.com/latex.php?latex=%7B%5Bn+%5Ctimes+v+%5Ctimes+u%2C+n+%5Ctimes+v+%5Ctimes+%28u%2B1%29%29%7D&bg=ffffff&fg=000000&s=0" alt="{[n \times v \times u, n \times v \times (u+1))}" class="latex" title="{[n \times v \times u, n \times v \times (u+1))}" /> by a factor of <img src="https://s0.wp.com/latex.php?latex=%7Bn%7D&bg=ffffff&fg=000000&s=0" alt="{n}" class="latex" title="{n}" />, so it will eventually contain multiple integers; therefore we cannot produce indefinitely. Each consumption step narrows the window by a factor of <img src="https://s0.wp.com/latex.php?latex=%7Bm%7D&bg=ffffff&fg=000000&s=0" alt="{m}" class="latex" title="{m}" />, so it will tend towards eventually producing the next output digit. However, this doesn’t always work. For example, consider converting <img src="https://s0.wp.com/latex.php?latex=%7B0.333..._%7B10%7D%7D&bg=ffffff&fg=000000&s=0" alt="{0.333..._{10}}" class="latex" title="{0.333..._{10}}" /> to base 3: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bc%7Cc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dcc%7D+x_i+%26+%26+3+%26%26+3+%26%26+3+%26+%5C%5C+%5Chline+%28u%2Cv%29+%26+%5Cbigl%28%5Cfrac%7B0%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B1%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B3%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B10%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B33%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B100%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B333%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B1000%7D%5Cbigr%29+%26+%5Ccdots+%5Cvrule+height+2.5ex+depth+1.5ex+width+0pt+%5C%5C+%5Chline+y_j+%26+%26+%26%26+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 3 && 3 && 3 & \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{3}{1},\frac{1}{10}\bigr) && \bigl(\frac{33}{1},\frac{1}{100}\bigr) && \bigl(\frac{333}{1},\frac{1}{1000}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && \end{array} " class="latex" title="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 3 && 3 && 3 & \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{3}{1},\frac{1}{10}\bigr) && \bigl(\frac{33}{1},\frac{1}{100}\bigr) && \bigl(\frac{333}{1},\frac{1}{1000}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && \end{array} " />
</p></blockquote>
<p> The first output digit is never determined: if the first non-3 in the input is less than 3, the value is less than a third, and the first output digit should be a 0; if the first non-3 is greater than 3, then the value is definitely greater than a third, and it is safe to produce a 1 as the first output digit; but because the input is all 3s, we never get to make this decision. This problem will happen whenever the value being represented has a finite representation in the output base.</p>
<p></p><h2> Conversion from a mixed base </h2>
<p>
Let’s return now to computing the digits of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cpi%7D&bg=ffffff&fg=000000&s=0" alt="{\pi}" class="latex" title="{\pi}" />. We have the input </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cpi+%3D+2+%2B+%5Cfrac%7B1%7D%7B3%7D+%5Ctimes+%5Cbiggl%282+%2B+%5Cfrac%7B2%7D%7B5%7D%5Ctimes+%5Cbiggl%282+%2B+%5Cfrac%7B3%7D%7B7%7D%5Ctimes+%5Cbiggl%28+%5Ccdots+%5Cbiggl%282+%2B+%5Cfrac%7Bi%7D%7B2i%2B1%7D%5Ctimes+%5Cbiggl%28%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr) " class="latex" title="\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> which we want to convert to decimal. The streaming process maintains a pair <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> of rationals—but this time representing the linear function <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2B%29+%5Ccdot+%28v%5Ctimes%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u+) \cdot (v\times)}" class="latex" title="{(u+) \cdot (v\times)}" />, since this time our expression starts with a sum rather than a product. The invariant is similar: after consuming <img src="https://s0.wp.com/latex.php?latex=%7Bi-1%7D&bg=ffffff&fg=000000&s=0" alt="{i-1}" class="latex" title="{i-1}" /> input digits and producing <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> output digits, we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cpi+%3D+y_0+%2B+%5Cfrac%7B1%7D%7B10%7D+%5Ctimes+%5Cbiggl%28%5Ccdots+y_%7Bj-1%7D+%2B+%5Cfrac%7B1%7D%7B10%7D+%5Ctimes+%5Cbiggl%28u+%2B+v+%5Ctimes+%5Cbiggl%28x_i+%2B+%5Cfrac%7Bi%7D%7B2i%2B1%7D+%5Ctimes+%5Cbiggl%28x_%7Bi%2B1%7D+%2B+%5Cfrac%7Bi%2B1%7D%7B2i%2B3%7D+%5Ctimes+%5Ccdots%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \pi = y_0 + \frac{1}{10} \times \biggl(\cdots y_{j-1} + \frac{1}{10} \times \biggl(u + v \times \biggl(x_i + \frac{i}{2i+1} \times \biggl(x_{i+1} + \frac{i+1}{2i+3} \times \cdots\biggr)\biggr)\biggr)\biggr) " class="latex" title="\displaystyle \pi = y_0 + \frac{1}{10} \times \biggl(\cdots y_{j-1} + \frac{1}{10} \times \biggl(u + v \times \biggl(x_i + \frac{i}{2i+1} \times \biggl(x_{i+1} + \frac{i+1}{2i+3} \times \cdots\biggr)\biggr)\biggr)\biggr) " />
</p></blockquote>
<p> Note that the output base is fixed at 10; but more importantly, the input <em>digits</em> <img src="https://s0.wp.com/latex.php?latex=%7Bx_i%7D&bg=ffffff&fg=000000&s=0" alt="{x_i}" class="latex" title="{x_i}" /> are all fixed at 2, and it is the input <em>base</em> that varies from digit to digit.</p>
<p>
We can initialize the process with <img src="https://s0.wp.com/latex.php?latex=%7Bi%3D1%2C+j%3D0%2C+u%3D0%2C+v%3D1%7D&bg=ffffff&fg=000000&s=0" alt="{i=1, j=0, u=0, v=1}" class="latex" title="{i=1, j=0, u=0, v=1}" />. At each step, we first try to produce an output digit. What value might the remaining input </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++r+%3D+2+%2B+%5Cfrac%7Bi%7D%7B2i%2B1%7D+%5Ctimes+%5Cbiggl%282+%2B+%5Cfrac%7Bi%2B1%7D%7B2i%2B3%7D+%5Ctimes+%5Ccdots+%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle r = 2 + \frac{i}{2i+1} \times \biggl(2 + \frac{i+1}{2i+3} \times \cdots \biggr) " class="latex" title="\displaystyle r = 2 + \frac{i}{2i+1} \times \biggl(2 + \frac{i+1}{2i+3} \times \cdots \biggr) " />
</p></blockquote>
<p> represent? Each of the bases is at least <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B3%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\frac{1}{3}}" class="latex" title="{\frac{1}{3}}" />, so it is clear that <img src="https://s0.wp.com/latex.php?latex=%7Br_%7B%5Cmathrm%7Bmin%7D%7D+%5Cle+r%7D&bg=ffffff&fg=000000&s=0" alt="{r_{\mathrm{min}} \le r}" class="latex" title="{r_{\mathrm{min}} \le r}" />, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++r_%7B%5Cmathrm%7Bmin%7D%7D+%3D+2+%2B+%5Cfrac%7B1%7D%7B3%7D+%5Ctimes+r_%7B%5Cmathrm%7Bmin%7D%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle r_{\mathrm{min}} = 2 + \frac{1}{3} \times r_{\mathrm{min}} " class="latex" title="\displaystyle r_{\mathrm{min}} = 2 + \frac{1}{3} \times r_{\mathrm{min}} " />
</p></blockquote>
<p> which has unique solution <img src="https://s0.wp.com/latex.php?latex=%7Br_%7B%5Cmathrm%7Bmin%7D%7D+%3D+3%7D&bg=ffffff&fg=000000&s=0" alt="{r_{\mathrm{min}} = 3}" class="latex" title="{r_{\mathrm{min}} = 3}" />. Similarly, each of the bases is less than <img src="https://s0.wp.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B2%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\frac{1}{2}}" class="latex" title="{\frac{1}{2}}" />, so it is clear that <img src="https://s0.wp.com/latex.php?latex=%7Br+%3C+r_%7B%5Cmathrm%7Bmax%7D%7D%7D&bg=ffffff&fg=000000&s=0" alt="{r < r_{\mathrm{max}}}" class="latex" title="{r < r_{\mathrm{max}}}" />, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++r_%7B%5Cmathrm%7Bmax%7D%7D+%3D+2+%2B+%5Cfrac%7B1%7D%7B2%7D+%5Ctimes+r_%7B%5Cmathrm%7Bmax%7D%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle r_{\mathrm{max}} = 2 + \frac{1}{2} \times r_{\mathrm{max}} " class="latex" title="\displaystyle r_{\mathrm{max}} = 2 + \frac{1}{2} \times r_{\mathrm{max}} " />
</p></blockquote>
<p> which has unique solution <img src="https://s0.wp.com/latex.php?latex=%7Br_%7B%5Cmathrm%7Bmax%7D%7D+%3D+4%7D&bg=ffffff&fg=000000&s=0" alt="{r_{\mathrm{max}} = 4}" class="latex" title="{r_{\mathrm{max}} = 4}" />. So we consider the bounds <img src="https://s0.wp.com/latex.php?latex=%7B%5Clfloor+u+%2B+v+%5Ctimes+3+%5Crfloor%7D&bg=ffffff&fg=000000&s=0" alt="{\lfloor u + v \times 3 \rfloor}" class="latex" title="{\lfloor u + v \times 3 \rfloor}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Clfloor+u+%2B+v+%5Ctimes+4+%5Crfloor%7D&bg=ffffff&fg=000000&s=0" alt="{\lfloor u + v \times 4 \rfloor}" class="latex" title="{\lfloor u + v \times 4 \rfloor}" />; if these have the same integer part <img src="https://s0.wp.com/latex.php?latex=%7By_j%7D&bg=ffffff&fg=000000&s=0" alt="{y_j}" class="latex" title="{y_j}" />, then that is the next output digit. Now we need to find <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> such that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++y_j+%2B+%5Cfrac%7B1%7D%7B10%7D+%5Ctimes+%28u%27+%2B+v%27+%5Ctimes+r%29+%3D+u+%2B+v+%5Ctimes+r+&bg=ffffff&fg=000000&s=0" alt="\displaystyle y_j + \frac{1}{10} \times (u' + v' \times r) = u + v \times r " class="latex" title="\displaystyle y_j + \frac{1}{10} \times (u' + v' \times r) = u + v \times r " />
</p></blockquote>
<p> for any remainder <img src="https://s0.wp.com/latex.php?latex=%7Br%7D&bg=ffffff&fg=000000&s=0" alt="{r}" class="latex" title="{r}" />, so we pick <img src="https://s0.wp.com/latex.php?latex=%7Bu%27+%3D+10+%5Ctimes+%28u+-+y_j%29%7D&bg=ffffff&fg=000000&s=0" alt="{u' = 10 \times (u - y_j)}" class="latex" title="{u' = 10 \times (u - y_j)}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bv%27+%3D+10+%5Ctimes+v%7D&bg=ffffff&fg=000000&s=0" alt="{v' = 10 \times v}" class="latex" title="{v' = 10 \times v}" />. Then we can increment <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> and set <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> to <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" />, and the invariant is maintained.</p>
<p>
If the two bounds have different integer parts, we must consume the next input digit instead. Now we need to find <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" /> such that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++u%27+%2B+v%27+%5Ctimes+r+%3D+u+%2B+v+%5Ctimes+%5Cbiggl%28x_i+%2B+%5Cfrac%7Bi%7D%7B2i%2B1%7D+%5Ctimes+r%5Cbiggr%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle u' + v' \times r = u + v \times \biggl(x_i + \frac{i}{2i+1} \times r\biggr) " class="latex" title="\displaystyle u' + v' \times r = u + v \times \biggl(x_i + \frac{i}{2i+1} \times r\biggr) " />
</p></blockquote>
<p> for all <img src="https://s0.wp.com/latex.php?latex=%7Br%7D&bg=ffffff&fg=000000&s=0" alt="{r}" class="latex" title="{r}" />, so we pick <img src="https://s0.wp.com/latex.php?latex=%7Bu%27+%3D+u+%2B+v+%5Ctimes+x_i%7D&bg=ffffff&fg=000000&s=0" alt="{u' = u + v \times x_i}" class="latex" title="{u' = u + v \times x_i}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bv%27+%3D+v+%5Ctimes+i+%2F+%282i%2B1%29%7D&bg=ffffff&fg=000000&s=0" alt="{v' = v \times i / (2i+1)}" class="latex" title="{v' = v \times i / (2i+1)}" />. Then we can increment <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> and set <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> to <img src="https://s0.wp.com/latex.php?latex=%7B%28u%27%2Cv%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u',v')}" class="latex" title="{(u',v')}" />, and again the invariant is maintained. </p>
<p>
The conversion starts as follows: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bc%7Cc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dc%40%7B%7Dcc%7D+x_i+%26+%26+2+%26%26+%26%26+2+%26%26+2+%26%26+%26%26+2+%26%26+2+%5C%5C+%5Chline+%28u%2Cv%29+%26+%5Cbigl%28%5Cfrac%7B0%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B1%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B2%7D%7B1%7D%2C%5Cfrac%7B1%7D%7B3%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-10%7D%7B1%7D%2C%5Cfrac%7B10%7D%7B3%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-10%7D%7B3%7D%2C%5Cfrac%7B4%7D%7B3%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-2%7D%7B3%7D%2C%5Cfrac%7B4%7D%7B7%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-50%7D%7B3%7D%2C%5Cfrac%7B40%7D%7B7%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-110%7D%7B21%7D%2C%5Cfrac%7B160%7D%7B63%7D%5Cbigr%29+%26%26+%5Cbigl%28%5Cfrac%7B-10%7D%7B63%7D%2C%5Cfrac%7B800%7D%7B693%7D%5Cbigr%29+%26+%5Ccdots+%5Cvrule+height+2.5ex+depth+1.5ex+width+0pt+%5C%5C+%5Chline+y_j+%26+%26+%26%26+3+%26%26+%26%26+%26%26+1+%26%26+%26%26+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 2 && && 2 && 2 && && 2 && 2 \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{2}{1},\frac{1}{3}\bigr) && \bigl(\frac{-10}{1},\frac{10}{3}\bigr) && \bigl(\frac{-10}{3},\frac{4}{3}\bigr) && \bigl(\frac{-2}{3},\frac{4}{7}\bigr) && \bigl(\frac{-50}{3},\frac{40}{7}\bigr) && \bigl(\frac{-110}{21},\frac{160}{63}\bigr) && \bigl(\frac{-10}{63},\frac{800}{693}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && 3 && && && 1 && && \end{array} " class="latex" title="\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 2 && && 2 && 2 && && 2 && 2 \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{2}{1},\frac{1}{3}\bigr) && \bigl(\frac{-10}{1},\frac{10}{3}\bigr) && 
\bigl(\frac{-10}{3},\frac{4}{3}\bigr) && \bigl(\frac{-2}{3},\frac{4}{7}\bigr) && \bigl(\frac{-50}{3},\frac{40}{7}\bigr) && \bigl(\frac{-110}{21},\frac{160}{63}\bigr) && \bigl(\frac{-10}{63},\frac{800}{693}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && 3 && && && 1 && && \end{array} " />
</p></blockquote>
<p> Happily, non-termination ceases to be a problem: the value being represented does not have a finite representation in the output base, being irrational.</p>
<p></p><h2> Code </h2>
<p>
We can plug these definitions straight into the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}}" class="latex" title="{\mathit{stream}}" /> function above: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BpiDigits%7D+%3D+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3B%281%2C0%2C0%5C%251%2C1%5C%251%29%5C%3B%28%5Cmathit%7Brepeat%7D%5C%3B2%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{piDigits} = \mathit{stream}\;g\;f\;(1,0,0\%1,1\%1)\;(\mathit{repeat}\;2) " class="latex" title="\displaystyle \mathit{piDigits} = \mathit{stream}\;g\;f\;(1,0,0\%1,1\%1)\;(\mathit{repeat}\;2) " />
</p></blockquote>
<p> where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++g%5C%3B%28i%2Cj%2Cu%2Cv%29+%3D+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%5Cmathbf%7Bif%7D%5C%3By+%3D%3D+%5Cmathit%7Bfloor%7D%5C%3B%28u+%2B+v+%5Ctimes+4%29+%5C%5C+%5Cmathbf%7Bthen%7D%5C%3B%5Cmathit%7BJust%7D%5C%3B%28y%2C+%28i%2Cj%2B1%2C+10+%5Ctimes+%28u+-+%5Cmathit%7BfromIntegral%7D%5C%3By%29%2C+10+%5Ctimes+v%29%29+%5C%5C+%5Cmathbf%7Belse%7D%5C%3B%5Cmathit%7BNothing%7D+%5C%5C+%5Cmathbf%7Bwhere%7D%5C%3By+%3D+%5Cmathit%7Bfloor%7D%5C%3B%28u+%2B+v+%5Ctimes+3%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle g\;(i,j,u,v) = \begin{array}[t]{@{}l} \mathbf{if}\;y == \mathit{floor}\;(u + v \times 4) \\ \mathbf{then}\;\mathit{Just}\;(y, (i,j+1, 10 \times (u - \mathit{fromIntegral}\;y), 10 \times v)) \\ \mathbf{else}\;\mathit{Nothing} \\ \mathbf{where}\;y = \mathit{floor}\;(u + v \times 3) \end{array} " class="latex" title="\displaystyle g\;(i,j,u,v) = \begin{array}[t]{@{}l} \mathbf{if}\;y == \mathit{floor}\;(u + v \times 4) \\ \mathbf{then}\;\mathit{Just}\;(y, (i,j+1, 10 \times (u - \mathit{fromIntegral}\;y), 10 \times v)) \\ \mathbf{else}\;\mathit{Nothing} \\ \mathbf{where}\;y = \mathit{floor}\;(u + v \times 3) \end{array} " />
</p></blockquote>
<p> and </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++f%5C%3B%28i%2Cj%2Cu%2Cv%29%5C%3Bx+%3D+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%28i%2B1%2Cj%2Cu+%2B+v+%5Ctimes+%5Cmathit%7BfromIntegral%7D%5C%3Bx%2C+v+%5Ctimes+i%27+%2F+%282+%5Ctimes+i%27+%2B+1%29%29+%5C%5C+%5Cmathbf%7Bwhere%7D%5C%3Bi%27+%3D+%5Cmathit%7BfromIntegral%7D%5C%3Bi+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle f\;(i,j,u,v)\;x = \begin{array}[t]{@{}l} (i+1,j,u + v \times \mathit{fromIntegral}\;x, v \times i' / (2 \times i' + 1)) \\ \mathbf{where}\;i' = \mathit{fromIntegral}\;i \end{array} " class="latex" title="\displaystyle f\;(i,j,u,v)\;x = \begin{array}[t]{@{}l} (i+1,j,u + v \times \mathit{fromIntegral}\;x, v \times i' / (2 \times i' + 1)) \\ \mathbf{where}\;i' = \mathit{fromIntegral}\;i \end{array} " />
</p></blockquote>
<p>(The <img src="https://s0.wp.com/latex.php?latex=%7B%5C%25%7D&bg=ffffff&fg=000000&s=0" alt="{\%}" class="latex" title="{\%}" />s make rational numbers in Haskell, and force the ambiguous fractional type to be <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BRational%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{Rational}}" class="latex" title="{\mathit{Rational}}" /> rather than <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BDouble%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{Double}}" class="latex" title="{\mathit{Double}}" />.)</p>
<p>
In fact, this program can be considerably simplified, by inlining the definitions. In particular, the input digits are all 2, so we need not supply them. Moreover, the <img src="https://s0.wp.com/latex.php?latex=%7Bj%7D&bg=ffffff&fg=000000&s=0" alt="{j}" class="latex" title="{j}" /> component of the state is never used, because we treat each output digit in the same way (in contrast to the input digits); so that may be eliminated. Finally, we can eliminate some of the numeric coercions if we represent the <img src="https://s0.wp.com/latex.php?latex=%7Bi%7D&bg=ffffff&fg=000000&s=0" alt="{i}" class="latex" title="{i}" /> component as a rational in the first place: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BpiDigits%7D+%3D+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%5Cmathit%7Bgo%7D%5C%3B%28%281%2C0%2C1%29+%3A%3A+%28%5Cmathit%7BRational%7D%2C%5Cmathit%7BRational%7D%2C%5Cmathit%7BRational%7D%29%29%5C%3B%5Cmathbf%7Bwhere%7D+%5C%5C+%5Cqquad+%5Cmathit%7Bgo%7D%5C%3B%28i%2Cu%2Cv%29+%3D+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dll%7D+%5Cmathbf%7Bif%7D+%26+y+%3D%3D+%5Cmathit%7Bfloor%7D%5C%3B%28u%2Bv+%5Ctimes+4%29+%5C%5C+%5Cmathbf%7Bthen%7D+%26+y+%3A+%5Cmathit%7Bgo%7D%5C%3B%28i%2C10+%5Ctimes+%28u-%5Cmathit%7BfromIntegral%7D%5C%3By%29%2C10+%5Ctimes+v%29+%5C%5C+%5Cmathbf%7Belse%7D+%26+%5Cmathit%7Bgo%7D%5C%3B%28i%2B1%2Cu%2B2+%5Ctimes+v%2C+%28v+%5Ctimes+i%29+%2F+%282+%5Ctimes+i%2B1%29%29+%5C%5C+%5Cmulticolumn%7B2%7D%7B%40%7B%7Dl%7D%7B%5Cqquad+%5Cmathbf%7Bwhere%7D%5C%3B+y+%3D+%5Cmathit%7Bfloor%7D%5C%3B%28u%2Bv+%5Ctimes+3%29%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{piDigits} = \begin{array}[t]{@{}l} \mathit{go}\;((1,0,1) :: (\mathit{Rational},\mathit{Rational},\mathit{Rational}))\;\mathbf{where} \\ \qquad \mathit{go}\;(i,u,v) = \begin{array}[t]{@{}ll} \mathbf{if} & y == \mathit{floor}\;(u+v \times 4) \\ \mathbf{then} & y : \mathit{go}\;(i,10 \times (u-\mathit{fromIntegral}\;y),10 \times v) \\ \mathbf{else} & \mathit{go}\;(i+1,u+2 \times v, (v \times i) / (2 \times i+1)) \\ \multicolumn{2}{@{}l}{\qquad \mathbf{where}\; y = \mathit{floor}\;(u+v \times 3)} \end{array} \end{array} " class="latex" title="\displaystyle \mathit{piDigits} = \begin{array}[t]{@{}l} \mathit{go}\;((1,0,1) :: (\mathit{Rational},\mathit{Rational},\mathit{Rational}))\;\mathbf{where} \\ \qquad \mathit{go}\;(i,u,v) = \begin{array}[t]{@{}ll} \mathbf{if} & y == \mathit{floor}\;(u+v \times 4) \\ \mathbf{then} & y : \mathit{go}\;(i,10 \times (u-\mathit{fromIntegral}\;y),10 \times v) \\ \mathbf{else} & \mathit{go}\;(i+1,u+2 \times v, (v \times i) / (2 \times i+1)) \\ \multicolumn{2}{@{}l}{\qquad 
\mathbf{where}\; y = \mathit{floor}\;(u+v \times 3)} \end{array} \end{array} " />
</p></blockquote>
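<p>Transcribed from the display above into plain ASCII Haskell (<code>Rational</code> is exported by the Prelude, so no imports are needed):</p>

```haskell
-- Transcription of the simplified definition displayed above:
-- the state is (i, u, v), all held as Rationals.
piDigits :: [Integer]
piDigits = go ((1, 0, 1) :: (Rational, Rational, Rational))
  where
    go (i, u, v) =
      if y == floor (u + v * 4)
        then y : go (i, 10 * (u - fromIntegral y), 10 * v)  -- produce digit y
        else go (i + 1, u + 2 * v, (v * i) / (2 * i + 1))   -- consume a 2
      where
        y = floor (u + v * 3) :: Integer
```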
<p> Then we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BpiDigits%7D+%3D+%5B3%2C1%2C4%2C1%2C5%2C9%2C2%2C6%2C5%2C3%2C5%2C8%2C9%2C7%2C9%2C3...+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{piDigits} = [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3... " class="latex" title="\displaystyle \mathit{piDigits} = [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3... " />
</p></blockquote>
Wed, 15 Nov 2017 17:22:26 +0000
Jeremy Gibbons: Metamorphisms
http://patternsinfp.wordpress.com/?p=288
https://patternsinfp.wordpress.com/2017/10/04/metamorphisms/
<p>
It appears that I have insufficient time, or at least insufficient discipline, to contribute to this blog, except when I am on sabbatical. Which I now am… so let’s see if I can do better.</p>
<p></p><h2> Hylomorphisms </h2>
<p>
I don’t think I’ve written about them yet in this series—another story, for another day—but <em>hylomorphisms</em> consist of a fold after an unfold. One very simple example is the factorial function: <img src="https://s0.wp.com/latex.php?latex=%7Bn%21%7D&bg=ffffff&fg=000000&s=0" alt="{n!}" class="latex" title="{n!}" /> is the product of the predecessors <img src="https://s0.wp.com/latex.php?latex=%7B%5Bn%2C...%2C1%5D%7D&bg=ffffff&fg=000000&s=0" alt="{[n,...,1]}" class="latex" title="{[n,...,1]}" /> of <img src="https://s0.wp.com/latex.php?latex=%7Bn%7D&bg=ffffff&fg=000000&s=0" alt="{n}" class="latex" title="{n}" />. The predecessors can be computed with an unfold: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bpreds%7D+%26%3A%3A%26+%5Cmathit%7BInteger%7D+%5Crightarrow+%5B%5Cmathit%7BInteger%7D%5D+%5C%5C+%5Cmathit%7Bpreds%7D+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bstep%7D+%5C%3B+%5Cmathbf%7Bwhere%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bstep%7D%5C%3B0+%26%3D%26+%5Cmathit%7BNothing%7D+%5C%5C+%5Cmathit%7Bstep%7D%5C%3Bn+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%28n%2C+n-1%29+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{preds} &::& \mathit{Integer} \rightarrow [\mathit{Integer}] \\ \mathit{preds} &=& \mathit{unfoldr}\;\mathit{step} \; \mathbf{where} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{step}\;0 &=& \mathit{Nothing} \\ \mathit{step}\;n &=& \mathit{Just}\;(n, n-1) \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{preds} &::& \mathit{Integer} \rightarrow [\mathit{Integer}] \\ \mathit{preds} &=& \mathit{unfoldr}\;\mathit{step} \; \mathbf{where} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{step}\;0 &=& \mathit{Nothing} \\ \mathit{step}\;n &=& \mathit{Just}\;(n, n-1) \end{array} \end{array} " />
</p></blockquote>
<p> and the product as a fold: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bprod%7D+%26%3A%3A%26+%5B%5Cmathit%7BInteger%7D%5D+%5Crightarrow+%5Cmathit%7BInteger%7D+%5C%5C+%5Cmathit%7Bprod%7D+%26%3D%26+%5Cmathit%7Bfoldr%7D%5C%3B%28%5Ctimes%29%5C%3B1+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{prod} &::& [\mathit{Integer}] \rightarrow \mathit{Integer} \\ \mathit{prod} &=& \mathit{foldr}\;(\times)\;1 \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{prod} &::& [\mathit{Integer}] \rightarrow \mathit{Integer} \\ \mathit{prod} &=& \mathit{foldr}\;(\times)\;1 \end{array} " />
</p></blockquote>
<p> and then factorial is their composition: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bfactorial%7D+%26%3A%3A%26+%5Cmathit%7BInteger%7D+%5Crightarrow+%5Cmathit%7BInteger%7D+%5C%5C+%5Cmathit%7Bfactorial%7D+%26%3D%26+%5Cmathit%7Bprod%7D+%5Ccdot+%5Cmathit%7Bpreds%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{factorial} &::& \mathit{Integer} \rightarrow \mathit{Integer} \\ \mathit{factorial} &=& \mathit{prod} \cdot \mathit{preds} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{factorial} &::& \mathit{Integer} \rightarrow \mathit{Integer} \\ \mathit{factorial} &=& \mathit{prod} \cdot \mathit{preds} \end{array} " />
</p></blockquote>
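<p> These definitions transcribe directly into runnable Haskell; the sketch below uses <code>unfoldr</code> from <code>Data.List</code>. </p>

```haskell
import Data.List (unfoldr)

-- the predecessors [n, n-1, ..., 1] of n, computed as an unfold
preds :: Integer -> [Integer]
preds = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n, n - 1)

-- the product of a list, as a fold
prod :: [Integer] -> Integer
prod = foldr (*) 1

-- factorial as a hylomorphism: a fold after an unfold
factorial :: Integer -> Integer
factorial = prod . preds
```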
<p> Another example is a tree-based sorting algorithm that resembles Hoare’s quicksort: from the input list, grow a binary search tree, as an unfold, and then flatten that tree back to a sorted list, as a fold. This is a divide-and-conquer algorithm; in general, these can be modelled as unfolding a tree of subproblems by repeatedly dividing the problem, then collecting the solution to the original problem by folding together the solutions to subproblems.</p>
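<p> A sketch of that tree-based sort, with an ad-hoc <code>Tree</code> datatype (the names <code>grow</code> and <code>flatten</code> are illustrative, not fixed terminology): </p>

```haskell
import Data.List (partition)

data Tree a = Tip | Node (Tree a) a (Tree a)

-- unfold phase: grow a binary search tree by repeatedly
-- partitioning the input around its first element
grow :: Ord a => [a] -> Tree a
grow []      = Tip
grow (a : x) = Node (grow smaller) a (grow larger)
  where (smaller, larger) = partition (< a) x

-- fold phase: flatten the tree back to a sorted list
flatten :: Tree a -> [a]
flatten Tip          = []
flatten (Node l a r) = flatten l ++ [a] ++ flatten r

-- the quicksort-like divide-and-conquer algorithm, as a hylomorphism
quicksort :: Ord a => [a] -> [a]
quicksort = flatten . grow
```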
<p></p><h2> An unfold after a fold </h2>
<p>
This post is about the opposite composition, an unfold after a fold. Some examples: </p>
<ul>
<li> <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bregroup%7D%5C%3Bn+%3D+%5Cmathit%7Bgroup%7D%5C%3Bn+%5Ccdot+%5Cmathit%7Bconcat%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{regroup}\;n = \mathit{group}\;n \cdot \mathit{concat}}" class="latex" title="{\mathit{regroup}\;n = \mathit{group}\;n \cdot \mathit{concat}}" /> to reformat a list of lists to a given length;
</li><li> <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bheapsort%7D+%3D+%5Cmathit%7BflattenHeap%7D+%5Ccdot+%5Cmathit%7BbuildHeap%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{heapsort} = \mathit{flattenHeap} \cdot \mathit{buildHeap}}" class="latex" title="{\mathit{heapsort} = \mathit{flattenHeap} \cdot \mathit{buildHeap}}" /> to sort a list;
</li><li> <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bbaseconv%7D%5C%3B%28b%2Cc%29+%3D+%5Cmathit%7BtoBase%7D%5C%3Bb+%5Ccdot+%5Cmathit%7BfromBase%7D%5C%3Bc%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{baseconv}\;(b,c) = \mathit{toBase}\;b \cdot \mathit{fromBase}\;c}" class="latex" title="{\mathit{baseconv}\;(b,c) = \mathit{toBase}\;b \cdot \mathit{fromBase}\;c}" /> to convert a fraction from base <img src="https://s0.wp.com/latex.php?latex=%7Bc%7D&bg=ffffff&fg=000000&s=0" alt="{c}" class="latex" title="{c}" /> to base <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" />;
</li><li> <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BarithCode%7D+%3D+%5Cmathit%7BtoBits%7D+%5Ccdot+%5Cmathit%7Bnarrow%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{arithCode} = \mathit{toBits} \cdot \mathit{narrow}}" class="latex" title="{\mathit{arithCode} = \mathit{toBits} \cdot \mathit{narrow}}" /> to encode a text in binary by “arithmetic coding”.
</li></ul>
<p> In each of these cases, the first phase is a fold, which consumes some structured representation of a value into an intermediate unstructured format, and the second phase is an unfold, which generates a new structured representation. Their composition effects a change of representation, so we call them <a href="https://www.cs.ox.ac.uk/publications/publication380-abstract.html">metamorphisms</a>. </p>
<p>
Hylomorphisms always <em>fuse</em>, and one can deforest the intermediate <em>virtual data structure</em>. For example, one need not construct the intermediate list in the factorial function; since each cell gets constructed in the unfold only to be immediately deconstructed in the fold, one can cut to the chase and go straight to the familiar recursive definition. For the base case, we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bll%7D+%26+%5Cmathit%7Bfactorial%7D%5C%3B0+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bfactorial%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bprod%7D%5C%3B%28%5Cmathit%7Bpreds%7D%5C%3B0%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bpreds%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bprod%7D%5C%3B%5B%5C%2C%5D+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bprod%7D+%5C%7D+%5C%5C+%26+1+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{ll} & \mathit{factorial}\;0 \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;0) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;[\,] \\ = & \qquad \{ \mathit{prod} \} \\ & 1 \end{array} " class="latex" title="\displaystyle \begin{array}{ll} & \mathit{factorial}\;0 \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;0) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;[\,] \\ = & \qquad \{ \mathit{prod} \} \\ & 1 \end{array} " />
</p></blockquote>
<p> and for non-zero argument <img src="https://s0.wp.com/latex.php?latex=%7Bn%7D&bg=ffffff&fg=000000&s=0" alt="{n}" class="latex" title="{n}" />, we have: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bll%7D+%26+%5Cmathit%7Bfactorial%7D%5C%3Bn+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bfactorial%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bprod%7D%5C%3B%28%5Cmathit%7Bpreds%7D%5C%3Bn%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bpreds%7D+%5C%7D+%5C%5C+%26+%5Cmathit%7Bprod%7D%5C%3B%28n+%3A+%5Cmathit%7Bpreds%7D%5C%3B%28n-1%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bprod%7D+%5C%7D+%5C%5C+%26+n+%5Ctimes+%5Cmathit%7Bprod%7D%5C%3B%28%5Cmathit%7Bpreds%7D%5C%3B%28n-1%29%29+%5C%5C+%3D+%26+%5Cqquad+%5C%7B+%5Cmathit%7Bfactorial%7D+%5C%7D+%5C%5C+%26+n+%5Ctimes+%5Cmathit%7Bfactorial%7D%5C%3B%28n-1%29+%5C%5C+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{ll} & \mathit{factorial}\;n \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;n) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;(n : \mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{prod} \} \\ & n \times \mathit{prod}\;(\mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{factorial} \} \\ & n \times \mathit{factorial}\;(n-1) \\ \end{array} " class="latex" title="\displaystyle \begin{array}{ll} & \mathit{factorial}\;n \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;n) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;(n : \mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{prod} \} \\ & n \times \mathit{prod}\;(\mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{factorial} \} \\ & n \times \mathit{factorial}\;(n-1) \\ \end{array} " />
</p></blockquote>
<p>
In contrast, metamorphisms only fuse under certain conditions. However, when they do fuse, they also allow infinite representations to be processed, as we shall see.</p>
<p>
Fusion seems to depend on the fold being tail-recursive; that is, a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bfoldl%7D+%26%3A%3A%26+%28%5Cbeta+%5Crightarrow+%5Calpha+%5Crightarrow+%5Cbeta%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Calpha%5D+%5Crightarrow+%5Cbeta+%5C%5C+%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3Bb%5C%3B%28a%3Ax%29+%26%3D%26+%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3B%28f%5C%3Bb%5C%3Ba%29%5C%3Bx+%5C%5C+%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3Bb%5C%3B%5B%5C%2C%5D+%26%3D%26+b+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{foldl} &::& (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta \\ \mathit{foldl}\;f\;b\;(a:x) &=& \mathit{foldl}\;f\;(f\;b\;a)\;x \\ \mathit{foldl}\;f\;b\;[\,] &=& b \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{foldl} &::& (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta \\ \mathit{foldl}\;f\;b\;(a:x) &=& \mathit{foldl}\;f\;(f\;b\;a)\;x \\ \mathit{foldl}\;f\;b\;[\,] &=& b \end{array} " />
</p></blockquote>
<p> For the unfold phase, we will use the usual list unfold: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bunfoldr%7D+%26%3A%3A%26+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Cgamma%5D+%5C%5C+%5Cmathit%7Bunfoldr%7D%5C%3Bg%5C%3Bb+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bg%5C%3Bb%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%26%5Crightarrow%26+c+%3A+%5Cmathit%7Bunfoldr%7D%5C%3Bg%5C%3Bb%27+%5C%5C+%5Cmathit%7BNothing%7D+%26%5Crightarrow%26+%5B%5C%2C%5D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{unfoldr} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{unfoldr}\;g\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{unfoldr}\;g\;b' \\ \mathit{Nothing} &\rightarrow& [\,] \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{unfoldr} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{unfoldr}\;g\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{unfoldr}\;g\;b' \\ \mathit{Nothing} &\rightarrow& [\,] \end{array} \end{array} " />
</p></blockquote>
<p> We define a metamorphism as their composition: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7Bl%7D+%5Cmathit%7Bmeta%7D+%3A%3A+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5Calpha+%5Crightarrow+%5Cbeta%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Calpha%5D+%5Crightarrow+%5B%5Cgamma%5D+%5C%5C+%5Cmathit%7Bmeta%7D%5C%3Bg%5C%3Bf%5C%3Bb+%3D+%5Cmathit%7Bunfoldr%7D%5C%3Bg+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3Bb+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{l} \mathit{meta} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma] \\ \mathit{meta}\;g\;f\;b = \mathit{unfoldr}\;g \cdot \mathit{foldl}\;f\;b \end{array} " class="latex" title="\displaystyle \begin{array}{l} \mathit{meta} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma] \\ \mathit{meta}\;g\;f\;b = \mathit{unfoldr}\;g \cdot \mathit{foldl}\;f\;b \end{array} " />
</p></blockquote>
<p> This transforms input of type <img src="https://s0.wp.com/latex.php?latex=%7B%5BA%5D%7D&bg=ffffff&fg=000000&s=0" alt="{[A]}" class="latex" title="{[A]}" /> to output of type <img src="https://s0.wp.com/latex.php?latex=%7B%5BC%5D%7D&bg=ffffff&fg=000000&s=0" alt="{[C]}" class="latex" title="{[C]}" />: in the first phase, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3Bb%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}\;f\;b}" class="latex" title="{\mathit{foldl}\;f\;b}" />, it consumes all the input into an intermediate value of type <img src="https://s0.wp.com/latex.php?latex=%7BB%7D&bg=ffffff&fg=000000&s=0" alt="{B}" class="latex" title="{B}" />; in the second phase, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3Bg%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;g}" class="latex" title="{\mathit{unfoldr}\;g}" />, it produces all the output.</p>
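<p> In Haskell, the metamorphism is a one-liner. As a quick sanity check (an illustrative choice, not one of the examples above), folding with <code>flip (:)</code> and unfolding with <code>uncons</code> yields <code>reverse</code>. </p>

```haskell
import Data.List (unfoldr, uncons)

-- a metamorphism: an unfold after a tail-recursive fold
meta :: (b -> Maybe (c, b)) -> (b -> a -> b) -> b -> [a] -> [c]
meta g f b = unfoldr g . foldl f b

-- example: the fold accumulates the reversed prefix as the
-- intermediate state, and the unfold spells it back out
rev :: [a] -> [a]
rev = meta uncons (flip (:)) []
```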
<p></p><h2> Streaming </h2>
<p>
Under certain conditions, it is possible to fuse these two phases—this time, not in order to eliminate an intermediate data structure (after all, the intermediate type <img src="https://s0.wp.com/latex.php?latex=%7BB%7D&bg=ffffff&fg=000000&s=0" alt="{B}" class="latex" title="{B}" /> need not be structured), but rather in order to allow some production steps to happen before all the consumption steps are complete. </p>
<p>
To that end, we define the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}}" class="latex" title="{\mathit{stream}}" /> function as follows: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%7D%7B%5Cmathit%7Bstream%7D+%3A%3A+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5Calpha+%5Crightarrow+%5Cbeta%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Calpha%5D+%5Crightarrow+%5B%5Cgamma%5D%7D+%5C%5C+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3Bb%5C%3Bx+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bg%5C%3Bb%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%26%5Crightarrow%26+c+%3A+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3Bb%27%5C%3Bx+%5C%5C+%5Cmathit%7BNothing%7D+%26%5Crightarrow%26+%5Cmathbf%7Bcase%7D%5C%3Bx%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+a%3Ax%27+%26%5Crightarrow%26+%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3B%28f%5C%3Bb%5C%3Ba%29%5C%3Bx%27+%5C%5C+%7B%5B%5C%2C%5D%7D+%26%5Crightarrow%26+%5B%5C%2C%5D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& [\,] \end{array} \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha 
\rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& [\,] \end{array} \end{array} \end{array} " />
</p></blockquote>
<p> This takes the same arguments as <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmeta%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{meta}}" class="latex" title="{\mathit{meta}}" />. It maintains a current state <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" />, and produces an output element <img src="https://s0.wp.com/latex.php?latex=%7Bc%7D&bg=ffffff&fg=000000&s=0" alt="{c}" class="latex" title="{c}" /> when it can; and when it can’t produce, it consumes an input element instead. In more detail, it examines the current state <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" /> using function <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" />, which is like the body of an unfold; this may produce a first element <img src="https://s0.wp.com/latex.php?latex=%7Bc%7D&bg=ffffff&fg=000000&s=0" alt="{c}" class="latex" title="{c}" /> of the result and a new state <img src="https://s0.wp.com/latex.php?latex=%7Bb%27%7D&bg=ffffff&fg=000000&s=0" alt="{b'}" class="latex" title="{b'}" />; when it yields no element, the next element <img src="https://s0.wp.com/latex.php?latex=%7Ba%7D&bg=ffffff&fg=000000&s=0" alt="{a}" class="latex" title="{a}" /> of the input is consumed using function <img src="https://s0.wp.com/latex.php?latex=%7Bf%7D&bg=ffffff&fg=000000&s=0" alt="{f}" class="latex" title="{f}" />, which is like the body of a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />; and when no input remains either, we are done.</p>
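<p> Transcribed into Haskell, a direct rendering of the definition above: </p>

```haskell
-- stream: fuse an unfoldr after a foldl, producing an output
-- element whenever possible and otherwise consuming an input
stream :: (b -> Maybe (c, b)) -> (b -> a -> b) -> b -> [a] -> [c]
stream g f b x = case g b of
  Just (c, b') -> c : stream g f b' x
  Nothing      -> case x of
    a : x' -> stream g f (f b a) x'
    []     -> []
```

<p> Unlike <code>meta</code>, <code>stream</code> can yield output before the input is exhausted, so it remains productive even on an infinite input list. </p>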
<p>
The <em>streaming condition</em> for <img src="https://s0.wp.com/latex.php?latex=%7Bf%7D&bg=ffffff&fg=000000&s=0" alt="{f}" class="latex" title="{f}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" /> is that </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++g%5C%3Bb+%3D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%5Cquad%5CRightarrow%5Cquad+%5Cforall+a+%5Cmathbin%7B.%7D+g%5C%3B%28f%5C%3Bb%5C%3Ba%29+%3D+%5Cmathit%7BJust%7D%5C%3B%28c%2C+f%5C%3Bb%27%5C%3Ba%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle g\;b = \mathit{Just}\;(c,b') \quad\Rightarrow\quad \forall a \mathbin{.} g\;(f\;b\;a) = \mathit{Just}\;(c, f\;b'\;a) " class="latex" title="\displaystyle g\;b = \mathit{Just}\;(c,b') \quad\Rightarrow\quad \forall a \mathbin{.} g\;(f\;b\;a) = \mathit{Just}\;(c, f\;b'\;a) " />
</p></blockquote>
<p> Consider a state <img src="https://s0.wp.com/latex.php?latex=%7Bb%7D&bg=ffffff&fg=000000&s=0" alt="{b}" class="latex" title="{b}" /> from which the body <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" /> of the unfold is productive, yielding some <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{Just}\;(c,b')}" class="latex" title="{\mathit{Just}\;(c,b')}" />. From here we have two choices: we can either produce the output <img src="https://s0.wp.com/latex.php?latex=%7Bc%7D&bg=ffffff&fg=000000&s=0" alt="{c}" class="latex" title="{c}" />, move to intermediate state <img src="https://s0.wp.com/latex.php?latex=%7Bb%27%7D&bg=ffffff&fg=000000&s=0" alt="{b'}" class="latex" title="{b'}" />, then consume the next input <img src="https://s0.wp.com/latex.php?latex=%7Ba%7D&bg=ffffff&fg=000000&s=0" alt="{a}" class="latex" title="{a}" /> to yield a final state <img src="https://s0.wp.com/latex.php?latex=%7Bf%5C%3Bb%27%5C%3Ba%7D&bg=ffffff&fg=000000&s=0" alt="{f\;b'\;a}" class="latex" title="{f\;b'\;a}" />; or we can consume first to get the intermediate state <img src="https://s0.wp.com/latex.php?latex=%7Bf%5C%3Bb%5C%3Ba%7D&bg=ffffff&fg=000000&s=0" alt="{f\;b\;a}" class="latex" title="{f\;b\;a}" />, and again try to produce. The streaming condition says that this intermediate state <img src="https://s0.wp.com/latex.php?latex=%7Bf%5C%3Bb%5C%3Ba%7D&bg=ffffff&fg=000000&s=0" alt="{f\;b\;a}" class="latex" title="{f\;b\;a}" /> will again be productive, and will yield the same output <img src="https://s0.wp.com/latex.php?latex=%7Bc%7D&bg=ffffff&fg=000000&s=0" alt="{c}" class="latex" title="{c}" /> and the same final state <img src="https://s0.wp.com/latex.php?latex=%7Bf%5C%3Bb%27%5C%3Ba%7D&bg=ffffff&fg=000000&s=0" alt="{f\;b'\;a}" class="latex" title="{f\;b'\;a}" />. 
That is, instead of consuming all the inputs first, and then producing all the outputs, it is possible to produce some of the outputs early, without jeopardizing the overall result. Provided that the streaming condition holds for <img src="https://s0.wp.com/latex.php?latex=%7Bf%7D&bg=ffffff&fg=000000&s=0" alt="{f}" class="latex" title="{f}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" />, then </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bstream%7D%5C%3Bg%5C%3Bf%5C%3Bb%5C%3Bx+%3D+%5Cmathit%7Bmeta%7D%5C%3Bg%5C%3Bf%5C%3Bb%5C%3Bx+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{stream}\;g\;f\;b\;x = \mathit{meta}\;g\;f\;b\;x " class="latex" title="\displaystyle \mathit{stream}\;g\;f\;b\;x = \mathit{meta}\;g\;f\;b\;x " />
</p></blockquote>
<p> for all finite lists <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" />.</p>
<p>
As a simple example, consider the ‘buffering’ process <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmeta%7D%5C%3B%5Cmathit%7Buncons%7D%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" class="latex" title="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" />, where </p>
As a simple example, consider the ‘buffering’ process <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmeta%7D%5C%3B%5Cmathit%7Buncons%7D%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" class="latex" title="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" />, where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Buncons%7D%5C%3Bx+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bx%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5B%5C%2C%5D+%26%5Crightarrow%26+%5Cmathit%7BNothing%7D+%5C%5C+c%3Ax%27+%26%5Crightarrow%26+%5Cmathit%7BJust%7D%5C%3B%28c%2Cx%27%29+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{uncons}\;x &=& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} [\,] &\rightarrow& \mathit{Nothing} \\ c:x' &\rightarrow& \mathit{Just}\;(c,x') \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{uncons}\;x &=& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} [\,] &\rightarrow& \mathit{Nothing} \\ c:x' &\rightarrow& \mathit{Just}\;(c,x') \end{array} \end{array} " />
</p></blockquote>
<p> Note that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Buncons%7D+%3D+%5Cmathit%7Bid%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;\mathit{uncons} = \mathit{id}}" class="latex" title="{\mathit{unfoldr}\;\mathit{uncons} = \mathit{id}}" />, so <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bmeta%7D%5C%3B%5Cmathit%7Buncons%7D%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" class="latex" title="{\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" /> is just a complicated way of writing <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bconcat%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{concat}}" class="latex" title="{\mathit{concat}}" /> as a <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" />. But the streaming condition holds for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathbin{{+}\!\!\!{+}}}" class="latex" title="{\mathbin{{+}\!\!\!{+}}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Buncons%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{uncons}}" class="latex" title="{\mathit{uncons}}" /> (as you may check), so <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bconcat%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{concat}}" class="latex" title="{\mathit{concat}}" /> may be streamed. 
Operationally, the streaming version of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bconcat%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{concat}}" class="latex" title="{\mathit{concat}}" /> consumes one list from the input list of lists, then peels off and produces its elements one by one; when they have all been delivered, it consumes the next input list, and so on.</p>
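<p> The parenthetical check is easy to mechanize. The predicate below is a direct transcription of the streaming condition for <code>(++)</code> and <code>uncons</code>, tested on sample lists rather than proved, of course: </p>

```haskell
import Data.List (uncons)

-- streaming condition for f = (++) and g = uncons:
-- whenever uncons b = Just (c, b'), consuming more input first
-- yields the same output element and the corresponding state,
-- i.e. uncons (b ++ a) = Just (c, b' ++ a)
streamingCondition :: [Int] -> [Int] -> Bool
streamingCondition b a = case uncons b of
  Just (c, b') -> uncons (b ++ a) == Just (c, b' ++ a)
  Nothing      -> True  -- the condition only constrains productive states
```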
<p></p><h2> Flushing </h2>
<p>
The streaming version of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bconcat%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{concat}}" class="latex" title="{\mathit{concat}}" /> is actually rather special, because the production steps can always completely exhaust the intermediate state. In contrast, consider the ‘regrouping’ example <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bregroup%7D%5C%3Bn+%3D+%5Cmathit%7Bmeta%7D%5C%3B%28%5Cmathit%7Bchunk%7D%5C%3Bn%29%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{regroup}\;n = \mathit{meta}\;(\mathit{chunk}\;n)\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" class="latex" title="{\mathit{regroup}\;n = \mathit{meta}\;(\mathit{chunk}\;n)\;(\mathbin{{+}\!\!\!{+}})\;[\,]}" /> where </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bchunk%7D%5C%3Bn%5C%3B%5B%5C%2C%5D+%26%3D%26+%5Cmathit%7BNothing%7D+%5C%5C+%5Cmathit%7Bchunk%7D%5C%3Bn%5C%3Bx+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%28%5Cmathit%7BsplitAt%7D%5C%3Bn%5C%3Bx%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{chunk}\;n\;[\,] &=& \mathit{Nothing} \\ \mathit{chunk}\;n\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{chunk}\;n\;[\,] &=& \mathit{Nothing} \\ \mathit{chunk}\;n\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \end{array} " />
</p></blockquote>
<p> from the introduction (here, <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BsplitAt%7D%5C%3Bn%5C%3Bx%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{splitAt}\;n\;x}" class="latex" title="{\mathit{splitAt}\;n\;x}" /> yields <img src="https://s0.wp.com/latex.php?latex=%7B%28y%2Cz%29%7D&bg=ffffff&fg=000000&s=0" alt="{(y,z)}" class="latex" title="{(y,z)}" /> where <img src="https://s0.wp.com/latex.php?latex=%7By+%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D+z+%3D+x%7D&bg=ffffff&fg=000000&s=0" alt="{y \mathbin{{+}\!\!\!{+}} z = x}" class="latex" title="{y \mathbin{{+}\!\!\!{+}} z = x}" />, with <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Blength%7D%5C%3By%3Dn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{length}\;y=n}" class="latex" title="{\mathit{length}\;y=n}" /> when <img src="https://s0.wp.com/latex.php?latex=%7Bn+%5Cle+%5Cmathit%7Blength%7D%5C%3Bx%7D&bg=ffffff&fg=000000&s=0" alt="{n \le \mathit{length}\;x}" class="latex" title="{n \le \mathit{length}\;x}" /> and <img src="https://s0.wp.com/latex.php?latex=%7By%3Dx%7D&bg=ffffff&fg=000000&s=0" alt="{y=x}" class="latex" title="{y=x}" /> otherwise). This transforms an input list of lists into an output list of lists, where each output ‘chunk’ except perhaps the last has length <img src="https://s0.wp.com/latex.php?latex=%7Bn%7D&bg=ffffff&fg=000000&s=0" alt="{n}" class="latex" title="{n}" />—if the content doesn’t divide up evenly, then the last chunk is short. One might hope to be able to stream <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bregroup%7D%5C%3Bn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{regroup}\;n}" class="latex" title="{\mathit{regroup}\;n}" />, but it doesn’t quite work with the formulation so far. 
The problem is that <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bchunk%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{chunk}}" class="latex" title="{\mathit{chunk}}" /> is too aggressive, and will produce short chunks when there is still some input to consume. (Indeed, the streaming condition does not hold for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathbin{{+}\!\!\!{+}}}" class="latex" title="{\mathbin{{+}\!\!\!{+}}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bchunk%7D%5C%3Bn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{chunk}\;n}" class="latex" title="{\mathit{chunk}\;n}" />—why not?) One might try the more cautious producer <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bchunk%27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{chunk'}}" class="latex" title="{\mathit{chunk'}}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlclcl%40%7B%7D%7D+%5Cmathit%7Bchunk%27%7D%5C%3Bn%5C%3Bx+%26%5Cmid%26+n+%5Cle+%5Cmathit%7Blength%7D%5C%3Bx+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%28%5Cmathit%7BsplitAt%7D%5C%3Bn%5C%3Bx%29+%5C%5C+%26%5Cmid%26+%5Cmathbf%7Botherwise%7D+%26%3D%26+%5Cmathit%7BNothing%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lclcl@{}} \mathit{chunk'}\;n\;x &\mid& n \le \mathit{length}\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \\ &\mid& \mathbf{otherwise} &=& \mathit{Nothing} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lclcl@{}} \mathit{chunk'}\;n\;x &\mid& n \le \mathit{length}\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \\ &\mid& \mathbf{otherwise} &=& \mathit{Nothing} \end{array} " />
</p></blockquote>
<p> But this never produces a short chunk, and so if the content doesn’t divide up evenly then the last few elements will not be extracted from the intermediate state and will be lost.</p>
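<p> The loss is visible already at the unfold: with <code>chunk'</code> transcribed as below, unfolding a list whose length is not a multiple of <code>n</code> silently drops the final short chunk. </p>

```haskell
import Data.List (unfoldr)

-- the cautious producer: emit a chunk only when a full one is available
chunk' :: Int -> [a] -> Maybe ([a], [a])
chunk' n x
  | n <= length x = Just (splitAt n x)
  | otherwise     = Nothing

-- unfoldr (chunk' 3) [1..7] yields [[1,2,3],[4,5,6]]:
-- the trailing [7] never gets extracted
```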
<p>
We need to combine these two producers somehow: the streaming process should behave cautiously while there is still remaining input, which might influence the next output; but it should then switch to a more aggressive strategy once the input is finished, in order to flush out the contents of the intermediate state. To achieve this, we define a more general <em>flushing stream</em> operator: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmulticolumn%7B3%7D%7B%40%7B%7Dl%40%7B%7D%7D%7B%5Cmathit%7Bfstream%7D+%3A%3A+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5B%5Cgamma%5D%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5Calpha+%5Crightarrow+%5Cbeta%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Calpha%5D+%5Crightarrow+%5B%5Cgamma%5D%7D+%5C%5C+%5Cmathit%7Bfstream%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3Bb%5C%3Bx+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bg%5C%3Bb%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%26%5Crightarrow%26+c+%3A+%5Cmathit%7Bfstream%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3Bb%27%5C%3Bx+%5C%5C+%5Cmathit%7BNothing%7D+%26%5Crightarrow%26+%5Cmathbf%7Bcase%7D%5C%3Bx%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+a%3Ax%27+%26%5Crightarrow%26+%5Cmathit%7Bfstream%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3B%28f%5C%3Bb%5C%3Ba%29%5C%3Bx%27+%5C%5C+%7B%5B%5C%2C%5D%7D+%26%5Crightarrow%26+h%5C%3Bb+%5Cend%7Barray%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l@{}}{\mathit{fstream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{fstream}\;g\;h\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{fstream}\;g\;h\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{fstream}\;g\;h\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& h\;b \end{array} \end{array} \end{array} " class="latex" title="\displaystyle 
\begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l@{}}{\mathit{fstream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{fstream}\;g\;h\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{fstream}\;g\;h\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{fstream}\;g\;h\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& h\;b \end{array} \end{array} \end{array} " />
</p></blockquote>
<p> This takes an additional argument <img src="https://s0.wp.com/latex.php?latex=%7Bh+%3A%3A+%5Cbeta+%5Crightarrow+%5B%5Cgamma%5D%7D&bg=ffffff&fg=000000&s=0" alt="{h :: \beta \rightarrow [\gamma]}" class="latex" title="{h :: \beta \rightarrow [\gamma]}" />; when the cautious producer <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" /> is unproductive, and there is no remaining input to consume, it uses <img src="https://s0.wp.com/latex.php?latex=%7Bh%7D&bg=ffffff&fg=000000&s=0" alt="{h}" class="latex" title="{h}" /> to flush out the remaining output elements from the state. Clearly, specializing to <img src="https://s0.wp.com/latex.php?latex=%7Bh%5C%3Bb%3D%5B%5C%2C%5D%7D&bg=ffffff&fg=000000&s=0" alt="{h\;b=[\,]}" class="latex" title="{h\;b=[\,]}" /> retrieves the original <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}}" class="latex" title="{\mathit{stream}}" /> operator.</p>
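<p>The definition transcribes directly into Haskell; the following is a sketch matching the displayed equations, with the Greek type variables renamed to ordinary lowercase letters:</p>

```haskell
-- Flushing stream: behaves like the streaming operator, but when the
-- cautious producer g is unproductive and the input is exhausted,
-- h flushes the remaining output elements from the state.
fstream :: (b -> Maybe (c, b)) -> (b -> [c]) -> (b -> a -> b) -> b -> [a] -> [c]
fstream g h f b x = case g b of
  Just (c, b') -> c : fstream g h f b' x
  Nothing      -> case x of
    a : x' -> fstream g h f (f b a) x'
    []     -> h b
```

<p>For instance, with a never-productive producer the whole input is folded and then flushed: <code>fstream (const Nothing) reverse (flip (:)) [] [1,2,3]</code> yields <code>[1,2,3]</code>.</p>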
<p>
The corresponding metamorphism uses an <em><a href="https://www.mii.lt/informatica/htm/INFO141.htm">apomorphism</a></em> in place of the unfold. Define </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bapo%7D+%26%3A%3A%26+%28%5Cbeta+%5Crightarrow+%5Cmathsf%7BMaybe%7D%5C%3B%28%5Cgamma%2C%5Cbeta%29%29+%5Crightarrow+%28%5Cbeta+%5Crightarrow+%5B%5Cgamma%5D%29+%5Crightarrow+%5Cbeta+%5Crightarrow+%5B%5Cgamma%5D+%5C%5C+%5Cmathit%7Bapo%7D%5C%3Bg%5C%3Bh%5C%3Bb+%26%3D%26+%5Cmathbf%7Bcase%7D%5C%3Bg%5C%3Bb%5C%3B%5Cmathbf%7Bof%7D+%5C%5C+%26+%26+%5Cquad+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BJust%7D%5C%3B%28c%2Cb%27%29+%26%5Crightarrow%26+c+%3A+%5Cmathit%7Bapo%7D%5C%3Bg%5C%3Bh%5C%3Bb%27+%5C%5C+%5Cmathit%7BNothing%7D+%26%5Crightarrow%26+h%5C%3Bb+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{apo} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{apo}\;g\;h\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{apo}\;g\;h\;b' \\ \mathit{Nothing} &\rightarrow& h\;b \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{apo} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{apo}\;g\;h\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{apo}\;g\;h\;b' \\ \mathit{Nothing} &\rightarrow& h\;b \end{array} \end{array} " />
</p></blockquote>
<p> Then <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bapo%7D%5C%3Bg%5C%3Bh%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{apo}\;g\;h}" class="latex" title="{\mathit{apo}\;g\;h}" /> behaves like <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%5C%3Bg%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}\;g}" class="latex" title="{\mathit{unfoldr}\;g}" />, except that if and when <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" /> stops being productive it finishes up by applying <img src="https://s0.wp.com/latex.php?latex=%7Bh%7D&bg=ffffff&fg=000000&s=0" alt="{h}" class="latex" title="{h}" /> to the final state. Similarly, define flushing metamorphisms: </p>
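<p>In Haskell, again following the displayed definition:</p>

```haskell
-- List apomorphism: unfold with g, and once g stops being productive,
-- finish by applying h to the final state.
apo :: (b -> Maybe (c, b)) -> (b -> [c]) -> b -> [c]
apo g h b = case g b of
  Just (c, b') -> c : apo g h b'
  Nothing      -> h b
```

<p>For example, <code>apo (\n -> if n &lt; 3 then Just (n, n+1) else Nothing) (\n -> [n * 10]) 0</code> evaluates to <code>[0,1,2,30]</code>: three unfolding steps, then the flush.</p>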
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bfmeta%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3Bb+%3D+%5Cmathit%7Bapo%7D%5C%3Bg%5C%3Bh+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3Bf%5C%3Bb+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{fmeta}\;g\;h\;f\;b = \mathit{apo}\;g\;h \cdot \mathit{foldl}\;f\;b " class="latex" title="\displaystyle \mathit{fmeta}\;g\;h\;f\;b = \mathit{apo}\;g\;h \cdot \mathit{foldl}\;f\;b " />
</p></blockquote>
<p> Then we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bfstream%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3Bb%5C%3Bx+%3D+%5Cmathit%7Bfmeta%7D%5C%3Bg%5C%3Bh%5C%3Bf%5C%3Bb%5C%3Bx+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{fstream}\;g\;h\;f\;b\;x = \mathit{fmeta}\;g\;h\;f\;b\;x " class="latex" title="\displaystyle \mathit{fstream}\;g\;h\;f\;b\;x = \mathit{fmeta}\;g\;h\;f\;b\;x " />
</p></blockquote>
<p> for all finite lists <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" /> if the streaming condition holds for <img src="https://s0.wp.com/latex.php?latex=%7Bf%7D&bg=ffffff&fg=000000&s=0" alt="{f}" class="latex" title="{f}" /> and <img src="https://s0.wp.com/latex.php?latex=%7Bg%7D&bg=ffffff&fg=000000&s=0" alt="{g}" class="latex" title="{g}" />. In particular, </p>
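<p>A Haskell rendering of the flushing metamorphism (repeating <code>apo</code> so the snippet stands alone):</p>

```haskell
-- List apomorphism, as above.
apo :: (b -> Maybe (c, b)) -> (b -> [c]) -> b -> [c]
apo g h b = case g b of
  Just (c, b') -> c : apo g h b'
  Nothing      -> h b

-- Flushing metamorphism: a left fold followed by a flushing unfold.
fmeta :: (b -> Maybe (c, b)) -> (b -> [c]) -> (b -> a -> b) -> b -> [a] -> [c]
fmeta g h f b = apo g h . foldl f b
```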
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bregroup%7D%5C%3Bn%5C%3B%5Cmathit%7Bxs%7D+%26%3D%26+%5Cmathit%7Bfmeta%7D%5C%3B%28%5Cmathit%7Bchunk%27%7D%5C%3Bn%29%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%28%5Cmathit%7Bchunk%7D%5C%3Bn%29%29%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%5C%3B%5Cmathit%7Bxs%7D+%5C%5C+%26%3D%26+%5Cmathit%7Bfstream%7D%5C%3B%28%5Cmathit%7Bchunk%27%7D%5C%3Bn%29%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%28%5Cmathit%7Bchunk%7D%5C%3Bn%29%29%5C%3B%28%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%29%5C%3B%5B%5C%2C%5D%5C%3B%5Cmathit%7Bxs%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{regroup}\;n\;\mathit{xs} &=& \mathit{fmeta}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \\ &=& \mathit{fstream}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{regroup}\;n\;\mathit{xs} &=& \mathit{fmeta}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \\ &=& \mathit{fstream}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \end{array} " />
</p></blockquote>
<p> on finite inputs <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bxs%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{xs}}" class="latex" title="{\mathit{xs}}" />: the streaming condition does hold for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathbin%7B%7B%2B%7D%5C%21%5C%21%5C%21%7B%2B%7D%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathbin{{+}\!\!\!{+}}}" class="latex" title="{\mathbin{{+}\!\!\!{+}}}" /> and the more cautious <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bchunk%27%7D%5C%3Bn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{chunk'}\;n}" class="latex" title="{\mathit{chunk'}\;n}" />, and once the input has been exhausted, the process can switch to the more aggressive <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bchunk%7D%5C%3Bn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{chunk}\;n}" class="latex" title="{\mathit{chunk}\;n}" />. </p>
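<p>To run the <code>regroup</code> example one needs the <code>chunk</code> and <code>chunk'</code> producers, which were defined earlier in the post; the versions below are plausible reconstructions (an assumption, not the post's exact code): the eager <code>chunk n</code> emits whenever any elements remain, while the cautious <code>chunk' n</code> waits until a full chunk of <code>n</code> is available.</p>

```haskell
import Data.List (unfoldr)

-- Assumed reconstruction: eager chunker, emits a (possibly short) chunk
-- whenever any elements remain.
chunk :: Int -> [a] -> Maybe ([a], [a])
chunk _ [] = Nothing
chunk n xs = Just (splitAt n xs)

-- Assumed reconstruction: cautious chunker, emits only once a full
-- chunk of n elements is available.
chunk' :: Int -> [a] -> Maybe ([a], [a])
chunk' n xs
  | length (take n xs) == n = Just (splitAt n xs)
  | otherwise               = Nothing

fstream :: (b -> Maybe (c, b)) -> (b -> [c]) -> (b -> a -> b) -> b -> [a] -> [c]
fstream g h f b x = case g b of
  Just (c, b') -> c : fstream g h f b' x
  Nothing      -> case x of
    a : x' -> fstream g h f (f b a) x'
    []     -> h b

-- Regrouping a list of lists into chunks of n, streamed.
regroup :: Int -> [[a]] -> [[a]]
regroup n = fstream (chunk' n) (unfoldr (chunk n)) (++) []
```

<p>Because it streams, <code>take 2 (regroup 2 (map (:[]) [1..]))</code> returns <code>[[1,2],[3,4]]</code> even though the input is infinite.</p>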
<p></p><h2> Infinite input </h2>
<p>
The main advantage of streaming is that it can allow the change-of-representation process also to work on infinite inputs. With the plain metamorphism, this is not possible: the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" /> will yield no result on an infinite input, and so the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}}" class="latex" title="{\mathit{unfoldr}}" /> will never get started, but the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstream%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stream}}" class="latex" title="{\mathit{stream}}" /> may be able to produce some outputs before having consumed all the inputs. For example, the streaming version of <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bregroup%7D%5C%3Bn%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{regroup}\;n}" class="latex" title="{\mathit{regroup}\;n}" /> also works for infinite lists, providing that the input does not end with an infinite tail of empty lists. And of course, if the input never runs out, then there is no need ever to switch to the more aggressive flushing phase.</p>
<p>
As a more interesting example, consider converting a fraction from base 3 to base 7: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BfromBase3%7D+%26%3D%26+%5Cmathit%7Bfoldr%7D%5C%3B%5Cmathit%7Bstepr%7D%5C%3B0+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3B%5Cmathit%7Bstepr%7D%5C%3Bd%5C%3Bx+%3D+%28d%2Bx%29%2F3+%5C%5C+%5Cmathit%7BtoBase7%7D+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bnext%7D+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3B+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bnext%7D%5C%3B0+%26%3D%26+%5Cmathit%7BNothing%7D+%5C%5C+%5Cmathit%7Bnext%7D%5C%3Bx+%26%3D%26+%5Cmathbf%7Blet%7D%5C%3By%3D7%5Ctimes+x%5C%3B%5Cmathbf%7Bin%7D%5C%3B%5Cmathit%7BJust%7D%5C%3B%28%5Clfloor+y%5Crfloor%2C+y-%5Clfloor+y%5Crfloor%29+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{foldr}\;\mathit{stepr}\;0 \quad \mathbf{where}\;\mathit{stepr}\;d\;x = (d+x)/3 \\ \mathit{toBase7} &=& \mathit{unfoldr}\;\mathit{next} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next}\;0 &=& \mathit{Nothing} \\ \mathit{next}\;x &=& \mathbf{let}\;y=7\times x\;\mathbf{in}\;\mathit{Just}\;(\lfloor y\rfloor, y-\lfloor y\rfloor) \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{foldr}\;\mathit{stepr}\;0 \quad \mathbf{where}\;\mathit{stepr}\;d\;x = (d+x)/3 \\ \mathit{toBase7} &=& \mathit{unfoldr}\;\mathit{next} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next}\;0 &=& \mathit{Nothing} \\ \mathit{next}\;x &=& \mathbf{let}\;y=7\times x\;\mathbf{in}\;\mathit{Just}\;(\lfloor y\rfloor, y-\lfloor y\rfloor) \end{array} \end{array} " />
</p></blockquote>
<p> We assume that the input digits are all either 0, 1 or 2, so that the number being represented is in the unit interval.</p>
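<p>In Haskell, using exact <code>Rational</code> arithmetic to keep the digits honest:</p>

```haskell
import Data.List (unfoldr)

-- Interpret a list of base-3 digits as a fraction in the unit interval.
fromBase3 :: [Integer] -> Rational
fromBase3 = foldr stepr 0 where stepr d x = (fromInteger d + x) / 3

-- Emit the base-7 digits of a fraction in the unit interval.
toBase7 :: Rational -> [Integer]
toBase7 = unfoldr next where
  next 0 = Nothing
  next x = let y = 7 * x
               d = floor y
           in Just (d, y - fromInteger d)
```

<p>For example, <code>toBase7 (3/7)</code> is <code>[3]</code>, and <code>take 4 (toBase7 (fromBase3 [1]))</code> is <code>[2,2,2,2]</code>, since 1/3 is 0.222… in base 7.</p>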
<p>
The fold in <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7BfromBase3%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{fromBase3}}" class="latex" title="{\mathit{fromBase3}}" /> is of the wrong kind; but we have also </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BfromBase3%7D+%26%3D%26+%5Cmathit%7Bextract%7D+%5Ccdot+%5Cmathit%7Bfoldl%7D%5C%3B%5Cmathit%7Bstepl%7D%5C%3B%280%2C1%29+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3B+%5Cmathit%7Bstepl%7D%5C%3B%28u%2Cv%29%5C%3Bd+%3D+%28u+%5Ctimes+3+%2B+d%2C+v+%2F+3%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{extract} \cdot \mathit{foldl}\;\mathit{stepl}\;(0,1) \quad \mathbf{where}\; \mathit{stepl}\;(u,v)\;d = (u \times 3 + d, v / 3) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{extract} \cdot \mathit{foldl}\;\mathit{stepl}\;(0,1) \quad \mathbf{where}\; \mathit{stepl}\;(u,v)\;d = (u \times 3 + d, v / 3) \end{array} " />
</p></blockquote>
<p> Here, the intermediate state <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> can be seen as a defunctionalized representation of the function <img src="https://s0.wp.com/latex.php?latex=%7B%28v%5Ctimes%29+%5Ccdot+%28u%2B%29%7D&bg=ffffff&fg=000000&s=0" alt="{(v\times) \cdot (u+)}" class="latex" title="{(v\times) \cdot (u+)}" />, and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bextract%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{extract}}" class="latex" title="{\mathit{extract}}" /> applies this function to <img src="https://s0.wp.com/latex.php?latex=%7B0%7D&bg=ffffff&fg=000000&s=0" alt="{0}" class="latex" title="{0}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bapply%7D%5C%3B%28u%2Cv%29%5C%3Bx+%26%3D%26+v+%5Ctimes+%28u+%2B+x%29+%5C%5C+%5Cmathit%7Bextract%7D%5C%3B%28u%2Cv%29+%26%3D%26+%5Cmathit%7Bapply%7D%5C%3B%28u%2Cv%29%5C%3B0+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{apply}\;(u,v)\;x &=& v \times (u + x) \\ \mathit{extract}\;(u,v) &=& \mathit{apply}\;(u,v)\;0 \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{apply}\;(u,v)\;x &=& v \times (u + x) \\ \mathit{extract}\;(u,v) &=& \mathit{apply}\;(u,v)\;0 \end{array} " />
</p></blockquote>
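<p>The refactored fold and its helpers, transcribed into Haskell:</p>

```haskell
-- The state (u,v) defunctionalizes the function \x -> v * (u + x).
stepl :: (Rational, Rational) -> Integer -> (Rational, Rational)
stepl (u, v) d = (u * 3 + fromInteger d, v / 3)

apply :: (Rational, Rational) -> Rational -> Rational
apply (u, v) x = v * (u + x)

extract :: (Rational, Rational) -> Rational
extract uv = apply uv 0

-- fromBase3 as a left fold followed by extraction.
fromBase3 :: [Integer] -> Rational
fromBase3 = extract . foldl stepl (0, 1)
```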
<p> Now there is an extra function <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bextract%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{extract}}" class="latex" title="{\mathit{extract}}" /> between the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bfoldl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{foldl}}" class="latex" title="{\mathit{foldl}}" /> and the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}}" class="latex" title="{\mathit{unfoldr}}" />; but that’s no obstacle, because it fuses with the <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bunfoldr%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{unfoldr}}" class="latex" title="{\mathit{unfoldr}}" />: </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7BtoBase7%7D+%5Ccdot+%5Cmathit%7Bextract%7D+%26%3D%26+%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bnext%27%7D+%5Cquad+%5Cmathbf%7Bwhere%7D%5C%3B+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bnext%27%7D%5C%3B%280%2Cv%29+%26%3D%26+%5Cmathit%7BNothing%7D+%5C%5C+%5Cmathit%7Bnext%27%7D%5C%3B%28u%2Cv%29+%26%3D%26+%5Cbegin%7Barray%7D%5Bt%5D%7B%40%7B%7Dl%7D+%5Cmathbf%7Blet%7D%5C%3By+%3D+%5Clfloor%7B7+%5Ctimes+u+%5Ctimes+v%7D%5Crfloor%5C%3B%5Cmathbf%7Bin%7D+%5C%5C+%5Cmathit%7BJust%7D%5C%3B%28y%2C%28u+-+y%2F%28v+%5Ctimes+7%29%2C+v+%5Ctimes+7%29%29+%5Cend%7Barray%7D+%5Cend%7Barray%7D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{toBase7} \cdot \mathit{extract} &=& \mathit{unfoldr}\;\mathit{next'} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next'}\;(0,v) &=& \mathit{Nothing} \\ \mathit{next'}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathit{Just}\;(y,(u - y/(v \times 7), v \times 7)) \end{array} \end{array} \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{toBase7} \cdot \mathit{extract} &=& \mathit{unfoldr}\;\mathit{next'} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next'}\;(0,v) &=& \mathit{Nothing} \\ \mathit{next'}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathit{Just}\;(y,(u - y/(v \times 7), v \times 7)) \end{array} \end{array} \end{array} " />
</p></blockquote>
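<p>The fused producer in Haskell:</p>

```haskell
import Data.List (unfoldr)

-- Fusion of toBase7 with extract, operating directly on the (u,v) state.
next' :: (Rational, Rational) -> Maybe (Integer, (Rational, Rational))
next' (0, _) = Nothing
next' (u, v) = let y = floor (7 * u * v)
               in Just (y, (u - fromInteger y / (v * 7), v * 7))
```

<p>For instance <code>take 6 (unfoldr next' (5, 1/9))</code> gives <code>[3,6,1,3,6,1]</code>, the recurring base-7 expansion of 5/9 (the state reached after folding the base-3 digits 1 and 2).</p>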
<p> However, the streaming condition does not hold for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstepl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stepl}}" class="latex" title="{\mathit{stepl}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnext%27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{next'}}" class="latex" title="{\mathit{next'}}" />. For example, </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cbegin%7Barray%7D%7B%40%7B%7Dlcl%40%7B%7D%7D+%5Cmathit%7Bnext%27%7D%5C%3B%281%2C%7B%7B%7D%5E%7B1%5C%21%7D%2F_%7B%5C%213%7D%7D%29+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%282%2C+%28%7B%7B%7D%5E%7B1%5C%21%7D%2F_%7B%5C%217%7D%7D%2C%7B%7B%7D%5E%7B7%5C%21%7D%2F_%7B%5C%213%7D%7D%29%29+%5C%5C+%5Cmathit%7Bnext%27%7D%5C%3B%28%5Cmathit%7Bstepl%7D%5C%3B%281%2C%7B%7B%7D%5E%7B1%5C%21%7D%2F_%7B%5C%213%7D%7D%29%5C%3B1%29+%26%3D%26+%5Cmathit%7Bnext%27%7D%5C%3B%284%2C%7B%7B%7D%5E%7B1%5C%21%7D%2F_%7B%5C%219%7D%7D%29+%5C%5C+%26%3D%26+%5Cmathit%7BJust%7D%5C%3B%283%2C%28%7B%7B%7D%5E%7B1%5C%21%7D%2F_%7B%5C%217%7D%7D%2C%7B%7B%7D%5E%7B7%5C%21%7D%2F_%7B%5C%219%7D%7D%29%29+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \begin{array}{@{}lcl@{}} \mathit{next'}\;(1,{{}^{1\!}/_{\!3}}) &=& \mathit{Just}\;(2, ({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!3}})) \\ \mathit{next'}\;(\mathit{stepl}\;(1,{{}^{1\!}/_{\!3}})\;1) &=& \mathit{next'}\;(4,{{}^{1\!}/_{\!9}}) \\ &=& \mathit{Just}\;(3,({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!9}})) \end{array} " class="latex" title="\displaystyle \begin{array}{@{}lcl@{}} \mathit{next'}\;(1,{{}^{1\!}/_{\!3}}) &=& \mathit{Just}\;(2, ({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!3}})) \\ \mathit{next'}\;(\mathit{stepl}\;(1,{{}^{1\!}/_{\!3}})\;1) &=& \mathit{next'}\;(4,{{}^{1\!}/_{\!9}}) \\ &=& \mathit{Just}\;(3,({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!9}})) \end{array} " />
</p></blockquote>
<p> That is, <img src="https://s0.wp.com/latex.php?latex=%7B0.1_3+%5Csimeq+0.222_7%7D&bg=ffffff&fg=000000&s=0" alt="{0.1_3 \simeq 0.222_7}" class="latex" title="{0.1_3 \simeq 0.222_7}" />, but <img src="https://s0.wp.com/latex.php?latex=%7B0.11_3+%5Csimeq+0.305_7%7D&bg=ffffff&fg=000000&s=0" alt="{0.11_3 \simeq 0.305_7}" class="latex" title="{0.11_3 \simeq 0.305_7}" />, so it is premature to produce the first digit 2 in base 7 having consumed only the first digit 1 in base 3. The producer <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnext%27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{next'}}" class="latex" title="{\mathit{next'}}" /> is too aggressive; it should be more cautious while input remains that might invalidate a produced digit.</p>
<p>
Fortunately, on the assumption that the input digits are all 0, 1, or 2, the unconsumed input—a tail of the original input—again represents a number in the unit interval; so from the state <img src="https://s0.wp.com/latex.php?latex=%7B%28u%2Cv%29%7D&bg=ffffff&fg=000000&s=0" alt="{(u,v)}" class="latex" title="{(u,v)}" /> the range of possible unproduced outputs represents a number between <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bapply%7D%5C%3B%28u%2Cv%29%5C%3B0%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{apply}\;(u,v)\;0}" class="latex" title="{\mathit{apply}\;(u,v)\;0}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bapply%7D%5C%3B%28u%2Cv%29%5C%3B1%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{apply}\;(u,v)\;1}" class="latex" title="{\mathit{apply}\;(u,v)\;1}" />. If these both start with the same digit in base 7, then (and only then) is it safe to produce that digit. So we define </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bnext%27%27%7D%5C%3B%28u%2Cv%29+%3D+%5Cmathbf%7Bif%7D%5C%3B%5Clfloor%7Bu+%5Ctimes+v+%5Ctimes+7%7D%5Crfloor+%3D+%5Clfloor%7B%28u%2B1%29+%5Ctimes+v+%5Ctimes+7%7D%5Crfloor%5C%3B%5Cmathbf%7Bthen%7D%5C%3B%5Cmathit%7Bnext%27%7D%5C%3B%28u%2Cv%29%5C%3B%5Cmathbf%7Belse%7D%5C%3B%5Cmathit%7BNothing%7D+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{next''}\;(u,v) = \mathbf{if}\;\lfloor{u \times v \times 7}\rfloor = \lfloor{(u+1) \times v \times 7}\rfloor\;\mathbf{then}\;\mathit{next'}\;(u,v)\;\mathbf{else}\;\mathit{Nothing} " class="latex" title="\displaystyle \mathit{next''}\;(u,v) = \mathbf{if}\;\lfloor{u \times v \times 7}\rfloor = \lfloor{(u+1) \times v \times 7}\rfloor\;\mathbf{then}\;\mathit{next'}\;(u,v)\;\mathbf{else}\;\mathit{Nothing} " />
</p></blockquote>
<p> and we have </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bnext%27%7D+%3D+%5Cmathit%7Bapo%7D%5C%3B%5Cmathit%7Bnext%27%27%7D%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bnext%27%7D%29+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{unfoldr}\;\mathit{next'} = \mathit{apo}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'}) " class="latex" title="\displaystyle \mathit{unfoldr}\;\mathit{next'} = \mathit{apo}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'}) " />
</p></blockquote>
<p> Now, the streaming condition holds for <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bstepl%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{stepl}}" class="latex" title="{\mathit{stepl}}" /> and <img src="https://s0.wp.com/latex.php?latex=%7B%5Cmathit%7Bnext%27%27%7D%7D&bg=ffffff&fg=000000&s=0" alt="{\mathit{next''}}" class="latex" title="{\mathit{next''}}" /> (as you may check), and therefore </p>
<blockquote><p>
<img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5Cmathit%7BtoBase7%7D%5C%3B%28%5Cmathit%7BfromBase3%7D%5C%3Bx%29+%3D+%5Cmathit%7Bfstream%7D%5C%3B%5Cmathit%7Bnext%27%27%7D%5C%3B%28%5Cmathit%7Bunfoldr%7D%5C%3B%5Cmathit%7Bnext%27%7D%29%5C%3B%5Cmathit%7Bstepl%7D%5C%3B%280%2C1%29%5C%3Bx+&bg=ffffff&fg=000000&s=0" alt="\displaystyle \mathit{toBase7}\;(\mathit{fromBase3}\;x) = \mathit{fstream}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'})\;\mathit{stepl}\;(0,1)\;x " class="latex" title="\displaystyle \mathit{toBase7}\;(\mathit{fromBase3}\;x) = \mathit{fstream}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'})\;\mathit{stepl}\;(0,1)\;x " />
</p></blockquote>
<p> on finite digit sequences <img src="https://s0.wp.com/latex.php?latex=%7Bx%7D&bg=ffffff&fg=000000&s=0" alt="{x}" class="latex" title="{x}" /> in base 3. Moreover, the streaming program works also on <em>infinite</em> digit sequences, where the original does not. </p>
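<p>Putting the pieces together, the complete streaming converter can be transcribed as follows (repeating the earlier definitions so the snippet stands alone):</p>

```haskell
import Data.List (unfoldr)

fstream :: (b -> Maybe (c, b)) -> (b -> [c]) -> (b -> a -> b) -> b -> [a] -> [c]
fstream g h f b x = case g b of
  Just (c, b') -> c : fstream g h f b' x
  Nothing      -> case x of
    a : x' -> fstream g h f (f b a) x'
    []     -> h b

stepl :: (Rational, Rational) -> Integer -> (Rational, Rational)
stepl (u, v) d = (u * 3 + fromInteger d, v / 3)

next' :: (Rational, Rational) -> Maybe (Integer, (Rational, Rational))
next' (0, _) = Nothing
next' (u, v) = let y = floor (7 * u * v)
               in Just (y, (u - fromInteger y / (v * 7), v * 7))

-- Cautious producer: emit a digit only when every possible continuation
-- of the input would begin with that same digit.
next'' :: (Rational, Rational) -> Maybe (Integer, (Rational, Rational))
next'' (u, v)
  | (floor (u * v * 7) :: Integer) == floor ((u + 1) * v * 7) = next' (u, v)
  | otherwise = Nothing

-- Streaming conversion from base-3 digits to base-7 digits.
base3to7 :: [Integer] -> [Integer]
base3to7 = fstream next'' (unfoldr next') stepl (0, 1)
```

<p>On the infinite input <code>repeat 1</code>, representing 0.111… in base 3, i.e. 1/2, this produces the base-7 digits 3, 3, 3, … incrementally, consuming only finitely much input per digit.</p>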
<p>
(Actually, the only way this could possibly produce a finite output in base 7 would be for the input to be all zeroes. Why? If we are happy to rule out this case, we could consider only the case of taking infinite input to infinite output, and not have to worry about reaching the end of the input or flushing the state.)</p>
Wed, 15 Nov 2017 12:30:18 +0000Manuel M T Chakravarty: Functional Confhttp://justtesting.org/post/167510227471
http://justtesting.org/post/167510227471
<p>This coming weekend, I will present <a href="https://functionalconf.com/proposal.html?id=3939">Haskell SpriteKit — a Purely Functional API for a Stateful Animation System and Physics Engine</a> as well as a workshop on <a href="https://functionalconf.com/proposal.html?id=4054">Functional Programming in Swift</a> at <a href="https://functionalconf.com/">Functional Conf</a> in Bangalore.</p>Wed, 15 Nov 2017 06:06:40 +0000Functional Jobs: Backend Ruby and Haskell engineer at Health eFilings (Full-time)urn:uuid:d3aba912-abbb-9ced-41b9-6d43c38de7ab
https://functionaljobs.com/jobs/9050-backend-ruby-and-haskell-engineer-at-health-efilings
<p>Our backend engineering team manages the ingestion and normalization of data sets, from data extraction through to product delivery. We want to work smarter instead of harder, and create domain specific languages, meta-programming etc. where possible.</p>
<p>Our current code base is written in Ruby and CoffeeScript, but some new modules are being written in Haskell. You will be on the front lines of creating a Haskell-based infrastructure that is maintainable and can scale to support our needs as we grow.</p>
<p>We currently expect that about 80% of your work will be in Ruby/CoffeeScript, and 20% in Haskell, but that ratio will decrease over time as we move more of our functionality to Haskell. (The faster you can work to migrate functionality to Haskell, the more Haskell you will be doing.)</p>
<p><strong>WHAT WE WILL EXPECT FROM YOU</strong></p>
<p>You will have ownership of an entire module, including responsibility for:</p>
<ul>
<li>Creating new features in a clean and maintainable way</li>
<li>Re-factoring existing code to ensure that we stay agile</li>
<li>Reviewing teammates’ code and providing feedback</li>
<li>Keeping yourself focused and your projects on track</li>
<li>An “I can run through walls” mentality to ensure that goals are met</li>
<li>Answering questions from our implementation team and squashing bugs on a monthly support rotation</li>
</ul>
<p>We are a small team (four engineers), and so it is <em>critical</em> that you be a team player, willing to pitch in and help out your colleagues.</p>
<p><strong>WHAT YOU CAN EXPECT FROM US</strong></p>
<ul>
<li>Autonomy to solve problems in the way you best see fit</li>
<li>A manager who is accountable for ensuring you meet your professional goals</li>
<li>A team who helps each other and always strives to improve</li>
<li>The time to focus on creating the right solution, instead of the easiest one</li>
</ul>
<p><strong>REQUIREMENTS</strong></p>
<ul>
<li>Professional experience as a software engineer</li>
<li>Experience with Haskell and Ruby</li>
<li>A desire for continual self-improvement</li>
<li>An understanding of best practices regarding maintainability and scalability</li>
<li>Must have US work authorization and be located in the US (we cannot sponsor visas at this time)</li>
<li>There are no formal education requirements for this position</li>
</ul>
<p><strong>BONUS POINTS</strong></p>
<ul>
<li>Experience with data scraping and parsing</li>
</ul>
<p><strong>LOCATION</strong></p>
<p>This is expected to be a remote position, although our Madison, Wisconsin office is also available as a work location.</p>
<p>Get information on <a href="https://functionaljobs.com/jobs/9050-backend-ruby-and-haskell-engineer-at-health-efilings">how to apply</a> for this position.</p>Tue, 14 Nov 2017 19:26:52 +0000Tim Docker: Algebraic Data Types in Javahttp://twdkz.wordpress.com/?p=136
https://twdkz.wordpress.com/2017/11/14/algebraic-data-types-in-java/
<p>At <a href="http://www.helixta.com.au/">Helix</a> we often code backend services in java. I find modern java <em>acceptable</em> as a language for getting things done. As a long time haskell developer, however, I find java’s facilities for data types frustrating indeed. These frustrations are twofold. Java lacks support for algebraic data types (<a href="https://en.wikipedia.org/wiki/Algebraic_data_type">ADTs</a>), and requires large amounts of boilerplate to define even simple types.</p>
<p>When designing systems, I place great value in applying the "make illegal states unrepresentable" principle<a href="https://twdkz.wordpress.com/2017/11/14/algebraic-data-types-in-java/#fn1" id="fnref1" class="footnoteRef"><sup>1</sup></a>. Using ADTs to more accurately model data is an excellent step in this direction. However, it’s a burden to do in languages like java that lack support for <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>.</p>
<p>Even for regular product types (i.e. records of fields) java can be tedious. Defining a record of a few fields should really only take a corresponding few lines of code. Yet for a useful value type in java one will generally need to write: constructors, accessors, a comparison function, a hash implementation, serialisation logic etc. It’s common in the java world to use IDEs to automatically generate this kind of boilerplate, but subtle bugs can creep in over time as the once generated code isn’t manually updated to reflect subsequent changes in the data model.</p>
<p>Hence, at Helix we now often use my <a href="https://github.com/timbod7/adl">ADL language</a> to define data types, and generate the corresponding java code from them. As a tiny example, these adl definitions (see complete file <a href="https://github.com/timbod7/adl/blob/master/haskell/compiler/tests/demo1/input/picture.adl">here</a>):</p>
<pre><code>struct Rectangle
{
    Double width;
    Double height;
};

union Picture
{
    Circle circle;
    Rectangle rectangle;
    Vector&lt;Picture&gt; composed;
    Translated&lt;Picture&gt; translated;
};</code></pre>
<p>result in the corresponding <a href="https://github.com/timbod7/adl/blob/master/haskell/compiler/tests/demo1/java-output/adl/picture/Rectangle.java">Rectangle.java</a> and <a href="https://github.com/timbod7/adl/blob/master/haskell/compiler/tests/demo1/java-output/adl/picture/Translated.java">Picture.java</a>. These two definitions alone correspond to 280 lines of java code (that you really don’t want to write and maintain). As can be seen in the <code>Translated<></code> type, <a href="https://en.wikipedia.org/wiki/Parametric_polymorphism">parametric polymorphism</a> is supported.</p>
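<p>For comparison, the corresponding algebraic data types in Haskell fit in a handful of lines. The sketch below is hand-written, and fills in <code>Circle</code> and <code>Translated</code>, which come from parts of the ADL file not quoted above, so their fields here are assumptions:</p>

```haskell
-- Hand-written Haskell analogue of the ADL definitions above.
-- Circle and Translated are assumed shapes, not taken from the ADL file.
data Rectangle = Rectangle { width :: Double, height :: Double }
  deriving (Eq, Ord, Show)

newtype Circle = Circle { radius :: Double }
  deriving (Eq, Ord, Show)

data Translated a = Translated { xoff :: Double, yoff :: Double, object :: a }
  deriving (Eq, Ord, Show)

data Picture
  = PCircle Circle
  | PRectangle Rectangle
  | PComposed [Picture]
  | PTranslated (Translated Picture)
  deriving (Eq, Ord, Show)
```

<p>Equality, ordering and printing come for free from <code>deriving</code> — the kind of boilerplate that ADL generates for the Java backend.</p>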
<p>I find that being able to define data types concisely encourages me to build more accurate data models, resulting in systems that are more robust and better reflect the problem domain. And ADL’s multi language support (<a href="https://github.com/timbod7/adl/blob/master/doc/backend-java.md">java</a>, <a href="https://github.com/timbod7/adl/blob/master/doc/backend-haskell.md">haskell</a>, <a href="https://github.com/timbod7/adl/blob/master/doc/backend-typescript.md">typescript</a>) allows us to easily serialize and transfer the corresponding data values between our java services, and our typescript web and mobile UIs.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn1">
<p>attributed to Yaron Minsky<a href="https://twdkz.wordpress.com/2017/11/14/algebraic-data-types-in-java/#fnref1"><img src="https://s0.wp.com/wp-content/mu-plugins/wpcom-smileys/twemoji/2/72x72/21a9.png" alt="↩" style="height: 1em;" class="wp-smiley" /></a></p>
</li>
</ol>
</div>Mon, 13 Nov 2017 21:53:40 +0000Functional Jobs: Scala Developer at LeadIQ (Full-time)urn:uuid:4b50f6db-da7e-e2a5-11f0-e3de6c37fc42
https://functionaljobs.com/jobs/9049-scala-developer-at-leadiq
<p>Are you the type of engineer who punches juke boxes to make the music start? Do you consider riding your motorcycle off into the sunset a personal hobby? Is architecting a system from the ground up no big deal to you? We're looking for a full-time Scala developer to make this happen.</p>
<p><strong>The Product</strong></p>
<p>We are on a mission to revolutionize Sales industry using data science. Our product helps our customers to collect and enrich their target prospects. Our internal data processing combines human intelligence and data science to enable our customers to find perfect contact information and save to their existing platforms like Salesforce, etc.</p>
<p><strong>The Challenge</strong></p>
<ul>
<li><p>We are at an exciting stage in our growth. We are getting traction with big customers, scaling out, and solving increasingly complex engineering problems.</p></li>
<li><p>Our systems are mostly written in Scala. We have used Kafka as backbone to communicate between our API server and micro-services. Smart architecture design is crucial in order to guarantee our micro-services based systems run smoothly and reliably. </p></li>
<li><p>We're looking for someone who can drive our product's backend integration features, refactor existing code for faster responses, and become an important asset to the rest of the engineering team.</p></li>
<li><p>Data quality is one of the critical factors in making our product successful. We often need to process third-party data and clean existing data using Spark, so you need to be comfortable writing Spark scripts.</p></li>
<li><p>We have very complex integrations with third-party systems such as Salesforce. These integrations are core to what we offer our customers. We're looking for someone who is willing to listen to customer feedback to improve existing features and provide new ones for customer success.</p></li>
</ul>
<p><strong>The Stack</strong></p>
<p>Scala, Kafka, Spark, MongoDB, ElasticSearch, Docker, Vue.js</p>
<p><strong>The Team</strong></p>
<p>We want team members with attributes like:</p>
<ul>
<li>Focus on delivering value to the customer </li>
<li>Strong belief in collaboration </li>
<li>Passion that drives you to execute and innovate </li>
<li>Ability to self-manage and take ownership of a feature </li>
<li>Ability to juggle many projects and responsibilities</li>
<li>Extremely entrepreneurial and self-driven</li>
<li>Not afraid of a steep learning curve </li>
<li>Passionate about building a big business that transforms the sales industry</li>
<li>Exceptional at writing scalable, production-ready code</li>
<li>Thrive in a fast-paced environment</li>
<li>Avoid over-engineering </li>
<li>Simple designs and fast execution </li>
<li>Discipline in following process and documenting your work</li>
</ul>
<p>These personality traits define the culture we are building and are more important to us than a particular set of technical skills.</p>
<p><strong>The Responsibilities</strong></p>
<p>If you join LeadIQ, you will learn a lot: on the technical side there are many cool tools, technologies, patterns, and great fellow developers to sharpen your skills. Personally, you will be given the chance to step up, lead, and make your mark in a growing startup as we tackle the challenges of our next phase of growth.</p>
<p>On the technical front, we need you to be skilled in:</p>
<ul>
<li>Scala (but experience in another functional language helps, e.g. Haskell or Clojure) </li>
<li>Play framework </li>
<li>Concurrency (futures, actors, basic understanding of threads)</li>
</ul>
<p>So if you feel like you're a good fit for us, drop us a line! We love meeting developers who are excited by our product!</p>
<p>Get information on <a href="https://functionaljobs.com/jobs/9049-scala-developer-at-leadiq">how to apply</a> for this position.</p>Mon, 13 Nov 2017 14:46:16 +0000Michael Snoyman: Future proofing test suiteshttps://www.snoyman.com/blog/2017/11/future-proofing-test-suites
https://www.snoyman.com/blog/2017/11/future-proofing-test-suites
<p>I'll start with the specific case I've seen pop up a few times
recently, and then expand to the general. If you're a package author
who has been affected by this, please note: I'm putting this
information into a blog post since it's easier to state this once and
link to it rather than rewrite an explanation on lots of different bug
trackers.</p><p><a href="https://www.stackage.org/package/hlint">hlint</a> is a great tool for
getting advice on improving your Haskell codebase (another great Neil
Mitchell product). And as such tools go, hlint has new versions which
improve its ability to provide useful advice. This means that,
sometimes, code which triggered no hlint warnings previously may
suddenly present with such warnings under a new hlint version.</p><p>Twice recently in my Stackage curation, I've seen a number of test
suites fail, even though the code for those packages was
unmodified. It turns out that the upgrade to a new version of hlint
caused a previously successful test suite to now fail. Clearly the
code isn't suddenly broken because a new version of hlint has been
released, but as far as the diagnostics of test suite failures are
concerned, that's exactly what happened.</p><h2 id="recommendation">Recommendation</h2><p>I do strongly recommend projects use hlint to get code
improvements. And I've seen some great results with using it as part
of the CI process, such as on Stack. (For the record: it wasn't my
idea and I didn't implement it. I was just pleasantly surprised when
my PRs failed because I had some style errors.) However, making the
test suite for the entire package fail because of a new version of
hlint is too much. Therefore:</p><ul><li><p><b>DO</b> Have some way to run hlint from your CI process, if you
want these warnings to block PRs. There are two approaches I can
think of:</p><ul><li>The way Stack does it: have a
<a href="https://github.com/commercialhaskell/stack/blob/46121be1b96465f1164e3f84cafa19c7369da9cc/.travis.yml#L39">separate part of the build matrix</a>
just for style errors. The cabal file for the project itself
knows nothing about hlint.</li><li>Via a test suite in your cabal file which is disabled by
default. Then: turn on that test suite with a flag from your CI
configuration.</li></ul></li><li><p><b>DON'T</b> Set up the package you upload to Hackage (or that Stackage builds) so that it fails when a style-based error occurs.</p></li></ul><h2 id="general-recommendation">General recommendation</h2><p>The general takeaway from this is: when you're building your code on
CI, be as strict as you want. Set high standards, block PRs, call
master broken, for whatever trivial or non-trivial issues you deem
worthy. Turn on <code>-Wall -Werror</code>, respect hlint, error out if someone
uses tabs* or includes trailing whitespace. That's all good.</p><p>* Cue necessary tabs-vs-spaces argument</p><p><i>However</i>, when you're releasing your code elsewhere, make the tests
as lenient as possible on optional features. If the code fails to
build: that's a problem. If the code builds, but returns incorrect
runtime results: that's a problem. These should stop build systems
like Stackage from including your package. But stylistic issues, or
newly introduced warnings from the compiler, or myriad other issues,
should not trigger a failure for downstream consumers of your package.</p>Sun, 12 Nov 2017 17:00:00 +0000Neil Mitchell: Ghcid with VS Codetag:blogger.com,1999:blog-7094652.post-6143070707802774065
http://neilmitchell.blogspot.com/2017/11/ghcid-with-vs-code.html
<p><em>Summary: New versions of Ghcid and the VS Code extension work even better together.</em></p><p>I've just released <a href="https://hackage.haskell.org/package/ghcid">Ghcid v0.6.8</a> and the associated VS Code extension <a href="https://marketplace.visualstudio.com/items?itemName=ndmitchell.haskell-ghcid">haskell-ghcid v0.2.0</a>. Together they vastly simplify the Ghcid VS Code experience.</p><p><strong>Ghcid reads .ghcid files</strong></p><p>A new feature in Ghcid is that if there is a <code>.ghcid</code> file in the current directory it will load it as additional arguments. For example, in the Shake repo I have <a href="https://github.com/ndmitchell/shake/blob/master/.ghcid">a <code>.ghcid</code> file</a>:</p><pre><code>-c "ghci -fno-code -ferror-spans"<br /></code></pre><p>Which tells <code>ghcid</code> to not guess at the command (e.g. using <code>stack</code> if you have a <code>.stack-work</code>) but always run <code>ghci -fno-code -ferror-spans</code>. This command works because I have <a href="https://github.com/ndmitchell/shake/blob/master/.ghci">a <code>.ghci</code> file</a> which loads all the necessary files, while <code>-fno-code</code> speeds up compilation and <code>-ferror-spans</code> gives better error highlighting.</p><p><strong>Ghcid VS Code starts ghcid</strong></p><p>A new feature in the VS Code extension is the action <code>Start Ghcid</code> which starts a new <code>ghcid</code> terminal, writes the output to a temporary file, and uses that output to populate the Problems pane. 
Importantly, the extension runs <code>ghcid</code> with no command line arguments, so having a sensible <code>.ghcid</code> lets you control what it does.</p><p>The effect of these changes is that starting <code>ghcid</code> in VS Code now takes a few keystrokes, whereas before it required special flags, opening files, running commands, etc.</p>Fri, 10 Nov 2017 23:06:00 +0000noreply@blogger.com (Neil Mitchell)Tweag I/O: Nix on the <br/> Windows Subsystem for Linuxhttp://www.tweag.io/posts/2017-11-10-nix-on-wsl.html
http://www.tweag.io/posts/2017-11-10-nix-on-wsl.html
<div>Jonas Chevalier</div><p>Nix on Windows: does it run yet? That's the question I wondered about
while testing the latest NixOS release, version 17.09. To that end,
I had the idea of running the Nix installation process from inside
the <a href="https://msdn.microsoft.com/en-gb/commandline/wsl/about">Windows Subsystem for Linux (WSL)</a> to see if it worked. And it
worked! Success!</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Testing the NixOS 17.09 release under Windows Subsystem for Linux. Works like a charm. <a href="https://t.co/19qvXjDpDv">pic.twitter.com/19qvXjDpDv</a></p>— zimbatm (@zimbatm) <a href="https://twitter.com/zimbatm/status/911992348858601474?ref_src=twsrc%5Etfw">September 24, 2017</a></blockquote>
<p>So what does this mean?</p>
<p>You might remember that
the
<a href="https://en.wikipedia.org/wiki/Microsoft_POSIX_subsystem">Windows NT kernel used to have a POSIX layer</a>.
Unfortunately, the POSIX layer always had compatibility issues with
BSD and Linux software, because typical applications seldom fit
entirely within the confines of an age-old API.
Nevertheless, the NT kernel was designed from the start to support
different subsystems, not just Win32, and the POSIX layer of old was
a step in the right direction. The WSL is a revival of that idea but
with a specific focus on the Linux ABI. It means that it is now
possible to run Linux software natively on Windows. Think of it as
reverse <a href="https://www.winehq.org/">Wine</a>. Linux software can execute Windows software and
<em>vice versa</em>.</p>
<p>It's not perfect yet. I/O and symlink resolution seem to be slow, and not all Linux syscalls have been implemented yet. For now it is more a glimpse of the promised land Microsoft is pointing to. WSL is not available on the server edition yet, but it looks like they are going to deliver on it.</p>
<p>At Tweag.io we often use Nix to declaratively specify reproducible build environments for our projects and those of our clients. Nix is a good fit for projects that mix different languages: it works really well at providing reproducible builds and at composing the various parts of a project with its external dependencies. Unfortunately, it is not supported on Windows, so we have had to decide upfront whether to use it, based in part on whether Windows was going to become a target platform. Thanks to WSL, it looks like we will have an escape hatch, at least for non-graphical applications.</p>
<p>Another potential use-case that I see is for Haskell development. Today, a lot of good software is being developed directly on top of Linux and macOS. For some of these projects Windows is not a prime target environment anymore. The Glasgow Haskell Compiler (GHC) is actually quite well behaved on Windows when compiling pure Haskell code. But as soon as C library dependencies are involved, the story gets a lot more complicated. In that case, deploying via WSL might just be easier than aiming for a native Windows port.</p>
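<p>To make the C-dependency point concrete, here is a minimal sketch (my illustration, not from the post): even a single <code>foreign import</code> ties a Haskell build to a platform C toolchain, which is exactly where native Windows ports tend to get complicated.</p>

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Illustrative only: bind one symbol from the C math library.
-- On Linux (or WSL) this links out of the box; on native Windows it
-- already raises questions about which C runtime and headers to use.
foreign import ccall unsafe "math.h sin"
  c_sin :: Double -> Double

main :: IO ()
main = print (c_sin 0)
```

<p>Under WSL this behaves exactly as on Linux, which is the escape hatch described above.</p>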
<h2>How to install</h2>
<p>Enable and install WSL by following
<a href="https://msdn.microsoft.com/en-us/commandline/wsl/install_guide">Microsoft's installation guide</a>.</p>
<p>Make sure to have the latest version of Windows 10 installed. I had this version at the time of install:</p>
<ul>
<li><strong>Windows Edition:</strong> Windows 10 Pro</li>
<li><strong>Windows Version:</strong> 1703</li>
<li><strong>Windows OS Build:</strong> 15063.540</li>
<li><strong>System Type:</strong> 64-bit operating system</li>
</ul>
<p>Start the “Bash On Ubuntu On Windows” program and type <code>curl https://nixos.org/nix/install | sh</code>.</p>
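<p>Once the installer finishes, a quick sanity check helps confirm the tools are usable. This is a sketch that assumes the default single-user install, which drops a profile script under <code>~/.nix-profile</code>:</p>

```shell
# Source the profile script the Nix installer creates, if present.
profile="$HOME/.nix-profile/etc/profile.d/nix.sh"
[ -f "$profile" ] && . "$profile"

# Report whether the Nix tools made it onto PATH.
if command -v nix-env >/dev/null 2>&1; then
  echo "nix is on PATH: $(nix-env --version)"
else
  echo "nix-env not found; try 'source ~/.profile' and retry"
fi
```

<p>Either branch prints a diagnostic, so the script is safe to run before or after a successful install.</p>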
<h2>Known issues</h2>
<p>WSL is still an experimental subsystem, and at this point in time
there are important issues to know about. Here are the workarounds
I came up with:</p>
<ul>
<li><strong><code>curl</code> is hanging.</strong> Hit Ctrl+C and retry.</li>
<li><strong>The Nix installation crashes.</strong> Older versions of WSL didn't support all
the syscalls needed by Nix. Update Windows and try again.</li>
<li><strong><code>nix-shell</code> is broken.</strong> It fails with a synchronous I/O disk error
(see <a href="https://github.com/NixOS/nix/issues/1203">NixOS/nix#1203</a>). Here's a workaround: edit
<code>/etc/nix/nix.conf</code> and add <code>use-sqlite-wal = false</code>.</li>
<li><strong>It’s slow.</strong> Yes, especially I/O and symlinks seem to be quite
slow. The only solution here is to wait for Microsoft to optimise
their syscalls.</li>
<li><strong>The Nix environment is not started in new logins.</strong> Workaround: run
<code>source ~/.profile</code>.</li>
</ul>
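<p>The <code>nix-shell</code> workaround in the list above can be scripted idempotently. This sketch edits a local copy so it is safe to try anywhere; on a real WSL system point <code>conf</code> at <code>/etc/nix/nix.conf</code> and run it as root:</p>

```shell
# Append `use-sqlite-wal = false` to a nix.conf, but only once.
conf=./nix.conf          # on a real system: /etc/nix/nix.conf (as root)
touch "$conf"
grep -q '^use-sqlite-wal' "$conf" || echo 'use-sqlite-wal = false' >> "$conf"
cat "$conf"
```

<p>Re-running it leaves a single <code>use-sqlite-wal</code> line, so it is safe to run repeatedly.</p>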
<h2>Conclusion</h2>
<p>For now, it's just a technology preview that opens new possibilities.
Hopefully in the future, when the performance of I/O operations
improves, it will also be enjoyable to develop Linux programs under
WSL directly. Meanwhile, Microsoft has put out useful resources to go
further with WSL:</p>
<ul>
<li>the <a href="https://msdn.microsoft.com/en-gb/commandline/wsl/faq">WSL FAQ</a>,</li>
<li>the <a href="https://github.com/Microsoft/BashOnWindows">Github project</a>.</li>
</ul>Fri, 10 Nov 2017 00:00:00 +0000Douglas M. Auclair (geophf): October 2017 1Liner 1HaskellADay problems and solutionstag:blogger.com,1999:blog-4650294074444534066.post-412131735509851009
http://logicaltypes.blogspot.com/2017/11/october-2017-1liner-1haskelladay.html
<ul style="color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 20th, 2017:<br />You have a list of numbers: [1,2,3,4]<br />You have a list of the same length of number fns: [succ, id, id, succ]<br />You want: [2,2,3,5]</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;"> 🇪🇺 Cλément D 🌈 🐇 @clementd zipWith (flip ($)) ?</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;"> he adds: `zipWith (flip id)` is a bit shorter tho</li></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Simon Courtenage @SCourtenage zipWith ($) [succ,id,id,succ] [1,2,3,4]</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">lukasz @lukaszklekot getZipList $ ZipList [succ, id, id, succ] <*> ZipList [1, 2, 3, 4]</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Alexey Radkov @sheshanaag (map (uncurry ($)) .) . 
zip</li></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 5th, 2017: "reverse the sequencing"<br />You have [[(1,2),(1,3),(1,7)],[(9,2)],[(11,3)]]<br />You want [(1,[2,3,7]),(9,[2]),(11,[3])]</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">bazzargh @bazzargh map ((,) <$> head.(map fst) <*> (map snd))</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">bazzargh @bazzargh map ((first head).unzip)</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Chris Martin @chris__martin \x -> [(a, b : fmap snd xs) | Just ((a, b) :| xs) <- fmap="" li="" nonempty="" x=""></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Simon Courtenage @SCourtenage fmap (\x -> (fst . head $ x, fmap snd x))</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Denis Stoyanov 🐜 @xgrommx Your solution nice) but u can do it with point free style like</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">fmap(fst.head &&& fmap snd)</li></ul></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Denis Stoyanov 🐜 @xgrommx My solution is ugly, but I wanna to solve it with traverse)</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">fmap(first head . traverse (first (:[])))</li></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Andreas Källberg @Anka213 map$fst.head&&&map snd</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Scott Fleischma @scottfleischman<br />traverse<br /> $ _1<br /> (\case<br /> [y] -> Just y<br /> _ -> Nothing<br /> . nub<br /> )<br /> . 
unzip<br /> :: [[(Int, Int)]] -> Maybe [(Int, [Int])]</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Scott Fleischman @scottfleischman<br />let<br /> sing [] = Left "Too few"<br /> sing [x] = Right x<br /> sing (_ : _) = Left "Too many"<br /> valid = sing . nub<br /> go = _1 valid . unzip<br />in traverse go</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">matt @themattchan map ((head *** id ) . unzip)</li></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 3rd, 2017:<br />you have [(1,[2,3,4]),(10,[5,6,7])]<br />you want [(1,2),(1,3),(1,4),(10,5),(10,6),(10,7)]<br /><br />or, generally: [(a,[b])] -> [(a,b)]<br /><br />Go!</li><br /><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">bazzargh @bazzargh (uncurry (zip . repeat) =<<)</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Bruno @Brun0Cad (=<<) sequence</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Denis Stoyanov 🐜 @xgrommx fmap (uncurry (liftA2(,) . (:[])))</li><ul style="line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Darren G @Kludgy I like that this doesn't unnecessarily implicate the sequentiality of bind.</li></ul><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">Darren G @Kludgy Funny this same product came up at work last week.<br />concatMap $ \(a,bs) -> fmap (\b -> (a,b)) bs</li></ul></ul>Sun, 05 Nov 2017 04:35:49 +0000noreply@blogger.com (geophf)Neil Mitchell: Understanding HLint rulestag:blogger.com,1999:blog-7094652.post-3706216615062285076
http://neilmitchell.blogspot.com/2017/11/understanding-hlint-rules.html
<p><em>Summary: I added a degenerate foldr to map rule in the new version of HLint, here I describe how it works.</em></p><p>I've just released <a href="https://hackage.haskell.org/package/hlint-2.0.10">HLint 2.0.10</a>, which includes a rule to recognise uses of <code>foldr</code> that should really be <code>map</code>. As an example:</p><pre><code>foldr (\curr acc -> (+1) curr : acc) []<br /></code></pre><p>Can be rewritten as:</p><pre><code>map (\curr -> (+1) curr)<br /></code></pre><p>Which is much more readable (and then subsequently HLint will suggest <code>map (+1)</code>, which is vastly clearer than the initial <code>foldr</code>). The change required to HLint was to add a rule to the <a href="https://github.com/ndmitchell/hlint/blob/master/data/hlint.yaml"><code>hlint.yaml</code></a> saying:</p><pre><code>- warn: {lhs: "foldr (\\c a -> x : a) []", rhs: "map (\\c -> x)"}<br /></code></pre><p>You can read this statement as saying if you see <code>foldr (\c a -> x : a) []</code>, suggest <code>map (\c -> x)</code> as a warning. The HLint matching engine then applies that template to every subexpression in your program. In the rest of the post I'll talk through the steps HLint performs.</p><p><strong>Step 1: Unification</strong></p><p>The first step is to try unifying the template <code>foldr (\c a -> x : a) []</code> against the users subexpression, namely <code>foldr (\curr acc -> (+1) curr : acc) []</code>. HLint is trying to find assignments for the single-letter variables in the template (namely <code>c</code>, <code>a</code> and <code>x</code>) which cause it to match the subexpression. Unification proceeds top-down, and if it finds anything concrete that does not match (e.g. the user had written <code>foldl</code>) then it fails. 
In this case the unification succeeds with the bindings:</p><ul><li><code>c</code> = <code>curr</code> (from the first argument to the lambda)</li><li><code>a</code> = <code>acc</code> (from the second argument to the lambda)</li><li><code>x</code> = <code>(+1) curr</code> (from before the cons)</li><li><code>a</code> = <code>acc</code> (from after the cons)</li></ul><p>An example of a subexpression that would have failed unification is <code>foldl (\curr acc -> (+1) curr : acc) []</code>.</p><p><strong>Step 2: Validity</strong></p><p>The next step is to check that any value which has been bound more than once is equal in all bindings. In our case only <code>a</code> has been used twice, and it always binds to <code>acc</code>, so the unification is valid.</p><p>An example of a subexpression that would have failed validity is <code>foldr (\curr acc -> (+1) curr : xs) []</code>.</p><p><strong>Step 3: Substitution</strong></p><p>Now we've got some bindings, we can substitute them into the RHS, namely <code>map (\c -> x)</code>. We replace <code>c</code> and <code>x</code> using the bindings above. Note that <code>a</code> isn't mentioned on the RHS, so we don't use it. After substitution we get:</p><pre><code>map (\curr -> (+1) curr)<br /></code></pre><p><strong>Step 4: Free variable check</strong></p><p>Consider the expression <code>foldr (\curr acc -> f acc : acc) []</code>. Using the rules above we'd end up with <code>map (\curr -> f acc)</code>, which is terrible, since we've gone from referring to a locally bound <code>acc</code> to whatever <code>acc</code> is in scope (if any). 
To solve that, we check that the result doesn't introduce any new free variables:</p><pre><code>(freeVars result \\ freeVars hintRuleRHS) `isSubsetOf` freeVars original<br /></code></pre><p>Specifically any free variables introduced in the result, which weren't in the RHS (excluding the fake unification variables), must have been in the original subexpression.</p><p>With that, for <code>foldr</code>, we're done. There are a handful of other steps that apply in some cases.</p><p><strong>Step A: Dot expansion in the template</strong></p><p>If you write a hint <code>map f (map g x) ==> map (f . g) x</code> then HLint notices that also implies the rule <code>map f . map g ==> map (f . g)</code> and adds it. As a result, you shouldn't write your HLint rules in point-free style.</p><p><strong>Step B: Dot/dollar expansion in the subexpression</strong></p><p>When matching a subexpression HLint will expand <code>f $ x</code> and <code>(f . g) x</code> if doing so results in a match. These operators are used commonly enough that they are often treated more like brackets than functions.</p><p><strong>Step C: Scope matching</strong></p><p>When unifying qualified function names, HLint uses the active imports to guess whether they match. If you have <code>import qualified Data.Vector as V</code> then the subexpression <code>V.length</code> will unify with <code>Data.Vector.length</code>. Since HLint doesn't have complete import information it uses a few heuristics to figure out matching.</p><p><strong>Step D: Scope moving</strong></p><p>Similarly to scope matching on the LHS of a rule, after matching, HLint tries to requalify any necessary values on the RHS. 
As an example, assuming we are producing <code>Data.Vector.null</code>, if we know about <code>import qualified Data.Vector as V</code> then we suggest <code>V.null</code>.</p><p><strong>Full code</strong></p><p>To see the full code and all supporting definitions go to <a href="https://github.com/ndmitchell/hlint/blob/f4466eed8a8bf6beccfd11052f2e3cfb074f2b44/src/Hint/Match.hs#L100-L114">the HLint source</a>, which defines <code>matchIdea</code> - here I show a gently simplified version. Given scope information, a rule (LHS and RHS) and a subexpression, we optionally produce a resulting expression after substitution.</p><pre><code>matchIdea :: Scope -> HintRule -> Exp_ -> Maybe Exp_<br />matchIdea s HintRule{..} original = do<br /> u <- unifyExp hintRuleLHS original<br /> u <- validSubst u<br /> -- need to check free vars before unqualification, but after subst (with e)<br /> -- need to unqualify before substitution (with res)<br /> let result = substitute u hintRuleRHS<br /> guard $ (freeVars result Set.\\ Set.filter (not . isUnifyVar) (freeVars hintRuleRHS))<br /> `Set.isSubsetOf` freeVars original<br /> -- check no unexpected new free variables<br /> return result<br /></code></pre>Sat, 04 Nov 2017 12:07:00 +0000noreply@blogger.com (Neil Mitchell)Gabriel Gonzalez: Semantic integrity checks are the next generation of semantic versioningtag:blogger.com,1999:blog-1777990983847811806.post-4671919550588694847
http://www.haskellforall.com/2017/11/semantic-integrity-checks-are-next.html
<html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/> <meta content="text/css" http-equiv="Content-Style-Type"/> <meta content="pandoc" name="generator"/> <title></title> <style type="text/css">code{white-space: pre;}</style> <style type="text/css">div.sourceCode { overflow-x: auto; } table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode { margin: 0; padding: 0; vertical-align: baseline; border: none; } table.sourceCode { width: 100%; line-height: 100%; } td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; } td.sourceCode { padding-left: 5px; } code > span.kw { color: #007020; font-weight: bold; } /* Keyword */ code > span.dt { color: #902000; } /* DataType */ code > span.dv { color: #40a070; } /* DecVal */ code > span.bn { color: #40a070; } /* BaseN */ code > span.fl { color: #40a070; } /* Float */ code > span.ch { color: #4070a0; } /* Char */ code > span.st { color: #4070a0; } /* String */ code > span.co { color: #60a0b0; font-style: italic; } /* Comment */ code > span.ot { color: #007020; } /* Other */ code > span.al { color: #ff0000; font-weight: bold; } /* Alert */ code > span.fu { color: #06287e; } /* Function */ code > span.er { color: #ff0000; font-weight: bold; } /* Error */ code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ code > span.cn { color: #880000; } /* Constant */ code > span.sc { color: #4070a0; } /* SpecialChar */ code > span.vs { color: #4070a0; } /* VerbatimString */ code > span.ss { color: #bb6688; } /* SpecialString */ code > span.im { } /* Import */ code > span.va { color: #19177c; } /* Variable */ code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ code > span.op { color: #666666; } /* Operator */ code > span.bu { } /* BuiltIn */ code > span.ex { } /* Extension */ code > span.pp { color: #bc7a00; } /* Preprocessor */ code > span.at { 
color: #7d9029; } /* Attribute */ code > span.do { color: #ba2121; font-style: italic; } /* Documentation */ code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ </style></head><body><p>The <a href="https://github.com/dhall-lang/dhall-lang">Dhall configuration language</a> just added support for "semantic integrity checks". This post explains what "semantic integrity check" means, motivates the new feature, and compares to semantic versioning.</p><h2 id="the-problem">The problem</h2><p>I added this feature in response to user concerns about code injection in Dhall configuration files.</p><p>We'll illustrate the problem using the following <code>example.dhall</code> configuration file which derives a summary of student information from a list of students:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"> <span class="co">-- Example of an expression imported by URL</span><br /> <span class="kw">let</span> map <span class="fu">=</span> http<span class="fu">://</span>prelude<span class="fu">.</span>dhall<span class="fu">-</span>lang<span class="fu">.</span>org<span class="fu">/</span><span class="dt">List</span><span class="fu">/</span>map<br /><br /> <span class="co">-- Example of an expression imported by path</span><br /><span class="kw">in</span> <span class="kw">let</span> students <span class="fu">=</span> <span class="fu">./</span>students<span class="fu">.</span>dhall<br /><br /><span class="kw">in</span> <span class="kw">let</span> getName <span class="fu">=</span> λ(student <span class="fu">:</span> { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> }) <span class="ot">→</span> student<span class="fu">.</span>name<br /><br 
/><span class="kw">in</span> { classSize <span class="fu">=</span> <span class="dt">List</span><span class="fu">/</span>length { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> } students<br /> , names <span class="fu">=</span> map { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> } <span class="dt">Text</span> getName students<br /> }</code></pre></div><p>This configuration imports a helper function named <code>map</code> from the Dhall Prelude by URL:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"> <span class="kw">let</span> map <span class="fu">=</span> http<span class="fu">://</span>prelude<span class="fu">.</span>dhall<span class="fu">-</span>lang<span class="fu">.</span>org<span class="fu">/</span><span class="dt">List</span><span class="fu">/</span>map<br /><br /><span class="kw">in</span> <span class="fu">...</span></code></pre></div><p>... 
and that URL currently hosts a text file encoding the following Dhall function:</p><pre class="shell"><code>$ curl -L http://prelude.dhall-lang.org/List/map<br />{-<br />Tranform a list by applying a function to each element<br /><br />Examples:<br /><br />./map Natural Bool Natural/even ([+2, +3, +5] : List Natural)<br />= [True, False, False] : List Bool<br /><br />./map Natural Bool Natural/even ([] : List Natural)<br />= [] : List Bool<br />-}<br />let map : ∀(a : Type) → ∀(b : Type) → (a → b) → List a → List b<br /> = λ(a : Type)<br /> → λ(b : Type)<br /> → λ(f : a → b)<br /> → λ(xs : List a)<br /> → List/build<br /> b<br /> ( λ(list : Type)<br /> → λ(cons : b → list → list)<br /> → List/fold a xs list (λ(x : a) → cons (f x))<br /> )<br /><br />in map</code></pre><p>Similarly, our example configuration imports student data from another configuration file by path:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="fu">...</span><br /><br /><span class="kw">in</span> <span class="kw">let</span> students <span class="fu">=</span> <span class="fu">./</span>students<span class="fu">.</span>dhall<br /><br /><span class="fu">...</span></code></pre></div><p>... 
and we'll assume that file contains the following list of student records:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">[ { name <span class="fu">=</span> <span class="st">"Jane Doe"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br />, { name <span class="fu">=</span> <span class="st">"John Rivera"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">18</span> }<br />, { name <span class="fu">=</span> <span class="st">"Alice O'Hare"</span>, age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br />]</code></pre></div><p>Values, functions, and types are all Dhall expressions, so we can inject all of them in our code via URLs or paths. When we interpret a Dhall configuration file these imports get substituted with their contents and then we evaluate the fully resolved configuration file as an expression in a functional language:</p><pre class="shell"><code>$ dhall <<< './example.dhall' | dhall-format<br />{ classSize : Natural, names : List Text }<br /><br />{ classSize = +3<br />, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text<br />}</code></pre><p>Users were concerned that these imports could be compromised, resulting in malicious code injection</p><h2 id="the-solution">The solution</h2><p>The latest release of Dhall added support for import integrity checks to address user concerns about malicious tampering. 
We can use these integrity checks to "freeze" our imports by adding a SHA-256 hash after each import.</p><p>First, we ask the <code>dhall-hash</code> utility to compute the current hash for our imports:</p><pre class="shell"><code>$ dhall-hash <<< 'http://prelude.dhall-lang.org/List/map'<br />sha256:3063e9b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbb<br />$ dhall-hash <<< './students.dhall' <br />sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06</code></pre><p>... and then we append the hash after each import to freeze the import:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"> <span class="kw">let</span> map <span class="fu">=</span><br /> http<span class="fu">://</span>prelude<span class="fu">.</span>dhall<span class="fu">-</span>lang<span class="fu">.</span>org<span class="fu">/</span><span class="dt">List</span><span class="fu">/</span>map sha256<span class="fu">:</span>3<span class="fl">063e9</span>b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbb<br /><br /><span class="kw">in</span> <span class="kw">let</span> students <span class="fu">=</span><br /> <span class="fu">./</span>students<span class="fu">.</span>dhall sha256<span class="fu">:</span>6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06<br /><br /><span class="kw">in</span> <span class="kw">let</span> getName <span class="fu">=</span> λ(student <span class="fu">:</span> { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> }) <span class="ot">→</span> student<span class="fu">.</span>name<br /><br /><span class="kw">in</span> { classSize <span class="fu">=</span> length { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> } students<br /> , names <span class="fu">=</span> map { name <span class="fu">:</span> <span class="dt">Text</span>, age <span 
class="fu">:</span> <span class="dt">Natural</span> } <span class="dt">Text</span> getName students<br /> } </code></pre></div><p>Once you add these integrity checks the Dhall interpreter will enforce them when resolving imports. In this case, the example configuration still successfully evaluates to the same result after adding the integrity checks:</p><pre class="shell"><code>$ dhall <<< './example.dhall' | dhall-format<br />{ classSize : Natural, names : List Text }<br /><br />{ classSize = +3<br />, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text<br />}</code></pre><p>The integrity check passes because we haven't yet modified any of our imports.</p><h2 id="semantic-integrity">Semantic integrity</h2><p>Once you freeze an import with a hash, Dhall guarantees that the <em>meaning</em> of the import never changes. These are <em>semantic</em> hashes, not textual hashes.</p><p>For example, suppose that we modify <code>./students.dhall</code> to add a comment, reorder record fields, and modify the formatting, like this:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="co">-- Class of 2017</span><br /><br />[ { age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span>, name <span class="fu">=</span> <span class="st">"Jane Doe"</span> },<br /> { name <span class="fu">=</span> <span class="st">"John Rivera"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">18</span> },<br /> { name <span class="fu">=</span> <span class="st">"Alice O'Hare"</span>, age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> } ]</code></pre></div><p>These changes do not affect the computed hash of the file and the interpreter still accepts the <code>./students.dhall</code> import that we protected with an integrity check:</p><pre class="shell"><code>$ dhall <<< './example.dhall' | dhall-format # Still succeeds<br />{ classSize : Natural, 
names : List Text }<br /><br />{ classSize = +3<br />, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text<br />}</code></pre><p>The Dhall interpreter accepted the import of <code>./students.dhall</code> because the semantic hash never changed:</p><pre class="shell"><code>$ dhall-hash <<< './students.dhall' <br />sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06</code></pre><p>However, now suppose we try to change the substance of the file by modifying John's age:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="co">-- Class of 2017</span><br /><br />[ { age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span>, name <span class="fu">=</span> <span class="st">"Jane Doe"</span> },<br /> { name <span class="fu">=</span> <span class="st">"John Rivera"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">20</span> },<br /> { name <span class="fu">=</span> <span class="st">"Alice O'Hare"</span>, age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> } ]</code></pre></div><p>Now the semantic integrity check fails:</p><pre class="shell"><code>$ dhall <<< './example.dhall'<br /><br />Error: Import integrity check failed<br /><br />Expected hash:<br /><br />↳ 6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06<br /><br />Actual hash:<br /><br />↳ 808d921914de5349f50ac656bed93c2894dfe35401991e1ca0c89861834023fb</code></pre><p>Dhall recognizes that this is no longer the same expression and rejects the import. Only an import that represents the same value can pass the check.</p><p>This means, for example, that malicious users cannot tamper with our imports, even if we were to distribute the imported code over an insecure channel. 
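The mechanism can be sketched in miniature: a semantic hash is a hash of an expression's β-normal form rather than of its source text. Below is a toy Haskell sketch of that idea; the `Expr` type, `normalize`, and the FNV-style `semanticHash` are illustrative inventions, not Dhall's actual implementation (which resolves imports and hashes the normalized expression with SHA-256):

```haskell
-- Toy semantic integrity check: hash the normal form, not the source.
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

data Expr
  = Lit Integer
  | Add Expr Expr
  | Let String Expr Expr  -- let x = e1 in e2
  | Var String
  deriving Show

-- Normalize: inline let-bindings and fold constant additions.
normalize :: Expr -> Expr
normalize (Add a b) =
  case (normalize a, normalize b) of
    (Lit x, Lit y) -> Lit (x + y)
    (a', b')       -> Add a' b'
normalize (Let x e body) = normalize (subst x (normalize e) body)
normalize e = e

subst :: String -> Expr -> Expr -> Expr
subst x v (Var y) | x == y = v
subst x v (Add a b)        = Add (subst x v a) (subst x v b)
subst x v (Let y e b)
  | x == y    = Let y (subst x v e) b
  | otherwise = Let y (subst x v e) (subst x v b)
subst _ _ e = e

-- FNV-1a over the rendering of the normal form (stand-in for SHA-256).
semanticHash :: Expr -> Word64
semanticHash = foldl step 14695981039346656037 . show . normalize
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

main :: IO ()
main = do
  let refactored = Let "x" (Lit 9) (Add (Var "x") (Var "x"))
      original   = Lit 18  -- same meaning, different source text
      tampered   = Lit 20  -- different meaning
  print (semanticHash refactored == semanticHash original)  -- True
  print (semanticHash refactored == semanticHash tampered)  -- False
```

A refactored import therefore still passes the check, while a tampered one is rejected: the same behavior demonstrated with `./students.dhall` above.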
The worst that an attacker can do is cause our configuration to reject the import, but they cannot trick the configuration into silently accepting the wrong expression.</p><h2 id="refactoring">Refactoring</h2><p>We can use these integrity checks to do more than just secure code. We can also repurpose these checks to assert that our code refactors are safe and behavior-preserving.</p><p>For example, suppose that we change the student list to:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="co">-- Class of 2017</span><br /><br /> <span class="kw">let</span> double <span class="fu">=</span> λ(x <span class="fu">:</span> <span class="dt">Natural</span>) <span class="ot">→</span> x <span class="fu">*</span> <span class="fu">+</span><span class="dv">2</span><br /><br /><span class="kw">in</span> [ { name <span class="fu">=</span> <span class="st">"Jane Doe"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br /> , { name <span class="fu">=</span> <span class="st">"John Rivera"</span> , age <span class="fu">=</span> double <span class="fu">+</span><span class="dv">9</span> }<br /> , { name <span class="fu">=</span> <span class="st">"Alice O'Hare"</span>, age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br /> ]</code></pre></div><p>This will still pass the integrity check because the student list still evaluates to the same expected result.</p><p>We can also refactor our project layout, too. 
For example, we could modify the student list to import the <code>double</code> function from another file:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="co">-- Class of 2017</span><br /><br />[ { name <span class="fu">=</span> <span class="st">"Jane Doe"</span> , age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br />, { name <span class="fu">=</span> <span class="st">"John Rivera"</span> , age <span class="fu">=</span> <span class="fu">./</span>double<span class="fu">.</span>dhall <span class="fu">+</span><span class="dv">9</span> }<br />, { name <span class="fu">=</span> <span class="st">"Alice O'Hare"</span>, age <span class="fu">=</span> <span class="fu">+</span><span class="dv">19</span> }<br />]</code></pre></div><p>... where <code>./double.dhall</code> has the following contents:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Natural</span>) <span class="ot">→</span> x <span class="fu">*</span> <span class="fu">+</span><span class="dv">2</span></code></pre></div><p>... and the integrity check would still pass.</p><p>I originally introduced semantic integrity checks to protect against malicious code modification, then later realized that they can also be used to protect against non-malicious modifications (such as a refactor gone wrong).</p><h2 id="textual-hashes">Textual hashes</h2><p>The semantic hash provides more information than a textual hash of the import. 
For example, suppose we changed our <code>./double.dhall</code> function to triple the argument:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Natural</span>) <span class="ot">→</span> x <span class="fu">*</span> <span class="fu">+</span><span class="dv">3</span></code></pre></div><p>A textual hash of the <code>./students.dhall</code> import would not detect this change because the real change took place in the text of another file that <code>./students.dhall</code> imported. However, a semantic hash can follow these imports to detect transitive changes to dependencies.</p><p>The semantic hash is also more flexible than a textual hash because the semantic hash does not change when we make cosmetic changes like refactoring, reformatting, or commenting code.</p><h2 id="caveats">Caveats</h2><p>Dhall's semantic hashing can reject some behavior-preserving changes to functions. Dhall only attempts to detect if two functions are β-equivalent (i.e. 
the same if fully β-reduced).</p><p>For example, the following two functions are equivalent, but will not produce the same hash:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Bool</span>) <span class="ot">→</span> x</code></pre></div><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Bool</span>) <span class="ot">→</span> <span class="kw">if</span> x <span class="kw">then</span> <span class="dt">True</span> <span class="kw">else</span> <span class="dt">False</span></code></pre></div><p>Similarly, Dhall's semantic hash cannot detect that these two functions are the same:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Natural</span>) <span class="ot">→</span> x <span class="fu">*</span> <span class="fu">+</span><span class="dv">2</span></code></pre></div><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">λ(x <span class="fu">:</span> <span class="dt">Natural</span>) <span class="ot">→</span> x <span class="fu">+</span> x</code></pre></div><p>On the other hand, Dhall will (almost) never give two semantically distinct expressions the same hash. Only an astronomically improbable hash collision can cause this and at the time of this writing there is no known vulnerability in the SHA-256 hash algorithm.</p><p>Dhall will support other hash algorithms should SHA-256 ever be broken. This is why Dhall prefixes the hash with the algorithm to leave the door open for new hash algorithms.</p><h2 id="semantic-versioning">Semantic versioning</h2><p>You might wonder how semantic integrity checks compare to <a href="http://semver.org/">semantic versioning</a>. 
I like to think of semantic integrity checks and semantic versions as two special cases of the following abstract interface:</p><ul><li>a package publishes a version string for each official release</li><li>you can compare two version strings to detect a breaking change to the package</li></ul><p>Semantic versioning is one special case of that abstract interface where:</p><ul><li>the version string has a major number and minor number</li><li>a difference in major version numbers signals a breaking change</li></ul><p>Some variations on semantic versioning propose independently versioning each exported function/value/type instead of versioning the package as a whole. Also, some languages (like Elm) mechanically enforce semantic versioning by detecting API changes programmatically and forcing a major version bump if there is a breaking change.</p><p>A semantic integrity check is another special case of that abstract interface where:</p><ul><li>the version string is a SHA-256 hash</li><li>if two hashes are different then that signals a breaking change</li></ul><p>The key difference between semantic versioning and semantic integrity checks is how we define "a breaking change". Semantic version numbers (usually) treat changes to <em>types</em> as breaking changes whereas semantic integrity checks treat changes to <em>values</em> as breaking changes. (To be totally pedantic: semantic integrity checks treat changes to <em>expressions</em> as breaking changes, and in a language like Dhall everything is an expression, including types).</p><p>This does not imply that semantic integrity checks are better than semantic version numbers. Sometimes you <em>want</em> to automatically pick up small changes or improvements from your dependencies without adjusting a hash. 
In cases like those you want the expected type to be the contract with your dependency and you don't want to pin the exact value.</p><p>For example, we could "simulate" semantic versioning in Dhall by attaching a type annotation to our <code>./students.dhall</code> import like this:</p><div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell"> <span class="kw">let</span> map <span class="fu">=</span><br /> http<span class="fu">://</span>prelude<span class="fu">.</span>dhall<span class="fu">-</span>lang<span class="fu">.</span>org<span class="fu">/</span><span class="dt">List</span><span class="fu">/</span>map sha256<span class="fu">:</span>3<span class="fl">063e9</span>b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbb<br /><br /><span class="kw">in</span> <span class="kw">let</span> students <span class="fu">=</span><br /> <span class="fu">./</span>students<span class="fu">.</span>dhall <span class="fu">:</span> <span class="dt">List</span> { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> }<br /><br /><span class="kw">in</span> <span class="kw">let</span> getName <span class="fu">=</span> λ(student <span class="fu">:</span> { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> }) <span class="ot">→</span> student<span class="fu">.</span>name<br /><br /><span class="kw">in</span> { classSize <span class="fu">=</span> <span class="dt">List</span><span class="fu">/</span>length { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> } students<br /> , names <span class="fu">=</span> map { name <span class="fu">:</span> <span class="dt">Text</span>, age <span class="fu">:</span> <span class="dt">Natural</span> } <span class="dt">Text</span> getName students<br /> } </code></pre></div><p>... 
and now we can add or remove students from our imported list without breaking anything. We've used the type system as a coarser integrity check to state that certain changes to our configuration file's meaning are okay.</p><h2 id="conclusion">Conclusion</h2><p>You can think of a semantic integrity check as a "value annotation" (i.e. the term-level equivalent of a type annotation). Instead of declaring an expected type we declare an expected value summarized as a hash.</p><p>This is why the title of this post declares that "semantic integrity checks are the next generation of semantic versioning". If you think of a semantic version as a concise summary of an imported package's type, then a semantic integrity check is a concise summary of an imported package's value.</p></body></html>Sat, 04 Nov 2017 03:45:40 +0000noreply@blogger.com (Gabriel Gonzalez)Keegan McAllister: On depression, privilege, and online activismtag:blogger.com,1999:blog-1563623855220143059.post-5769838398217923874
http://mainisusuallyafunction.blogspot.com/2014/06/on-depression-privilege-and-online.html
<p><i>Update (November 2017):</i> I'm leaving this up as a snapshot of how I felt at the time. Since then a lot has changed in my life, I'm much less angry in general and I no longer give a shit what the toxic assholes think of me, which is pretty great!</p> <hr /> <p>[Content warning: depression, privilege, online activism]</p><p>This isn't a general account of my experiences with depression. Many people have written about that, and I don't have much to add. But there's one aspect that I don't hear about very often. It's something that bothers me a lot, and others have told me that it bothers them too.</p><p>The thing is, I'm not just a person with a mental illness. I'm also a well-off white guy, and I enjoy a whole set of unearned privileges from that. Every day people around the world are harassed, abused, and killed over things I never have to worry about. Even in mundane daily life, most everyone is <a href="http://whatever.scalzi.com/2012/05/15/straight-white-male-the-lowest-difficulty-setting-there-is/">playing on a higher difficulty setting</a> than I ever will.</p><p>I've thought about this a lot over the past few years, and I'm trying to understand how I can help make the world more fair and less oppressive. So I give money and I volunteer a little and I speak up when it seems useful, but mostly I listen. I listen to the experiences of people who are different from me. I try to get some understanding of how they feel and why.</p><p>How is this related to depression? Because the reality of privilege and oppression is fucking depressing. Of course it's depressing to those who are directly harmed. That's a lot of what I read about, and some of the despair transfers to me. But my profiting from the suffering of others in a way that I mostly can't change is also depressing, at least if I make an attempt not to ignore it.</p><p>And my distress over my role in systems of oppression brings its own layer of guilt. 
People are actually suffering and I feel sorry for myself because I'm dimly aware of it? But this comes from the voice that has always taunted me about depression. “How can you be sad? Your life is great. If you had real problems you wouldn't be so pathetic. You're not really sick. You're just a whiner.”</p><p>All of which is part of the disease. I need to own it and work on it every day. But it seems like every time I read an online discussion about social justice, I take a huge step backwards.</p><p>It's hard to shrug off the “men are horrible” comments when I spend so much effort trying to convince myself that I'm not horrible. When I hear people gloating about delicious white male tears, I think about all the times when I would come home from work and collapse in bed crying. Is this what they want my life to be?</p><p>I can't give myself permission to tune out, because the same people lecture constantly about my obligation to be a good ally, which mostly takes the form of “shut up and listen.” And then when I'm upset by the things they say, the response is “This isn't for you! Why are you listening?”</p><p>A local group, one that had recently invited me to hang out as a guest, retweeted a member's declaration to would-be allies: “We're not friends. Fuck you.” Can you see why it feels like they're trying to hurt me?</p><hr /><p>Let me be clear: I truly don't care if people in a room somewhere are talking about how men are the worst. I don't feel oppressed by it, and I have no desire to argue with it. But I can't handle direct exposure.</p><p>And don't tell me that I'm too stupid to understand why they say these things. I know intellectually that it's not about me. I understand the need to vent and the importance of building solidarity. None of that matters on the emotional level where these comments register like a punch to the gut. 
I <em>do</em> feel this way, even if I shouldn't and I wish I didn't.</p><p>I'm talking about mental health, triggers, and unintentionally hurtful speech. Does that sound familiar? One reason I was drawn to intersectional feminism is that it seemed to have a good set of ground rules for how to treat everyone decently. But now I feel like I'm excluded from protection. “Men are horrible” is apparently the one form of speech where intent is all that matters, and I'm a bad person if it triggers something. I've been told it's offensive that I would even try to describe my experience in those terms.</p><p>It hurts a whole lot to try and really feel someone's pain, and then realize they don't even <em>slightly</em> give a shit about me. It hurts even more when they'll bend over backwards for anyone <em>except</em> me.</p><p>Look, I get it. You argue all the time with trolls who claim that men have it just as bad as women and will shout “what about the men” as a way to disrupt any discussion. When you're engaged in meme warfare, you can't show them any human empathy. They certainly wouldn't return the favor. And if my voice sounds a little like theirs, that's just too bad for me.</p><p>I know that this article will serve as ammunition for some people with views I find disgusting. That sucks, but I'm done using political strategy as a reason to stay silent. I understand tone policing as a derailing tactic, and I understand the need to call it out. But at this point it seems there's no room for a sincere request for kindness, especially coming from someone who doesn't get much benefit of the doubt. (The Geek Feminism Wiki <a href="http://geekfeminism.wikia.com/wiki/Tone_argument?oldid=23472#Civility">basically says</a> that asking for kindness is tone policing if and only if you're a man.)</p><p>I'm not trying to silence anyone here. I'm not jumping in and derailing an existing conversation. I'm writing on my own blog, on my own schedule, about my own feelings. 
But I'm told that even this is crossing a line.</p><p>I know that I can't dictate how others feel about our fucked-up world. Does that mean I must absolutely suppress the way I feel? Even when we agree about the substance of what's wrong? I know that if I ask someone to share their life experiences, they have a right to express anger. When does expressing anger become sustained, deliberate cruelty?</p><p>“People are being oppressed and you're asking us to care about your feelings?” Yes, I am asking you to care. Just a little bit. I don't claim that my feelings should be a top priority. I hope it wouldn't come up very often. But according to the outspoken few who <a href="http://www.smbc-comics.com/?id=2939">set the tone</a>, I'm <em>never</em> allowed to bring it up. I don't deserve to ask them to be nice.</p><p>And that's why I can no longer have anything to do with this movement. It's really that simple. I guess it says something about my state of mind that I felt the need to attach 1,700 words of preemptive defenses.</p><hr /><p>The truth is, when I'm not allowed to say or even think “not all men,” part of me hears “Yes, all men, especially you.” And if I'm ever confused about whether I'm allowed to say “not all men,” there are a dozen unprompted reminders every day. Little jokes, repeated constantly to set the climate about what will and won't be tolerated.</p><p>When you treat me like one of the trolls, I start to believe that I am one. Guys who say “I support feminism but sometimes they go too far” are usually trying to excuse sexist behavior. So what do I conclude about myself when I have the same thought?</p><p>I get that “ally” is not a label you self-apply, it's a thing you do, and the label comes from others. The problem is, if a hundred people say I'm a good ally, and one person says I'm a sexist asshole, who do you think I'm going to believe?</p><p>I'm not allowed to stand up for myself, because doing so is automatically an act of oppression. 
If a woman treats me like shit, and she's being “more feminist” than me, I conclude that I deserve to be treated like shit. That is the model I've learned of a good ally.</p><p>I'm not a good ally, or even a bad one. I'm collateral damage.</p><p>If the point of all this is to give me a tiny little taste of the invalidation that others experience on a regular basis, then congratulations, it worked. You've made your point. Now that you've broken me, how can I possibly help you, when it seems like I'm part of the problem just by existing? It feels like all I can do is engage in emotional self-harm to repay the debt of how I was born.</p><p>I can't just take a break “until I feel better.” My depressive symptoms will always come and go, and some thoughts will reliably bring them back. I spent years reading about how the most important thing I can do, as a winner of the birth lottery, is to be an ally to marginalized people. And now I've realized that I'm too sick and weak to do it.</p><p>Even if I give up on being an ally, I can't avoid this subject. It affects a lot of my friends, and I feel even worse when I ask them not to talk about it around me. I don't want to silence anyone. At least I've mostly stopped using Twitter.</p><p>So this is how I feel, but I'm not sure anyone else can do anything about it. Really, most of the people I've talked to have been sympathetic. Maybe I need to learn not to let bullies get to me, even when they're bullying in service of a cause I support. They don't seem to get much pushback from the wider community, at any rate.</p><p>What gives me hope is, I recognize that my participation in the endless shouting online wasn't really useful to anyone. If I can let myself ignore all that, maybe I can recover some of my energy for other activities that actually help people.</p><p>That's all I have to say right now. 
Thank you for listening to me.</p>Sat, 04 Nov 2017 00:42:22 +0000noreply@blogger.com (keegan)Brent Yorgey: Sum of heights in a binary treehttp://byorgey.wordpress.com/?p=2073
https://byorgey.wordpress.com/2017/11/03/sum-of-heights-in-a-binary-tree/
<p><em>Executive summary: every year when teaching data structures I always forget how to analyze the cost of building a binary heap, which amounts to summing the heights of all the nodes in a full binary tree. So I’m writing down the (lovely) proof here in the hopes that I will remember it next time.</em></p>
<p>Suppose you have a full binary tree and you do an operation on every node, where the cost of the operation is proportional to the height of that node. That is, the cost for each of the <img src="https://s0.wp.com/latex.php?latex=n%2F2&bg=ffffff&fg=333333&s=0" alt="n/2" class="latex" title="n/2" /> leaves is <img src="https://s0.wp.com/latex.php?latex=0&bg=ffffff&fg=333333&s=0" alt="0" class="latex" title="0" />, for each of the <img src="https://s0.wp.com/latex.php?latex=n%2F4&bg=ffffff&fg=333333&s=0" alt="n/4" class="latex" title="n/4" /> nodes in the next level up the cost is <img src="https://s0.wp.com/latex.php?latex=1&bg=ffffff&fg=333333&s=0" alt="1" class="latex" title="1" />, and so on. We can visualize the scenario like this:</p>
<div style="text-align: center;">
<p><img src="https://byorgey.files.wordpress.com/2017/11/163d289d2af7bf4f.png?w=640" /></p>
</div>
<p>As a function of the total number of nodes <img src="https://s0.wp.com/latex.php?latex=n&bg=ffffff&fg=333333&s=0" alt="n" class="latex" title="n" />, how expensive is this? We can see that <img src="https://s0.wp.com/latex.php?latex=O%28n+%5Clg+n%29&bg=ffffff&fg=333333&s=0" alt="O(n \lg n)" class="latex" title="O(n \lg n)" /> is an upper bound, since there are <img src="https://s0.wp.com/latex.php?latex=n&bg=ffffff&fg=333333&s=0" alt="n" class="latex" title="n" /> nodes and the height of each node is at most <img src="https://s0.wp.com/latex.php?latex=%5Clg+n&bg=ffffff&fg=333333&s=0" alt="\lg n" class="latex" title="\lg n" />. But it seems like it might actually be faster than this in reality, since, intuitively, <em>most</em> of the nodes have a height which is much smaller than <img src="https://s0.wp.com/latex.php?latex=%5Clg+n&bg=ffffff&fg=333333&s=0" alt="\lg n" class="latex" title="\lg n" />.</p>
<p>(One specific motivation for this scenario is that we can build a <a href="https://en.wikipedia.org/wiki/binary%20heap">binary heap</a> from an arbitrary set of data by looping over the nodes from the bottom up and calling <code>reheapDown</code> on each; in the worst case <code>reheapDown</code> takes time proportional to the height of the node, as in this scenario. But it doesn’t matter if you don’t know about binary heaps.)</p>
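That bottom-up construction can be made concrete with a small purely functional sketch (illustrative only, and not allocation-efficient, since every swap copies the array); `buildHeap` and `siftDown` are invented names here, with `siftDown` playing the role of `reheapDown`:

```haskell
import Data.Array

-- Bottom-up heap construction: run siftDown on every internal node,
-- deepest first. The work at each node is bounded by its height, so
-- the total cost is the sum of heights analyzed in this post.
buildHeap :: Ord a => [a] -> Array Int a
buildHeap xs = foldl siftDown arr [n `div` 2, n `div` 2 - 1 .. 1]
  where
    n   = length xs
    arr = listArray (1, n) xs  -- 1-based: children of i are 2i and 2i+1

siftDown :: Ord a => Array Int a -> Int -> Array Int a
siftDown arr i
  | smallest == i = arr
  | otherwise     = siftDown (swap arr i smallest) smallest
  where
    n          = snd (bounds arr)
    kids       = filter (<= n) [2 * i, 2 * i + 1]
    smallest   = foldl pick i kids
    pick j k   = if arr ! k < arr ! j then k else j
    swap a p q = a // [(p, a ! q), (q, a ! p)]

main :: IO ()
main = print (elems (buildHeap [5, 3, 8, 1, 9, 2]))  -- [1,3,2,5,9,8]
```

In the result, every element is no smaller than its parent, i.e. the min-heap property holds at every node.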
<p>Let’s take the same tree and put a dollar at every node, for a total of <img src="https://s0.wp.com/latex.php?latex=%5C%24n&bg=ffffff&fg=333333&s=0" alt="\$n" class="latex" title="\$n" />:</p>
<div style="text-align: center;">
<p><img src="https://byorgey.files.wordpress.com/2017/11/e06819c343da6ed3.png?w=640" /></p>
</div>
<p>Now imagine sliding all the money as far up and to the right as it will go. That is, we take each dollar, and keep moving it up as long as it is a left child. As soon as we reach a node which is a right child we stop. The tree ends up looking like this:</p>
<div style="text-align: center;">
<p><img src="https://byorgey.files.wordpress.com/2017/11/e3505964e049eb59.png?w=640" /></p>
</div>
<p>Now take each pile of money and move it up one step to its parent, except the money at the root of the tree, which you can put in your pocket.</p>
<div style="text-align: center;">
<p><img src="https://byorgey.files.wordpress.com/2017/11/3571afa7a86984a0.png?w=640" /></p>
</div>
<p>And voilà! We now have exactly enough money at each node to pay for the cost of the operations, and we even have a bit left over (which we can use to buy coffee). But we started with <img src="https://s0.wp.com/latex.php?latex=%5C%24n&bg=ffffff&fg=333333&s=0" alt="\$n" class="latex" title="\$n" /> and only shuffled money around; this shows that the total cost is actually <img src="https://s0.wp.com/latex.php?latex=O%28n%29&bg=ffffff&fg=333333&s=0" alt="O(n)" class="latex" title="O(n)" />.</p>
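The accounting can also be checked numerically: in a full binary tree with k levels (so n = 2^k − 1 nodes), the sum of the heights comes out to exactly n − k, which is indeed O(n). A quick Haskell check follows; `sumHeights` is an illustrative helper, not something from the post:

```haskell
-- A full binary tree with k levels has 2^(k-1) leaves of height 0,
-- 2^(k-2) nodes of height 1, ..., and one root of height k - 1.
sumHeights :: Integer -> Integer
sumHeights k = sum [ h * 2 ^ (k - 1 - h) | h <- [0 .. k - 1] ]

main :: IO ()
main = print (and [ sumHeights k == nodes k - k | k <- [1 .. 20] ])  -- True
  where nodes k = 2 ^ k - 1
```

So the "bit left over" in the money argument is exactly k dollars: we start with n dollars and the operations cost n − k in total.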
<p>Exercise for the reader: what does this have to do with the number of bit flips needed to count from <img src="https://s0.wp.com/latex.php?latex=1&bg=ffffff&fg=333333&s=0" alt="1" class="latex" title="1" /> to <img src="https://s0.wp.com/latex.php?latex=n&bg=ffffff&fg=333333&s=0" alt="n" class="latex" title="n" /> with a binary counter?</p>Fri, 03 Nov 2017 15:06:40 +0000Robert Harper: PFPL Commentaryhttp://existentialtype.wordpress.com/?p=1403
https://existentialtype.wordpress.com/2016/06/03/pfpl-commentary/
<p>I am building a <a href="http://www.cs.cmu.edu/~rwh/pfpl.html" target="_blank" rel="noopener">web page</a> devoted to the 2nd edition of <em><a href="http://www.cambridge.org/us/academic/subjects/computer-science/programming-languages-and-applied-logic/practical-foundations-programming-languages-2nd-edition?format=HB" target="_blank" rel="noopener">Practical Foundations for Programming Languages</a></em>, recently published by Cambridge University Press. Besides an errata, the web site features a <a href="http://www.cs.cmu.edu/~rwh/pfpl/commentary.pdf" target="_blank" rel="noopener">commentary</a> on the text explaining major design decisions and suggesting alternatives. I also plan to include additional exercises and to make sample solutions available to faculty teaching from the book.</p>
<p>The purpose of the commentary is to provide the “back story” for the development, which is often only hinted at, or is written between the lines, in <em>PFPL</em> itself. To emphasize enduring principles over passing fads, I have refrained from discussing particular languages in the book. But this makes it difficult for many readers to see the relevance. One purpose of the commentary is to clarify these connections by explaining <em>why</em> I said what I said.</p>
<p>As a starting point, I explain why I ignore the familiar concept of a “paradigm” in my account of languages. The idea seems to have been inspired by Kuhn’s (in)famous book <em>The Structure of Scientific Revolutions</em>, and was perhaps a useful device at one time. But by now the idea of a paradigm is just too vague to be useful, and there are many better ways to explain and systematize language structure. And so I have avoided it.</p>
<p>I plan for the commentary to be a living document that I will revise and expand as the need arises. I hope for it to provide some useful background for readers in general, and teachers in particular. I wish for the standard undergraduate PL course to evolve from a superficial taxonomy of the weird animals in the language zoo to a systematic study of the general theory of computation. Perhaps <em>PFPL</em> can contribute to effecting that change.</p>
<p><em>Update</em>: As I had hoped, I have been making <em>many</em> new additions to the commentary, exposing alternatives, explaining decisions, and expanding on topics in <em>PFPL</em>. There are also a few errors noted in the errata; so far, nothing major has come up. (The sections on safety are safely sound.)</p><br />Filed under: <a href="https://existentialtype.wordpress.com/category/research/">Research</a>, <a href="https://existentialtype.wordpress.com/category/teaching-2/">Teaching</a> Thu, 02 Nov 2017 19:45:35 +0000Robert Harper: It Is What It Is (And Nothing Else)http://existentialtype.wordpress.com/?p=1133
https://existentialtype.wordpress.com/2016/02/22/it-is-what-it-is-and-nothing-else/
<p>A recent discussion of introductory computer science education led to the topic of teaching recursion. I was surprised to learn that students are being taught that recursion requires understanding something called a “stack” that is nowhere in evidence in their code. Few, if any, students master the concept, which is usually “covered” only briefly. Worst, they are encouraged to believe that recursion is a mysterious bit of esoterica that is best ignored.</p>
<p>And thus is lost one of the most important and beautiful concepts in computing.</p>
<p>The discussion then moved on to the implementation of recursion in certain inexplicably popular languages for teaching programming. As it turns out, the compilers mis-implement recursion, causing unwarranted space usage in common cases. Recursion is dismissed as problematic and unimportant, and the compiler error is elevated to a “design principle” — to be serpentine is to do it wrong.</p>
<p>And thus is lost one of the most important and beautiful concepts in computing.</p>
<p>And yet, for all the stack-based resistance to the concept, <em>recursion</em><em> has nothing to do with a stack</em>. Teaching recursion does not need any mumbo-jumbo about “stacks”. Implementing recursion does not require a “stack”. The idea that the two concepts are related is simply mistaken.</p>
<p>What, then, is recursion? It is nothing more than <em>self-reference</em>, the ability to name a computation for use within the computation itself. <em>Recursion is what it is</em>, and nothing more. No stacks, no tail calls, no proper or improper forms, no optimizations, just self-reference pure and simple. Recursion is not tied to “procedures” or “functions” or “methods”; one can have self-referential values of all types.</p>
<p>Somehow these very simple facts, which date back to the early 1930s, have been replaced by damaging myths that impede teaching and using recursion in programs. It is both a conceptual and a practical loss. For example, the most effective methods for expressing <a href="https://existentialtype.wordpress.com/2011/03/17/parallelism-is-not-concurrency/" target="_blank" rel="noopener">parallelism</a> in programs rely heavily on recursive self-reference; much would be lost without it. And the allegation that “real programmers don’t use recursion” is beyond absurd: the very concept of a digital computer is grounded in recursive self-reference (the cross-connection of gates to form a latch). (Which, needless to say, does not involve a stack.) Not only do real programmers use recursion, there could not even be programmers were it not for recursion.</p>
<p>I have no explanation for why this terrible misconception persists. But I do know that when it comes to programming languages, attitude trumps reality every time. Facts? We don’t need no stinking facts around here, amigo. You must be some kind of mathematician.</p>
<p>If all the textbooks are wrong, what is right? How <em>should</em> one explain recursion? It’s simple. If you want to refer to yourself, you need to give yourself a name. “I” will do, but so will any other name, by the miracle of α-conversion. A computation is given a name using a <em>fixed point</em> (not <em>fixpoint</em>, dammit) operator: <em>fix x is e</em> stands for the expression <em>e</em> named <em>x</em> for use within <em>e</em>. Using it, the textbook example of the factorial function is written thus:</p>
<pre style="padding-left: 30px;">fix f is fun n : nat in case n {zero => 1 | succ(n') => n * f n'}.</pre>
<p>Let us call this whole expression <em>fact,</em> for convenience. If we wish to evaluate it, perhaps because we wish to apply it to an argument, its value is</p>
<pre style="padding-left: 30px;">fun n : nat in case n {zero => 1 | succ(n') => n * <em>fact</em> n'}.</pre>
<p>The recursion has been <em>unrolled</em> one step ahead of execution. If we reach <em>fact</em> again, as we will for a positive argument, <em>fact</em> is evaluated again, in the same way, and the computation continues. <em>There are no stacks involved in this explanation</em>.</p>
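<p>Haskell's <code>Data.Function.fix</code> gives a direct rendering of the <em>fix x is e</em> operator; evaluation unrolls the self-reference one step at a time, just as described (a minimal sketch, using the factorial from the text):</p>

```haskell
import Data.Function (fix)

-- fix f = f (fix f): naming a computation for use within itself.
-- Each unrolling substitutes the whole function back in for f.
fact :: Integer -> Integer
fact = fix (\f n -> if n == 0 then 1 else n * f (n - 1))

main :: IO ()
main = print (fact 5)  -- 120
```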
<p>Nor is there a stack involved in the implementation of fixed points. It is only necessary to make sure that the named computation does indeed name itself. This can be achieved by a number of means, including circular data structures (non-well-founded abstract syntax), but the most elegant method is by <em>self-application</em>. Simply arrange that a self-referential computation has an implicit argument with which it refers to itself. Any use of the computation unrolls the self-reference, ensuring that the invariant is maintained. No storage allocation is required.</p>
<p>Consequently, a self-referential function such as</p>
<pre style="padding-left: 30px;">fix f is fun (n : nat, m:nat) in case n {zero => m | succ(n') => f (n',n*m)}</pre>
<p>executes without needing any asymptotically significant space. It is quite literally a loop, and <em>no special arrangement</em> is required to make sure that this is the case. All that is required is to implement recursion properly (as self-reference), and you’re done. <em>There is no such thing as tail-call optimization. </em>It’s not a matter of optimization, but of proper implementation. Calling it an optimization suggests it is optional, or unnecessary, or provided only as a favor, when it is more accurately described as a matter of getting it right.</p>
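<p>Transcribed into Haskell (a sketch; <code>factAcc</code> is a name introduced here), the accumulator-passing version has its self-call in tail position, and a proper implementation runs it in constant stack space:</p>

```haskell
-- Accumulator-passing factorial: the recursive call is the last thing
-- the function does, so no pending computation accumulates.
factAcc :: Integer -> Integer -> Integer
factAcc 0 m = m
factAcc n m = factAcc (n - 1) (n * m)

main :: IO ()
main = print (factAcc 10 1)  -- 3628800
```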
<p>So what, then, is the source of the confusion? The problem seems to be a too-close association between compound expressions and recursive functions or procedures. Consider the classic definition of factorial given earlier. The body of the definition involves the expression</p>
<pre style="padding-left: 30px;">n * <em>fact</em> n'</pre>
<p>where there is a pending multiplication to be accounted for. Once the recursive call (to itself) completes, the multiplication can be carried out, and it is necessary to keep track of this pending obligation. <em>But this phenomenon has nothing whatsoever to do with recursion.</em> If you write</p>
<pre style="padding-left: 30px;">n * <em>square </em>n'</pre>
<p>then it is equally necessary to record where the external call is to return its value. In typical accounts of recursion, the two issues get confused, a regrettable tragedy of error.</p>
<p>Really, the need for a stack arises the moment one introduces compound expressions. This can be explained in several ways, none of which need pictures or diagrams or any discussion about frames or pointers or any extra-linguistic concepts whatsoever. The best way, in my opinion, is to use Plotkin’s structural operational semantics, as described in my <em>Practical Foundations for Programming Languages (Second Edition)</em> on Cambridge University Press.</p>
<p>There is no reason, nor any possibility, to avoid recursion in programming. But folk wisdom would have it otherwise. That’s just the trouble with folk wisdom, everyone knows it’s true, even when it’s not.</p>
<p><em>Update</em>: Dan Piponi and Andreas Rossberg called attention to a pertinent point regarding stacks and recursion. The conventional notion of a run-time stack records two distinct things, the <em>control state</em> of the program (such as subroutine return addresses, or, more abstractly, pending computations, or continuations), and the <em>data state</em> of the program (a term I just made up because I don’t know a better one, for managing multiple simultaneous activations of a given procedure or function). Fortran (back in the day) didn’t permit multiple activations, meaning that at most one instance of a procedure can be in play at a given time. One consequence is that α-equivalence can be neglected: the arguments of a procedure can be placed in a statically determined spot for the call. As a member of the Algol-60 design committee Dijkstra argued, successfully, for admitting multiple procedure activations (and hence, with a little extra arrangement, recursive/self-referential procedures). Doing so requires that α-equivalence be implemented properly; two activations of the same procedure cannot share the same argument locations. The data stack implements α-equivalence using de Bruijn indices (stack slots); arguments are passed on the data stack using activation records in the now-classic manner invented by Dijkstra for the purpose. It is not self-reference that gives rise to the need for a stack, but rather re-entrancy of procedures, which can arise in several ways, not just recursion. Moreover, recursion does not always require re-entrancy—the so-called tail call optimization is just the observation that certain recursive procedures are not, in fact, re-entrant. 
(Every looping construct illustrates this principle, albeit on an <em>ad hoc</em> basis, rather than as a general principle.)</p><br />Filed under: <a href="https://existentialtype.wordpress.com/category/programming/">Programming</a>, <a href="https://existentialtype.wordpress.com/category/teaching-2/">Teaching</a> Thu, 02 Nov 2017
18:24:32 +0000Tweag I/O: The Exodus to Streamgard,<br/> an epic poemhttp://www.tweag.io/posts/2017-11-01-streaming-and-foldl.html
http://www.tweag.io/posts/2017-11-01-streaming-and-foldl.html
<div>Yves Parès</div><p><em><span class="dropcap">I</span><span style="font-variant: small-caps;">f</span>
Haskell was a god, often would he be depicted with the ravens Modularity and
Abstraction flying above him, hovering over the world and reporting to him every
detail of our whereabouts. Haskell would sit on the Throne of Purity and look
upon the world with <span class="tooltip" title="Yes, of course Haskell would be one-eyed. And he'd have a list of like 200 awe-inspiring nicknames, like 'The Monadbringer' or 'The Father of all things pure', but that's another story.">an eye</span> full of wisdom. And in his hand, the mighty Haskell would
wield the Spear of Lazy Lists, which is said to have the power to tackle each
and every problem the world might have to face. And to honour him, we would
code and abstract everything with lazy lists. For millennia would lists be used
to map, filter, separate, merge, group, <span class="tooltip" title="Full cosmogony in the religion of Haskell is left as an exercise to the reader.">and
so forth</span>.</em></p>
<p><em><span class="dropcap">B</span><span style="font-variant: small-caps;">ut</span>, one day, the <span class="tooltip" title="Yes, all that buildup for a lousy pun">
Real-<a href="https://en.wikipedia.org/wiki/J%C3%B6rmungandr">World Serpent</a></span>, son of the wicked <span class="tooltip" title="Also seen written as 'Folður'">Foldr</span>, would come. And the Real-World Serpent
carries an eternal hatred towards lazy lists. Oh, that dreaded Serpent, that
will throw everything it can muster to prevent us from staying within the warm
comfort of abstraction and laziness. The Serpent will assemble its minions,
<a href="http://www.tweag.io/posts/2017-07-27-streaming-programs.html"><em>Early-close</em> and <em>Strictness of
effects</em></a>, and
unleash its wrath upon our world. Foldl, son of Haskell and brother of Foldr,
would lead humanity to its last bastion, Streamgard, and organize the final
fight...</em></p>
<p>So, long story short,
<a href="http://hackage.haskell.org/package/streaming"><code>streaming</code></a> is a library that
allows you to leverage the insights you have gained while manipulating lazy
lists in Haskell to handle effectful streams of data. We already talked about
<code>streaming</code> on this blog, with
<a href="http://www.tweag.io/posts/2017-07-27-streaming-programs.html">this post</a>
discussing the IO part and
<a href="http://www.tweag.io/posts/2017-10-05-streaming2.html">this one</a> comparing it to
<a href="http://hackage.haskell.org/package/pipes">pipes</a> and
<a href="http://hackage.haskell.org/package/conduit">conduit</a>. Here, we will be using
<code>streaming</code> for highly efficient data processing and filtering. To this end, we will use
it in conjunction with another library,
<a href="http://hackage.haskell.org/package/foldl"><code>foldl</code></a>, which gives us an
<code>Applicative</code> interface to the usual list functions. In this blog post we will
apply them to the task of computing some statistics about a distribution of
data. We want to be able to:</p>
<ul>
<li>process the input data stream <em>only once (aka in one pass)</em>,</li>
<li>never repeat the effects that were used to produce that data stream,</li>
<li>maintain the possibility to use the input stream as if it were a list, for
instance by splitting it into two subparts, sending each subpart to be
processed by a specific function.</li>
</ul>
<p>So let's imagine that the statistics I want to compute on my input data
distributions take the shape of a simple summary. This is what I want to obtain
in the end:</p>
<pre><code class="language-haskell">data Summary v a = Summary
{ summaryLength :: Int
, summaryMins :: [a]
, summaryMaxes :: [a]
, summaryMean :: v
, summaryStdDev :: v
}
deriving (Show)
</code></pre>
<p>Nothing too fancy here, I just want to be able to compute the length, the <code>n</code>
smallest elements, the <code>n'</code> biggest elements, the mean and the standard deviation
of my distribution. We distinguish the types <code>a</code> and <code>v</code> here because our input
distribution does not have to be numerical, as long as we have a projection <code>a -> v</code> available. This way, we can compute a summary of a stream of <code>(Double, String)</code> tuples, for instance, if the projection is just <code>fst</code>.</p>
<p>So let's have a little reminder of our conditions. We want to be able to read
the input data only once. But, we still want modularity and reusability. We do
not want to have to recode our <code>Summary</code>-computing function every time we want
to add a new field, and we would like to reuse already existing functions
computing these statistics. And this is where the <code>foldl</code> package comes in.</p>
<p>This package defines a type <code>Fold</code> as follows:</p>
<pre><code class="language-haskell">data Fold a b = forall acc. Fold (acc -> a -> acc) acc (acc -> b)
</code></pre>
<p>You might recognize here the typical arguments of the classical <code>foldl</code> function
of the <code>Prelude</code>: <code>a</code> is the type of each element of the input stream we
consume, the first field <code>(acc -> a -> acc)</code> is an accumulation function and the
second field <code>acc</code> is the initial value of the accumulator. The new component
is the <code>b</code> type parameter and the last field <code>(acc -> b)</code>. This one is called
<em>extract</em>. It is used to extract the final value out of the accumulator. This is
necessary so that <code>Fold a</code> can be a <code>Functor</code> and therefore an
<code>Applicative</code>. See the
<a href="http://www.haskellforall.com/2013/08/composable-streaming-folds.html">original blog post</a>
and <a href="https://www.youtube.com/watch?v=6a5Ti0r8Q2s">this talk</a> by Gabriel Gonzalez
for more detail, though be aware that <code>Fold</code> had a different shape back then.</p>
<p>One of the central ideas of the <code>foldl</code> library is that <code>Fold</code> implements the
<code>Applicative</code> type class:</p>
<pre><code class="language-haskell">instance Applicative (Fold a)
</code></pre>
<p>Crucially, this instance combines two <code>Fold</code>s into a guaranteed one-pass
traversal of the data. Therefore we can safely decompose the computation of a
<code>Summary</code> as follows:</p>
<pre><code class="language-haskell">import qualified Control.Foldl as L
import Data.Function (on)
summarizeBy :: (Floating v, Ord v)
=> (a -> v) -> Int -> Int -> L.Fold a (Summary v a)
summarizeBy f nMins nMaxes = Summary
<$> L.length
<*> collect ((>=) `on` f) nMins
<*> collect ((<=) `on` f) nMaxes
<*> L.premap f L.mean
<*> L.premap f L.std
</code></pre>
<p>What's happening here? We are using a few of the functions already present in
the <code>foldl</code> package and a new one, so let's delve into it a bit. The function
<code>summarizeBy</code> takes a projection <code>f</code>, which we talked about earlier, the number
of smallest elements we want to collect and the number of biggest elements. Then
our five statistics are computed:</p>
<ul>
<li><code>L.length :: L.Fold a Int</code> gives us the number of elements in the input.</li>
<li><code>collect</code>, which we will define a bit later, accumulates either the mins or
the maxes given a comparison function.</li>
<li><code>L.mean</code> gives us the average. We use <code>L.premap f</code> to turn it into a fold that
will work on our projection <code>f</code>.</li>
<li><code>L.std</code> gives us the standard deviation.</li>
</ul>
<p>The combination of the above gives us a <code>Fold a (Summary v a)</code>, something that
will consume a stream of <code>a</code>'s and output a summary. At this point, nothing is
consumed, we have only composed folds together, and a <code>Fold</code> is agnostic of the
exact nature of the input. Running it on any <code>Foldable</code> datatype for instance is
just a matter of calling:</p>
<pre><code class="language-haskell">L.fold (summarizeBy id 3 3) [1..100]
</code></pre>
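<p>To see why the <code>Applicative</code> composition is guaranteed to traverse the input once, here is a hand-rolled miniature of the same idea (the names <code>fold</code>, <code>sumF</code>, <code>lengthF</code> and <code>meanF</code> are introduced here for illustration and are not the <code>foldl</code> package's API): combining two folds pairs their accumulators, so both step functions run on every element of a single pass.</p>

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A miniature of the Fold type from the text.
data Fold a b = forall acc. Fold (acc -> a -> acc) acc (acc -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin extract) = Fold step begin (f . extract)

instance Applicative (Fold a) where
  pure b = Fold const () (const b)
  Fold stepF beginF extractF <*> Fold stepX beginX extractX =
    -- Pair the accumulators: one traversal feeds both folds.
    Fold (\(accF, accX) a -> (stepF accF a, stepX accX a))
         (beginF, beginX)
         (\(accF, accX) -> extractF accF (extractX accX))

fold :: Fold a b -> [a] -> b
fold (Fold step begin extract) = extract . foldl step begin

sumF :: Num a => Fold a a
sumF = Fold (+) 0 id

lengthF :: Fold a Int
lengthF = Fold (\n _ -> n + 1) 0 id

-- Mean as a one-pass composition of two folds.
meanF :: Fold Double Double
meanF = (/) <$> sumF <*> (fromIntegral <$> lengthF)

main :: IO ()
main = print (fold meanF [1, 2, 3, 4])  -- 2.5
```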
<p>The only function <span class="tooltip" title="The foldl package provides 'minimum' and 'maximum', but here we want more than that.">not provided by the <code>foldl</code> package</span> is the <code>collect</code> function. Defining it as a brand new
<code>Fold</code> is simple:</p>
<pre><code class="language-haskell">import Data.Sequence as Seq
collect :: (a -> a -> Bool) -> Int -> L.Fold a [a]
collect skipPred n = L.Fold insertPop Seq.empty (L.fold L.list)
where
insertPop acc x
| Seq.length acc < n = insert x acc
| otherwise = pop (insert x acc)
insert x s = let (before, after) = Seq.spanl (skipPred x) s
in before <> Seq.singleton x <> after
pop s = case viewr s of
s' :> _ -> s'
_ -> s
</code></pre>
<p>Here we manually defined a new <code>Fold</code> from the three elements we mentioned
earlier: an accumulation function (<code>insertPop</code>), an initial accumulator value
(<code>Seq.empty</code>) and an <em>extract</em> function (<code>(L.fold L.list)</code>, which also uses a
<code>Fold</code> to turn the final sequence into a plain list).</p>
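<p>For intuition, the same insert-then-trim behavior can be written as a pure-list reference implementation (a sketch; <code>collectList</code> is a name introduced here, and the <code>Fold</code> above does the same work in a single accumulating pass):</p>

```haskell
-- Keep at most n elements seen so far, ordered by a "skip past" predicate:
-- skipPred x e says the new element x should be placed after e.
collectList :: (a -> a -> Bool) -> Int -> [a] -> [a]
collectList skipPred n = foldl step []
  where
    step acc x = take n (insertOrdered x acc)
    insertOrdered x s =
      let (before, after) = span (skipPred x) s
      in  before ++ [x] ++ after

main :: IO ()
main = do
  print (collectList (>=) 3 [5, 1, 4, 2, 8])  -- three smallest: [1,2,4]
  print (collectList (<=) 3 [5, 1, 4, 2, 8])  -- three biggest:  [8,5,4]
```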
<p>Now, the astute reader will notice we left <code>streaming</code> aside. Let's get back to
it. Let's use as an input the classic
<a href="https://github.com/caesar0301/awesome-public-datasets/blob/master/Datasets/titanic.csv.zip">Titanic dataset</a>:</p>
<pre><code class="language-csv">PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
...
</code></pre>
<p>We want to get two different summaries for the fares: one for the passengers
that survived and one for those who did not. First, let's load the CSV into a
<code>stream</code> by using the
<a href="http://hackage.haskell.org/package/streaming-cassava"><code>streaming-cassava</code></a> and
<a href="https://hackage.haskell.org/package/streaming-bytestring"><code>streaming-bytestring</code></a>
packages:</p>
<pre><code class="language-haskell">{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (mzero)
import qualified Data.ByteString.Streaming as BS
import Streaming
import Streaming.Cassava
data Passenger = Passenger { name :: !String, fare :: !Double, survived :: !Bool }
deriving (Show)
instance FromNamedRecord Passenger where
parseNamedRecord m =
Passenger <$> m .: "Name" <*> m .: "Fare" <*> (toBool =<< (m .: "Survived"))
where toBool 0 = return False
toBool 1 = return True
toBool _ = mzero
streamCsv :: (MonadResource m) => Stream (Of Passenger) m ()
streamCsv = decodeByName (BS.readFile ".../titanic.csv")
</code></pre>
<p>Nothing too fancy here, just a bit of required boilerplate to be able to read
<code>Passenger</code>s from the CSV file. <code>MonadResource</code> is necessary to track the files
opened by our program. The type <code>Stream (Of Passenger) m ()</code> means that we will
be manipulating a stream whose elements are <code>Passenger</code>s, that will run some
effects in a monad <code>m</code> and return no result in the end.</p>
<p>Now, let's split that input into two different substreams:</p>
<pre><code class="language-haskell">import qualified Streaming.Prelude as S
aliveDead :: Stream (Of Passenger) (Stream (Of Passenger) m) ()
aliveDead = S.partition survived streamCsv
</code></pre>
<p>Let's look at the type of <code>aliveDead</code>: it is a <code>Stream</code> over another
<code>Stream</code>. <code>Stream (Of a)</code> is actually a monad transformer, the way the
partitioning happens is by creating two layers: one for the live passengers and
one for the dead ones. It's not exactly a tuple of two streams (as it would be
with <code>Data.List.partition</code>), but it has the same advantages: each layer can be
processed by different functions which don't have to know where the stream they
process lies in the monad stack. Therefore, each one of these functions can be
expressed as:</p>
<pre><code class="language-haskell">summarizePassengers
:: (Monad m) => Stream (Of Passenger) m a -> m (Of (Summary Double Passenger) a)
summarizePassengers = L.purely S.fold (summarizeBy fare 3 3)
</code></pre>
<p>where <code>m</code> can be any monad. This can be the bottom <code>MonadResource</code> or another
<code>Stream</code>; <code>summarizePassengers</code> does not mind, and does not have to! <code>Of</code> behaves
like a tuple, so it simply means that we return both the newly computed
<code>Summary</code> and an <code>a</code> (<code>a</code> may just be <code>()</code>, but here we have to be a little more
general). <code>S.fold</code> is the basic folding function for streams. <code>L.purely fn f</code>
"unpacks" a <code>Fold</code> <code>f</code> and calls a folding function <code>fn</code>. So now, getting our
summaries is just a matter of</p>
<pre><code class="language-haskell">runAll = runResourceT $ do
(summaryAlive :> summaryDead :> ()) <-
summarizePassengers $ summarizePassengers aliveDead
...
</code></pre>
<p>So in the end, we split the input file into two substreams and computed various
statistics twice, and despite all this, <code>streaming</code> and <code>foldl</code> guarantee that
the input will be read <em>only once</em>, in bounded memory.</p>
<p>These techniques are currently being applied by Tweag I/O in the context of a
project with <a href="http://www.novadiscovery.com">Novadiscovery</a>. Novadiscovery is a
consulting company for <em>in silico</em> clinical trials, namely simulation of virtual
patients through biomodeling. Parts of this blog post are actual code from the
tools we develop with them.</p>Wed, 01 Nov 2017 00:00:00 +0000Douglas M. Auclair (geophf): October 2017 1HaskellADay problems and solutionstag:blogger.com,1999:blog-4650294074444534066.post-8329363547735877793
http://logicaltypes.blogspot.com/2017/10/october-2017-1haskelladay-problems-and.html
<ul style="color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 1.4; margin: 0.5em 0px; padding: 0px 2.5em;"><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 31st, 2017: Tuesday's #haskell problem has you <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D31/Exercise.hs">save off the special characters</a> from yesterday's exercise to a properties file. Today's #haskell solution: <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D31/Solution.hs">properties file created. We also output the special characters in context</a>, which is nice. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 30th, 2017: Monday's #haskell problem we <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D30/Exercise.hs">contextually find special characters</a> in documents. Whoa! Today's #haskell solution finds that there are <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D30/Solution.hs">a lot of special characters</a> in these documents! </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 27th, 2017: 2017-10-27, a date where the <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D27/Exercise.hs">month/day is an anagram of the year</a>. How many days from today are anagrams of 2017? What are they? 
Today's #haskell solution: <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D27/Solution.hs">and we have a winner!</a> A bunch of winners, actually for date anagrams in 2017. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 26th, 2017: Today's #haskell problem does what Google calls <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D26/Exercise.hs">artificial-artificial intelligence analyses</a> on NYT article archives. Today's #haskell solution: <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D26/Solution.hs">from one spreadsheet comes many</a>! Nietzsche would be so proud of me rn. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 25th, 2017: Today's #haskell problem, instead of counting articles by topic, we <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D25/Exercise.hs">divide articles into browsable/graphable topics</a>. Today's #haskell solution we <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D25/Solution.hs">graph a slice of our NYT article archive</a>. 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://3.bp.blogspot.com/-hdxYSR0UYyA/WfIJzaCkUJI/AAAAAAAAB8w/dmYGzplmXq4hD9yj-cYGW05h2b_dDRAXQCLcBGAs/s1600/hurricanes-graph.png"><img src="https://3.bp.blogspot.com/-hdxYSR0UYyA/WfIJzaCkUJI/AAAAAAAAB8w/dmYGzplmXq4hD9yj-cYGW05h2b_dDRAXQCLcBGAs/s320/hurricanes-graph.png" style="" height="207" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 24th, 2017: Tuesday's #haskell exercise we read back in the articles stored as JSON then <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D24/Exercise.hs">partition them by subcategory</a>. Okay! <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D24/Solution.hs">Partitioned articles</a> for today's #haskell solution. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 23th, 2017: Monday's #haskell problem we start to analyze our NYT article archive, <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D23/Exercise.hs">selecting a topic to dissect</a>. 
In the #haskell solution <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D23/Solution.hs">we read articles from our data store, then we save them out to file as JSON.</a></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 20th, 2017: Today's #haskell problem: NYT article archive through a different lens, as <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D20/Exercise.hs">nodes and relations in a graph database</a>. Today's #haskell solution: we wanted #graph <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D20/Solution.hs">we got graph</a>. <div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-aRuo9On_NhI/WeprhUSZNeI/AAAAAAAAB8Y/7nowigMxM8wGtgkyZDgaSChhd0mRnJxMQCLcBGAs/s1600/graph-3-related-to-tennis.png"><img src="https://1.bp.blogspot.com/-aRuo9On_NhI/WeprhUSZNeI/AAAAAAAAB8Y/7nowigMxM8wGtgkyZDgaSChhd0mRnJxMQCLcBGAs/s320/graph-3-related-to-tennis.png" style="" height="223" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 19th, 2017: Customer: "Ooh! We love the charts you're making for us, but..." TIL that <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D19/Exercise.hs">customers have big 'but's</a>. 
Today's #haskell solution: And now we can <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D19/Solution.hs">read .gitignore-style configuration files</a>.</li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 18th, 2017: Today's #haskell problem we <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D18/Exercise.hs">archive the topics (and article topicality)</a> of the NYT article set. We group data; <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D18/Solution.hs">we regroup data</a>. Ah! The life of a Data Scientist! Today's #haskell solution via @d3js_org #dataviz <div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-x3wqNm2gkEE/WejEpU89sLI/AAAAAAAAB8A/Yeqvma5L2P4ypxJeOac6bEizdJq4b63YwCLcBGAs/s1600/stacked-chart.png"><img src="https://1.bp.blogspot.com/-x3wqNm2gkEE/WejEpU89sLI/AAAAAAAAB8A/Yeqvma5L2P4ypxJeOac6bEizdJq4b63YwCLcBGAs/s320/stacked-chart.png" style="" height="162" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 17th, 2017: Today's #haskell problem: we build an app that <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D17/Exercise.hs">queries the database and generates circle reports</a>. 
A bit of database work then a bit of JSON to get us our <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D17/Solution.hs">charting tool</a> for today's #haskell solution.<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://2.bp.blogspot.com/-Z5eRzvl8sCs/WeY6lesDaHI/AAAAAAAAB7U/IV9_6-abw3kwgFj5a3U1su_hojSSa0RHACLcBGAs/s1600/2nd-half-2017.png"><img src="https://2.bp.blogspot.com/-Z5eRzvl8sCs/WeY6lesDaHI/AAAAAAAAB7U/IV9_6-abw3kwgFj5a3U1su_hojSSa0RHACLcBGAs/s320/2nd-half-2017.png" style="" height="318" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 16th, 2017: Thanks to @ahnqir we're looking at <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D16/Exercise.hs">cities and skyscrapers</a> for today's #haskell problem. <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D16/Solution.hs">Ooh! Bar chart!</a> Today's #haskell solution exclaims (proclaims? declaims?) "Look at Hong Kong with its highrises!" 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-kEjx0ZDghSs/WeTX2teu04I/AAAAAAAAB64/YB2iohwk69Y7ldQc_593j0TGqHhVJuNuwCLcBGAs/s1600/nyc.jpg"><img src="https://1.bp.blogspot.com/-kEjx0ZDghSs/WeTX2teu04I/AAAAAAAAB64/YB2iohwk69Y7ldQc_593j0TGqHhVJuNuwCLcBGAs/s320/nyc.jpg" style="" height="180" border="0" width="320" /></a><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-e4HBYkd_EZY/WeZ04eLNAmI/AAAAAAAAB7o/-kSxDYJaJnwTHGMft5V_eicPB75G9w1cwCLcBGAs/s1600/bar-chart.png"><img src="https://1.bp.blogspot.com/-e4HBYkd_EZY/WeZ04eLNAmI/AAAAAAAAB7o/-kSxDYJaJnwTHGMft5V_eicPB75G9w1cwCLcBGAs/s320/bar-chart.png" style="" height="170" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 13th, 2017: Yesterday we built a little app (scanner), today we build something a little bigger: a <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D13/Exercise.hs">#haskell #ETL application</a>! Today we pull together <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D13/Solution.hs">NYT article parsing and database functions</a> to create a #haskell ETL app! YAY! </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 12th, 2017: Thursday's #haskell exercise is to <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D12/Exercise.hs">build an app named scanner</a>. I, for one, welcome our scanner overlords. 
Today's #haskell solution is the <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D12/Solution.hs">little scanner app that could</a>! <div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://3.bp.blogspot.com/-0b-ZZdMcf-g/WeBJCNvLOVI/AAAAAAAAB6g/kcY8DfiTfNA4OOA0rCqxE6ybm6iykER1QCLcBGAs/s1600/Scanners.jpg"><img src="https://3.bp.blogspot.com/-0b-ZZdMcf-g/WeBJCNvLOVI/AAAAAAAAB6g/kcY8DfiTfNA4OOA0rCqxE6ybm6iykER1QCLcBGAs/s320/Scanners.jpg" style="" height="320" border="0" width="213" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 11th, 2017: Wednesday's #haskell problem: TIL that <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D11/Exercise.hs">not all data in production is pristine</a>. SHOCKER! So, okay, we know which document we are in when the <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D11/Solution.hs">less-than-pristine data fandangos on our perfect parser</a>. Good. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 10th, 2017:<br />Boss: "That's a good chart, but <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D10/Exercise.hs">can I have it in a spreadsheet?</a>"<br />me: "But ..."<br />Boss: "Now."<br /><br />Today's #haskell solution:<br /><br />me: Ya wanna spreadsheet, boss? HERE'S YER SPREADSHEET! YA HAPPY?<br />Boss: <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D10/Solution.hs">um, ... 'yes'</a>? 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://2.bp.blogspot.com/-IoTETsjBE3o/Wd1BP-Vr1kI/AAAAAAAAB58/dX8PwV7Z5qU0HczGOQv52OJPl3iQmhaLwCLcBGAs/s1600/le-spread-sheetz.png"><img src="https://2.bp.blogspot.com/-IoTETsjBE3o/Wd1BP-Vr1kI/AAAAAAAAB58/dX8PwV7Z5qU0HczGOQv52OJPl3iQmhaLwCLcBGAs/s320/le-spread-sheetz.png" style="" height="262" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 9th, 2017: Today's #haskell problem asks: <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D09/Exercise.hs">can we chart just a few of the topics</a> of the week archive of the NYT? Sure we can! Today's #haskell solution provides the <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D09/Solution.hs">top 5 topics then all topics with 10 or more articles</a>. 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-xGgFalgUoTY/WdwZQbNVUqI/AAAAAAAAB5c/e2x2LWUKoYEpRDGbMvH7qNWpbI69ATGLACLcBGAs/s1600/top5-topics.png"><img src="https://1.bp.blogspot.com/-xGgFalgUoTY/WdwZQbNVUqI/AAAAAAAAB5c/e2x2LWUKoYEpRDGbMvH7qNWpbI69ATGLACLcBGAs/s320/top5-topics.png" style="" height="311" border="0" width="320" /></a></div><br /><div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://4.bp.blogspot.com/-bLxLzsgKXSA/WdwZSqEFMRI/AAAAAAAAB5g/Pm__TYwBZf8mhFFKQyHP_P1SSs9E3PAWQCLcBGAs/s1600/21circles-of-circles.png"><img src="https://4.bp.blogspot.com/-bLxLzsgKXSA/WdwZSqEFMRI/AAAAAAAAB5g/Pm__TYwBZf8mhFFKQyHP_P1SSs9E3PAWQCLcBGAs/s320/21circles-of-circles.png" style="" height="311" border="0" width="320" /></a></div> </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 5th, 2017: Friday's #haskell exercise works with <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D05/Exercise.hs">data-as-JSON and charting the analyses</a> on the data. Today's #haskell solution has <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D05/Solution.hs">a LOT of circles in its visualization</a> of NYT article topics. A LOT. 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://4.bp.blogspot.com/-meXiIKavF1c/WdhOMFS6wTI/AAAAAAAAB5E/cDxHFXvna1As8-7qdA3mn_K43ZLigtwygCLcBGAs/s1600/topic-circles.png"><img src="https://4.bp.blogspot.com/-meXiIKavF1c/WdhOMFS6wTI/AAAAAAAAB5E/cDxHFXvna1As8-7qdA3mn_K43ZLigtwygCLcBGAs/s320/topic-circles.png" style="" height="311" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 4th, 2017: Today's #haskell exercise uses the NYT archive we've stored <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D04/Exercise.hs">to look at trending topics and to visualize them</a>. Today's #haskell solution we <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D04/Solution.hs">grab and group data</a> we have in a data store. </li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 3rd, 2017: Building a set of words you key off of in your search for articles? <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D03/Exercise.hs">MemoizingTable</a> is today's #haskell problem. We parse a slice of the NYT article archive and <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D03/Solution.hs">store articles and their respective subjects</a>. 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://2.bp.blogspot.com/-DVPh_JM00p4/WdUglZ11khI/AAAAAAAAB4o/GWB8MJRL8pcIGSFi2KeoZxx-eX9fuxnLQCLcBGAs/s1600/subject-pivot-table.png"><img src="https://2.bp.blogspot.com/-DVPh_JM00p4/WdUglZ11khI/AAAAAAAAB4o/GWB8MJRL8pcIGSFi2KeoZxx-eX9fuxnLQCLcBGAs/s320/subject-pivot-table.png" style="" height="241" border="0" width="320" /></a></div><br /><div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://1.bp.blogspot.com/-_qDSgDvWyXc/WdUgl06BrCI/AAAAAAAAB4s/FqGmmXS1j-0qL0UAallzJzB-mpzXv6oZQCLcBGAs/s1600/subjects.png"><img src="https://1.bp.blogspot.com/-_qDSgDvWyXc/WdUgl06BrCI/AAAAAAAAB4s/FqGmmXS1j-0qL0UAallzJzB-mpzXv6oZQCLcBGAs/s320/subjects.png" style="" height="241" border="0" width="320" /></a></div></li><li style="border: none; margin: 0px 0px 0.25em; padding: 0.25em 0px;">October 2nd, 2017: Monday's #haskell problem looks at representing <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D02/Exercise.hs">SQL join- or pivot-tables</a> generally. Today's solution uses the <a style="color: #7d181e; text-decoration: none;" href="https://github.com/geophf/1HaskellADay/blob/master/exercises/HAD/Y2017/M10/D02/Solution.hs">Pivot values in #haskell</a> to store data on a remote PostgreSQL database. 
<div style="clear: both; text-align: center;" class="separator"><a style="color: #7d181e; margin-left: 1em; margin-right: 1em; text-decoration: none;" href="https://2.bp.blogspot.com/-OHf6v0i555A/WdNdVcc3K_I/AAAAAAAAB4Q/pZ6IF-wFCoQAVWvDQgHRPZ1vMe6gel1mwCLcBGAs/s1600/nyt-10-articles-with-staging.png"><img src="https://2.bp.blogspot.com/-OHf6v0i555A/WdNdVcc3K_I/AAAAAAAAB4Q/pZ6IF-wFCoQAVWvDQgHRPZ1vMe6gel1mwCLcBGAs/s320/nyt-10-articles-with-staging.png" style="" height="216" border="0" width="320" /></a></div></li></ul>Tue, 31 Oct 2017 21:32:32 +0000noreply@blogger.com (geophf)Christopher Allen: How I make stewhttp://bitemyapp.com//posts/2017-10-29-how-i-make-stew.html
http://bitemyapp.com//posts/2017-10-29-how-i-make-stew.html
<div class="info">
</div>
<div class="post">
<p>I think the first thing my mother taught me to cook was Kraft Mac-n-Cheese at age 9. Fortunately, I’ve been able to move past that since then. My repertoire is a bit limited, but I like to think that by zeroing in on specific kinds of meals, I’m able to make them go a bit farther. A friend of mine asked how I do crockpot recipes, and after stewing on it for a while I thought I would write a post explaining my thought process.</p>
<p>So, first: the reason I cook at home is usually to practice. I’m a bachelor and probably wouldn’t bother if cooking weren’t necessary for me to either eat healthy or to entertain. The exception to “cooking-as-practice” is my stews. I usually make stews to eat healthier, avoid going out, and sometimes as part of a weight cut.</p>
<p>Another difference for stew from other kinds of food I make is that I’m more comfortable composing arbitrary ingredients in stews and knowing what I’ll get out of it. Most people, especially sedentary programmers, could benefit from eating less as well as eating healthier and I find stews help a lot with portion control. When I’m on a protein-sparing modified fast (PSMF) I’ll often make a single stew containing 1 or 2 pounds of meat for the whole week.</p>
<p>For equipment I use a fairly ordinary slow cooker that has an off, low, and high setting. The nice thing about my slow cooker is that the crock pot can be lifted out of the heating element so that I can easily store the food in the fridge. I don’t typically freeze my stews in gladware/tupperware as I’m fine eating the same thing for a week straight but people with families may want to store and alternate.</p>
<p>Now to the ingredients and process!</p>
<h2 id="meat-and-vegetables-and-not-much-else">Meat and vegetables (and not much else)</h2>
<p>Typically I’ll use a red meat (beef, pork, lamb, goat) and a complement of vegetables. Sometimes I’ll make stews with chicken but I typically don’t as I don’t feel they contribute as much flavor to the stew. A near constant in my stews is mushrooms as the umami (savoriness) they bring is vital for a good stew. Other common ingredients are carrots, celery, bell peppers, and tomatoes. Often recipes will call for stewed or crushed tomatoes. I will typically use tomato paste and water separately instead so that I can more finely control the tomatoey-ness.</p>
<p>Another thing: stews often call for potatoes, but I never use them or any other starchy ingredient, as I do not need empty or low-fiber calories.</p>
<h1 id="i-stopped-trying-to-replicate-curries">I stopped trying to replicate curries</h1>
<p>One of the key things that improved my stews was not attempting to mimic things like curries, which are very heavily laden with seasonings, and instead focusing on getting the most out of my ingredients through browning. I still add seasoning (especially salt!), but it’s there to complement the ingredients, not to be the primary source of flavor.</p>
<p>My curry attempts never really came out that well, and the super-strong flavors meant that I tired of what I made much faster than if it had been more subtle.</p>
<h1 id="you-have-to-bring-out-the-flavor-in-your-ingredients">You have to bring out the flavor in your ingredients</h1>
<p><em>Before</em> you put them in the slow cooker. This is particularly vital for the meat and mushrooms, but it applies to vegetables depending on your preference. The trick here is to <a href="https://en.wikipedia.org/wiki/Maillard_reaction">purge moisture and brown</a> the ingredients in a skillet or pan before throwing them in the slow cooker. This will make all the difference in the quality of your stew, especially if you do a good job of this with the meat and mushrooms! Butter-browned mushrooms are manna from heaven when done right! They should taste good by themselves before being added to the stew.</p>
<p>I strongly recommend getting good at browning ingredients and learning to cook for moisture targets if this isn’t something you’re already comfortable with. I got much better at this by learning to cook Sichuan food from <a href="http://amzn.to/2zS2W8d">Fuchsia Dunlop’s <em>Land of Plenty</em></a> and it has improved everything I prepare. I also appreciate vegetables much more as a result.</p>
<h1 id="seasoning">Seasoning</h1>
<p>Some common standard spices for stews:</p>
<ul>
<li>Salt</li>
<li>Pepper</li>
<li>Paprika</li>
<li>Parsley</li>
<li>Thyme</li>
<li>Oregano</li>
<li>Basil</li>
<li>Rosemary</li>
<li>MSG (hate if you want, but it helps and works with the mushrooms and meat to make the stew fantastically savory)</li>
</ul>
<p>I usually end up using more salt than any recipe I’m working off of calls for. I start low and add more salt in one hour intervals until I’m satisfied with my taste-test from the crockpot.</p>
<h1 id="composing-ingredients-not-functions">Composing ingredients, not functions</h1>
<p>I mentally categorize ingredients by what flavor or texture component they bring to the stew. I’ll put a <code>(B)</code> next to ingredients that I feel should be browned.</p>
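<p>Since this is, after all, a Haskell planet: the mental categorization above can be doodled as a tiny data model. This is purely a hypothetical sketch for illustration, with the <code>(B)</code> tag as a boolean field; none of these names come from any real library.</p>

```haskell
-- Hypothetical sketch: the flavor/texture categories described below,
-- plus the (B) "brown this first" tag, as plain Haskell data.
data Flavor = Savory | Sweet | Neutral | Astringent
  deriving (Eq, Show)

data Ingredient = Ingredient
  { name   :: String
  , flavor :: Flavor
  , brown  :: Bool  -- the (B) tag: brown in a skillet before slow cooking
  } deriving (Eq, Show)

-- A stew is a composition of ingredients, not functions.
type Stew = [Ingredient]

-- Everything tagged (B) goes in the skillet first.
toSkillet :: Stew -> [String]
toSkillet = map name . filter brown

beefStew :: Stew
beefStew =
  [ Ingredient "beef rib"  Savory     True
  , Ingredient "mushrooms" Savory     True
  , Ingredient "carrots"   Sweet      False
  , Ingredient "onion"     Astringent True
  ]
```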
<h2 id="savory">Savory</h2>
<ul>
<li><p>Beef, pork, lamb, goat (B)</p></li>
<li><p>Mushrooms, usually baby bella because of the surface area to volume ratio and flavor (B)</p></li>
<li><p>Tomatoes, usually as a paste for flavor density. I am more likely to use tomato paste with beef or lamb than with pork or chicken.</p></li>
</ul>
<h2 id="sweet">Sweet</h2>
<ul>
<li><p>Carrots</p></li>
<li><p>Peas</p></li>
<li><p>Corn</p></li>
</ul>
<p>I don’t typically use corn, but it’s a solid option if you want it.</p>
<ul>
<li>Peppers</li>
</ul>
<p>Bell peppers are typically neutral but others like chili peppers will taste a little sweet. I no longer try to make my stews spicy though.</p>
<ul>
<li><p>Red cabbage</p></li>
<li><p>Onion (B)</p></li>
</ul>
<p>It depends on the onion. Vidalia is going to be most sweet, red onion middling, yellow or white onion is going to be the most astringent.</p>
<h2 id="neutral">Neutral</h2>
<ul>
<li><p>Celery</p></li>
<li><p>Bell peppers</p></li>
</ul>
<p>You can brown bell peppers but I don’t think there’s much point when it’ll hit the texture I want just from slow cooking and I can’t tell much difference in the flavor.</p>
<h2 id="astringent-or-sulphuric">Astringent or sulphuric</h2>
<ul>
<li>Onion (B)</li>
</ul>
<p>I usually sautee some onions and also throw in some raw onions to get both sides of the onion flavor. Raw onions are more astringent, sauteed/browned onions are more savory and sweet as the sulphuric components denature in heat.</p>
<ul>
<li>Garlic cloves (B)</li>
</ul>
<p>Garlic is pretty much an always-ingredient in my stews, along with onions and mushrooms. I strongly recommend learning to brown garlic cloves if you aren’t already in the practice of doing so, as it does wonderful things for the flavor of garlic. You can sautee some of the ingredients you need to brown together. Don’t brown too many ingredients at once in the skillet, as it will make it hard to purge moisture accurately for each ingredient and get the timing of the browning right.</p>
<ul>
<li><p>White cabbage</p></li>
<li><p>Bok choy</p></li>
</ul>
<p>I don’t have as much experience with bok choy and am uncertain whether it should be browned or not but it’s been a nice replacement for white cabbage in the past.</p>
<ul>
<li>Broccoli</li>
</ul>
<p>I confess I do not care for broccoli much and so rarely use it, but I think broccoli would be improved by cooking in butter before addition to the crockpot.</p>
<ul>
<li>Cauliflower</li>
</ul>
<p>Sister to broccoli.</p>
<ul>
<li>Artichoke</li>
</ul>
<p>No strong opinions here, it’s been good when I’ve used it. Similar to asparagus.</p>
<ul>
<li>Asparagus (B)</li>
</ul>
<p>I will often roast asparagus on the grill in olive oil, salt, and pepper and it’s lovely. You can do similar in the pan before adding to a crockpot.</p>
<ul>
<li>Turnip</li>
</ul>
<p>I used turnip in my most recent stew and thought it added a nice edge. The only (downside?) is that the turnips will retain their fibrous texture even after a good slow cook, so if you want them to disintegrate you may need to cook them separately. This may be necessary if you have children you don’t want detecting the presence of vegetables. The downside to doing so may be less fiber content in the stew.</p>
<h3 id="side-note-on-astringent">Side note on “astringent”</h3>
<p>Pre-modern Europeans used garlic as an antibiotic for wounds.</p>
<h2 id="citrusacidic">Citrus/acidic</h2>
<p>I don’t do as much of these, as I’m typically going for a balance of sweet/astringent/savory in my stews, but they’re an option if it’s something you want. I won’t list them out, as I don’t have as much experience with them in stews, but these are going to be your fruits, berries, and some varieties of nuts. I’ve considered roasting walnuts for addition to a stew but haven’t tried it yet.</p>
<h2 id="oils-and-fats">Oils and fats</h2>
<p>Typically when I make a stew I’m trying to make it “fatty” without it developing a grotesque top layer of fat when it settles in the fridge. To that end, I’ve started using <em>slightly</em> leaner meats than the 70/30 I used to use, more like 80/20 or 85/15. The browning purges some fat as well. To complement the meat fats, I’ll sometimes add butter to the stew while it’s slow cooking. I’ll taste to determine whether or not it needs more fat.</p>
<h2 id="bones-not-the-rapper">Bones (not the rapper)</h2>
<p>The last stew I made was with beef rib. I should’ve separated the beef from the bones before putting them in the crockpot. I ended up having to do this after it was mostly done slow cooking. I thought the slow cooking would break down the connective tissue between the bone and protein but it did not.</p>
<p>That said, bones, especially if they have some marrow in them, are <em>excellent</em> for stews and worth getting ahold of if you can. Goat can be good for this as the meat often comes on the bone with some good marrow.</p>
<h1 id="an-example-recipe">An example recipe</h1>
<p>This is, from memory, what my last stew was; I’ll try to link tweets describing it as well. It was a modified version of this <a href="https://yoursandmineareours.com/low-carb-slow-cooker-beef-stew/">low carb beef stew recipe</a>.</p>
<ul>
<li><p>Beef rib, browned</p></li>
<li><p><a href="https://twitter.com/bitemyapp/status/920063230302494720">Mushrooms (baby bella)</a> I browned them in a little bit of butter and some worcestershire (wurster-shur) sauce. Note the color.</p></li>
<li><p>Tomato paste</p></li>
<li><p>Beef stock. If you want a very strong flavor, use beef consomme.</p></li>
<li><p>Some added water. Not a lot; I don’t care much how thick the stew gets for the most part, as long as everything gets cooked properly and the texture is right. I’ll often heat water with my electric kettle and pour it over reheated stew that doesn’t have enough liquid. Tastes great, not watery/bland.</p></li>
<li><p>Salt, oregano, thyme, fresh ground black pepper, paprika, garlic powder</p></li>
<li><p>Raw onion, minced (I use a fork to mince) garlic cloves</p></li>
<li><p>Carrots, turnips, bell peppers.</p></li>
</ul>
<p>I didn’t add any xanthan gum; it was plenty thick at the end. I only had 1 and a half, maybe 2 pounds of beef rib in the crockpot.</p>
<h1 id="coda">Coda</h1>
<p>I hope this helps some of y’all in making tasty, healthy food! I really like stews and think they’re a wonderfully adaptable and accessible way to make food.</p>
</div>
<div class="blurb">
<p>
I know this site is a bit of a disaster zone, but if you like my writing or think you could learn something useful from me, please <a href="http://haskellbook.com/">take a look at the Haskell book</a> I've been writing. There's a free sample available too!
</p>
</div>
<div class="footer">
<p>
Posted on October 29, 2017
</p>
</div>Sun, 29 Oct 2017 00:00:00 +0000