CPS fold -- fold with early exit
The most general function to traverse a data-structures is the fold
function. But fold
has one
problem that is sometimes not optimal. It always traverses the whole data-structure and we cannot
abort the recusion early.
But sometimes, that is exactly what we want to do. For example when we want to search for a specific element in a list, when we found it, we don’t want to go through the remaing list. When we want to check if all elements in a list satisfy a specific predicate then we also can stop on the first element that does not satisfy our predicate. And dozens of other cases where an early abort could be helpful.
We always can write our own recursive functions for those cases, but then we must ensure that we get tail-recursion right. Wouldn’t it be better if we could abstract it just like fold? In this article I explain how to write a CPS fold that allows us to do this.
Continuation
The important concept in implementing a CPS fold is a so-called Continuation function or often just named CPS (Continuation-Passing Style). The idea of CPS is that we just pass an additional function as an argument that the user can call to explicitly recurs. This way the user of the CPS fold is in control of the recursion. The user then can decide if he wants to continue traversing a data-structure or return a value instead. But before we look into how we implement the function, let’s see some use cases in how we use such a function.
We name our function foldk
. Besides that, foldk
looks nearly the same as fold
,
the only difference is that the function we pass to foldk
now receives three arguments instead
of just two.
|
|
When we provide the same code then we already see how it differs. fold
always runs through all
elements of the list. It computes the accumulator and does all recursion on itself.
foldk
on the other hand don’t do any recursion on it’s own. foldk
always just do a single
step. It just extract one element from our list and calls the folder function with the provided
acc
and the first element of our list.
That’s why we get 1
as a result. It just calculates acc + x
or 0 + 1
in the above example
and then it immediately ends. We must explicitly tell foldk
when it should recurs.
That’s the reason why we have the third argument. k
is the continuation function. k
expects
the next accumulator. When we call k
we start recurring again on the next element in our list.
The primary difference to fold
is that the user of foldk
has explicit control when
recurring should happen.
|
|
When we want to traverse all elements of our list, then we just call k
with the next accumulator.
But if we wanted to do that, we also could just use fold
. So here is a more practical example
in comparison to fold
.
|
|
Now we are getting different results. What fold
did was: Pick every element that is smaller
than 11 and add them together. foldk
on the other hand runs as long all elements are smaller
than 11
and only add those together it saw up to this point. As soon he encounter a bigger
number it will stop traversing the list. fold
calculated 5 + 10 + 10 + 5
while foldk
just summed up the first two elements 5 + 10
.
Implementing foldk
Implementing foldk
is actually pretty easy, let’s go over it:
|
|
First we start with the general pattern to traverse a list. We test if we either have an empty list or we extract the first element of our list. Then we think what we do in both cases.
The empty case is pretty easy. When we reached the end of our list, then we cannot advance forward
anymore, that means we just return the accumulator acc
.
Otherwise when we have an element we need to do something with every element. That is what our f
is for. So we just call f acc x ???
. In a normal fold
we do the recursion inside fold
, but
in foldk
we want to give the user the ability to recurs, that is why our third argument
is now a continuation function (fun lacc -> foldk f lacc xs)
. The Continuation function expects
the next accumulator. As you can see, when the user decides to call k
, it just calls foldk
again.
Let’s go over a simple example to see how it works:
|
|
The result of our function is 15
. Let’s see step-by-step how we got this result. When we call foldk
we start with 0
as our acc
and the list [1;2;3;4;5]
as our starting list. As a reminder this is
foldk
.
|
|
When i write f acc x k
in the code section i refer to the whole right hand side of x::xs
that
means f acc x (fun lacc -> foldk f lacc xs)
. I just use []
and x::xs
to represent the
pattern matching in the foldk
function.
Code | Evaluation / Description |
---|---|
foldk f 0 [1..5] |
First call, we start foldk |
[] |
No, we did not reach the end |
x::xs |
1::[2;3;4;5] / Yes, it maches |
f acc x k |
f 0 1 (fun lacc -> foldk f lacc [2;3;4;5]) / We now execute f |
k (acc + x) |
k (0 + 1) / k is the lambda function passed to f |
foldk f 1 [2;3;4;5] |
|
[] |
No |
x::xs |
2::[3;4;5] |
f acc x k |
f 1 2 (fun lacc -> foldk f lacc [3;4;5]) |
k (acc + x) |
k (1 + 2) |
foldk f 3 [3;4;5] |
|
[] |
No |
x::xs |
3::[4;5] |
f acc x k |
f 3 3 (fun lacc -> foldk f lacc [4;5]) |
k (acc + x) |
k (3 + 3) |
foldk f 6 [4;5] |
|
[] |
No |
x::xs |
4::[5] |
f acc x k |
f 6 4 (fun lacc -> foldk f lacc [5]) |
k (acc + x) |
k (6 + 4) |
foldk f 10 [5] |
|
[] |
No |
x::xs |
5::[] |
f acc x k |
f 10 5 (fun lacc -> foldk f lacc []) |
k (acc + x) |
k (10 + 5) |
foldk f 15 [] |
|
[] -> acc |
[] -> 15 / Yes, we just return acc (15) |
And one-more time with an example that stops earlier:
|
|
Code | Evaluation / Description |
---|---|
foldk f 0 [1..5] |
First call, we start foldk |
[] |
No |
x::xs |
1::[2;3;4;5] |
f acc x k |
f 0 1 (fun lacc -> foldk f lacc [2;3;4;5]) |
x < 3 |
1 < 3 / True, then branch |
k (acc + x) |
k (0 + 1) |
foldk f 1 [2;3;4;5] |
|
[] |
No |
x::xs |
2::[3;4;5] |
f acc x k |
f 1 2 (fun lacc -> foldk f lacc [3;4;5]) |
x < 3 |
2 < 3 / True, then branch |
k (acc + x) |
k (1 + 2) |
foldk f 3 [3;4;5] |
|
[] |
No |
x::xs |
3::[4;5] |
f acc x k |
f 3 3 (fun lacc -> foldk f lacc [4;5]) |
x < 3 |
3 < 3 / False, else branch |
else acc |
else 3 The pattern match on x::xs now returns 3 |
Implementing some other functions
Now, let’s use our foldk
function to implement some other functions. First tryPick
|
|
We start with None
as our default value. Once we found a matching x
we just return it,
otherwise we recurs. When we reach the end without finding an element, we return acc
that still contains None
.
|
|
Very similar to tryPick
. We start with false
and return it when we reach the end
of the list without finding our wanted element. Otherwise we immediately return true
.
I think the rest of the function are nearly self-specking as they are all very similar.
|
|
The last line is btw. not the exact behaviour of List.take
. The standard implementation throws
an exception if the input list has not enough elements. We can achieve the same by checking
the collected
field after foldk
finished and throw an exception if it is not the same
as amount
. But I like my behaviour more than the default implementation.
I could continue by implementing further functions, but I think at this point it should
be obvious how foldk
works and how we use it.
Summary
Implementing a CPS fold gives the control when to recurs to the caller of foldk
.
Some task are easier solved by foldk
. Some other task can sometimes be more efficient as we don’t
need to traverse the complete data-structure. In general if you ever wanted something similar
to break
or continue
as you know it from imperative looping constructs. With foldk
you have this ability and it works fine with immutable data-structures.