Previous Up Next

Chapter 7 Programming with continuations

Continuation-passing style (CPS) was introduced in chapter 6 as a semantic device, with CPS transformations making reduction strategies explicit in functional languages. In this chapter, we show that CPS and variants are also a useful programming technique in functional languages, giving functions access to the continuations of their calls. This makes it possible to implement a number of advanced control structures (generators, coroutines, backtracking, etc.) as library functions in any functional language.

7.1 Writing functions in CPS

The defining characteristic of CPS is that functions do not return their results normally; instead, they always pass their results to continuations given as additional arguments. For example, consider the naive factorial function written in OCaml in the standard direct style:

let rec fact n = if n = 0 then 1 else n * fact (n-1)

In continuation-passing style (CPS), this function takes an extra argument k, which is the continuation to call with the result of the computation:

let rec cps_fact n k = if n = 0 then k 1 else cps_fact (n-1) (fun r -> k (n * r))

Note the change in types: intint for the direct style function, but ∀ α . int → (int → α) → α for the CPS function. Here, α is the type of the final result of the continuation. It is determined by the continuation passed to cps_fact.

The function cps_fact above can be produced mechanically, by applying one of the call-by-value CPS transformations from section 6.5 to the definition of fact. However, CPS transformations are designed to be applied to the whole program, while CPS programming often introduces continuations selectively: not for all functions, but only for those functions that benefit from CPS. For example, instead of converting all callers of cps_fact to CPS, some callers can remain in direct style, calling ⁠cps_fact n (⁠fun r -> r⁠) as the equivalent of ⁠fact n.

Why write functions in CPS? One reason is that recursive functions in CPS run in constant stack space: all recursive calls are tail calls. This does not mean that they run in constant space: the activation records that would be allocated on the stack in direct style are allocated (in the form of function closures) in the heap in CPS. Using heap space instead of stack space can still be advantageous in the case of language implementations that artificially limit the size of the stack.

But the main motivation for programming in CPS has nothing to do with these implementation considerations: it is to give CPS functions the ability to manipulate their continuations in ways other than just calling them with a final result.

7.2 Iterators

Consider binary trees with values at the nodes:

type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

Here is an internal iterator (in the terminology of section 4.1) over trees:

let rec tree_iter (f: 'a -> unit) (t: 'a tree) = match t with | Leaf -> () | Node(l, x, r) -> tree_iter f l; f x; tree_iter f r

Let us rewrite it in CPS, adding continuation arguments to tree_iter and to the argument f:

let rec tree_iter (f: 'a -> (unit -> 'b) -> 'b) (t: 'a tree) (k: unit -> 'b) = match t with | Leaf -> k () | Node(l, x, r) -> tree_iter f l (fun () -> f x (fun () -> tree_iter f r k))

As shown by its type above, the function f takes as argument not only an element x of the tree, but also a continuation that captures the traversal of the tree elements that come after x. If f is written in standard CPS, it will do something with x and then invoke k, as in

tree_iter (fun x k -> print_int x; k ()) some_tree (fun () -> ())

However, f can use k differently. For example, it can store k in a reference for later use, thus obtaining a Python-style generator (imperative external iterator) over trees:

let tree_generator (t: 'a tree) : unit -> 'a = let rec next = ref (fun () -> tree_iter (fun x k -> next := k; x) t (fun () -> raise StopIteration)) in fun () -> !next ()

When the generating function returned by tree_generator t is first called, tree_iter starts traversing the tree, calling fun x k -> ... on the leftmost element x of the tree. This function stores the continuation of the traversal in reference next and then returns x to the caller. Subsequent calls to the generating function will restart the traversal where it left off, stopping at the next element in an in-order traversal and returning it.

We can also obtain a purely-functional external iterator from the tree_iter function, using lists computed on demand as the result:

type 'a enum = Done | More of 'a * (unit -> 'a enum) let tree_enumerator (t: 'a tree) : 'a enum = tree_iter (fun x k -> More(x, k)) t (fun () -> Done)

Given a non-empty tree t, calling ⁠tree_enumerator t returns More(x, k) where x is the leftmost element of t and k applied to () enumerates the following elements of t.

7.3 Asynchronous and concurrent programming

Computations written in CPS can suspend themselves at any point in the program, by storing the continuation in global data and relinquishing control entirely. An external agent such as a scheduler or a graphical user interface (GUI) can then restart the continuation when the time comes.

Consider a dialog box in a GUI application written in CPS:

dialog_box_yesno (sprintf "%s already exist.\nDo you want to replace it?" filename) (function Yes -> ... | No -> ...)

The function dialog_box_yesno can attach the given continuation as callbacks to the “Yes” and “No” buttons, then return control to the main event loop. This way, the GUI remains responsive while the dialog box is displayed.

Generalizing this pattern, we can easily implement cooperative multithreading for CPS computations. Suspended threads are represented as ⁠unit -> unit continuations and stored in a global queue named ready. The schedule function restarts one of these suspended threads.

let ready : (unit -> unit) Queue.t = Queue.create () let schedule () = match Queue.take_opt ready with | None -> () | Some k -> k ()

The threading library provides three operations: yield, to suspend the calling thread and restart another thread; terminate, to stop the execution of the calling thread, but keep other threads running; and spawn f to start the computation f() in a new thread.

let yield (k: unit -> unit) = Queue.add k ready; schedule() let terminate () = schedule () let spawn (f: unit -> unit) = Queue.add f ready

Note that yield takes a continuation as an argument, forcing the caller to be written in CPS. For example, here is a process that prints integers from 1 to count, with the given prefix, yielding control after each print:

let process name count = let rec proc n = if n > count then terminate () else begin printf "%s%d " name n; yield (fun () -> proc (n + 1)) end in proc 1

When run in isolation, process behaves sequentially: process "A" 5 prints A1 A2 A3 A4 A5. When several processes are run concurrently, they interleave execution at the yield points. For example,

spawn (fun () -> process "A" 5); spawn (fun () -> process "B" 3); process "C" 6

prints C1 A1 B1 C2 A2 B2 C3 A3 B3 C4 A4 C5 A5 C6.

7.4 Errors and exceptions

A function written in CPS can take several continuations instead of just one. Each continuation corresponds to a different way to exit the function, not unlike the multiple return points of Fortran 77 subroutines (see section 3.4). For example, here is a quadratic equation solver with two continuations, one for the case where there are real solutions and one for the other case:

let quadratic a b c k1 k2 = let d = b *. b -. 4. *. a *. c in if d < 0.0 then k2 () else begin let d = sqrt d in let x1 = (-. b +. d) /. (2. *. a) and x2 = (-. b -. d) /. (2. *. a) in k1 x1 x2 end

We have also chosen to pass the two roots x1 and x2 as two arguments to the continuation k1 instead of as a pair (x1, x2), to show that we can return several results this way. A typical use of quadratic would then be:

quadratic a b c (fun x1 x2 -> printf "Solutions: %g %g\n" x1 x2) (fun () -> printf "No real solutions\n")

This technique of using two continuations, one for the success case and one for the failure case, can be used to implement structured exceptions, via a variant of the CPS transformation known as the double-barreled CPS transformation; see section 9.3.

7.5 Backtracking

Like exceptions, Prolog-style backtracking can also be implemented using two continuations, one for success and one for failure, the latter triggering backtracking to the last choice point. The main difference with exceptions is that the success continuation takes the failure continuation as an argument, in addition to the returned value.

For example, consider regular expression matching. A regular expression parser is a function that takes a word w (a list of characters) and two continuations ksucc and kfail. If w can be decomposed as w = w1 . w2 with w1 matching the regular expression, then ksucc is applied to w2 (the rest of the input) and to kfail. Otherwise, kfail is called. This is reflected in the following OCaml types:

type 'a parser = char list -> 'a success -> 'a failure -> 'a and 'a success = char list -> 'a failure -> 'a and 'a failure = unit -> 'a

Here are parsers for single characters (char), for empty words (epsilon), and for the empty language (empty):

let char c : 'a parser = fun w succ fail -> match w with | c' :: w' when c' = c -> succ w' fail | _ -> fail () let epsilon : 'a parser = fun w succ fail -> succ w fail let empty : 'a parser = fun w succ fail -> fail ()

Here are parser combinators which, given parsers for regular expressions r and r′, construct parsers for concatenation (r . r′), alternation (rr′), and repetition (r*):

let seq (r: 'a parser) (r': 'a parser) : 'a parser = fun w succ fail -> r w (fun w' fail' -> r' w' succ fail') fail let alt (r: 'a parser) (r': 'a parser) : 'a parser = fun w succ fail -> r w succ (fun () -> r' w succ fail) let rec star (r: 'a parser) : 'a parser = fun w succ fail -> r w (fun w' fail -> if List.length w' < List.length w then star r w' succ fail else fail ()) (fun () -> succ w fail)

The repetition star r always succeeds: either by matching r then star r again, or by matching the empty string. However, when matching r, we must make sure that at least one character of the input was matched, hence the check List.length w’ < List.length w and the backtracking if this check fails.

The implementation of star r above follows a greedy strategy: it tries to match r as many times as possible, then goes back in case of failure. Here is an alternative implementation of star r that follows a non-greedy strategy, trying to match r as few times as possible.

let rec star (r: 'a parser) : 'a parser = fun w succ fail -> succ w (fun () -> r w (fun w' fail -> if List.length w' < List.length w then star r w' succ fail else fail ()) fail)

To determine whether a word matches a regular expression, we call its parser with a failure continuation that returns false and a success continuation that returns true if all characters were matched, or backtracks otherwise:

let matches w r = r w (fun w' fail -> if w' = [] then true else fail ()) (fun () -> false)

As an example where backtracking is necessary, consider matching the word w = ab against (aa.b): the first alternative, a, produces a partial match, leaving b unmatched; backtracking is necessary to try the second alternative a.b, which matches w completely.

7.6 Generating, filtering and counting

A function written in standard CPS never inspects the result of the continuation it takes as an argument: the function either calls the continuation in tail position, or wraps it in another continuation. This is reflected in the type of the function, which is often polymorphic in the return type of the continuation, as in the cps_fact example in section 7.1.

For some applications, it may be useful for a function to use the value returned by its continuation, as a kind of feedback from the continuation about the suitability of the result passed to the continuation. For example, the continuation can return a Boolean value, where true means “the result you gave me is correct” and false means “please give me another result”.

Here is the regular expression matching example above rewritten in this style. A parser now has a single continuation of type ⁠char list -> bool, which takes as argument the rest of the input word (the suffix that did not match) and returning true if this residual input is recognized and false otherwise. The parser itself returns true if the input word was fully matched, and false otherwise.

type parser = char list -> (char list -> bool) -> bool let char c : parser = fun w k -> match w with | c' :: w' when c' = c -> k w' | _ -> false let epsilon : parser = fun w k -> k w let empty : parser = fun w k -> false let seq (r: parser) (r': parser) : parser = fun w k -> r w (fun w' -> r' w' k) let alt (r: parser) (r': parser) : parser = fun w k -> r w k || r' w k let rec star (r: parser) : parser = fun w k -> r w (fun w' -> List.length w' < List.length w && star r w' k) || k w let matches w (r: parser) : bool = r w (fun w' -> w' = [])

In the code of section 7.5, backtracking was triggered by invoking the fail continuation; here, it is triggered by returning false. The Boolean “or” (the || operator) is used to try another alternative if the first alternative fails.

Instead of returning Booleans that say whether a solution was found, continuations can return integers that count how many solutions exist. In this style, the base functions enumerate several possibilities, pass them to the continuation, and sum the returned counts.

For example, here is a function that enumerates Boolean values, and another that enumerates integers between lo and hi:

let bool k = k false + k true let rec int lo hi k = if lo <= hi then k lo + int (lo + 1) hi k else 0

We can use the int generator to enumerate all the throws of three 6-sided dice and count how many throws result in 16 or more. (The answer is 10.)

int 1 6 (fun d1 -> int 1 6 (fun d2 -> int 1 6 (fun d3 -> if d1 + d2 + d3 >= 16 then 1 else 0)))

For a more interesting example, we can enumerate all AVL balanced binary trees of height h, measuring them using the continuation k:

let rec avltree h k = if h < 0 then 0 else if h = 0 then k Leaf else avltree2 (h-1) (h-1) k + avltree2 (h-2) (h-1) k + avltree2 (h-1) (h-2) k and avltree2 hl hr k = avltree hl (fun l -> avltree hr (fun r -> k (Node(l, 0, r))))

In an AVL tree, the heights of the two subtrees of a node must differ by at most one. Thus, if a tree has height h, its two subtrees have heights h−1, h−1 or h−1, h−2 or h−2, h−1. The helper function avltree2 generates nodes (non-empty trees) with value 0 and subtrees of respective heights hl and hr. By evaluating ⁠avltree 4 (⁠fun _ -> 1), we learn that there are 315 AVL trees of height 4.

Finally, instead of returning integer solution counts, we can also return probabilities represented as floating-point numbers in the [0,1] range. Continuations k, then, play the role of measures for the different possibilities enumerated by the functions. For example, here are the enumerators for Booleans and for integers in the [lo,hi] range, using measures instead of counts:

let bool k = 0.5 *. (k false +. k true) let int lo hi k = let rec sum i = if i <= hi then k i +. sum (i + 1) else 0.0 in sum lo /. float (hi - lo + 1)

Then, we can compute the probability of scoring exactly 15 on a three 6-sided dice as follows. (The answer is about 4.6%.)

int 1 6 (fun d1 -> int 1 6 (fun d2 -> int 1 6 (fun d3 -> if d1 + d2 + d3 = 15 then 1.0 else 0.0)))

7.7 Further reading

Chapters 5 and 6 of Friedman and Wand (2008) show other examples of programming with continuations, in particular for writing interpreters for languages featuring exceptions or concurrency.

CPS can also be used as an intermediate representation in optimizing compilers for functional languages, as in the RABBIT (Steele, 1978) and SML/NJ (Appel, 1992) compilers. Kennedy (2007) discusses advanced compiler optimizations that are easy to perform on a CPS-based intermediate representation.


Previous Up Next