content/blog/2016/10/iterate-timing.markdown @ bc927ac92d41

lolutils
author Steve Losh <steve@stevelosh.com>
date Fri, 13 Jan 2017 22:22:49 +0000
parents 653f385c1bee
children f5556130bda1
+++
title = "Customizing Common Lisp's Iterate: Timing"
snip = "How long is this loop?"
date = 2016-10-10T14:50:00Z
draft = false

+++

This entry is the second in a series about customizing Common Lisp's [iterate][]
library.  The full series:

* [Averaging](/blog/2016/09/iterate-averaging/)
* [Timing](/blog/2016/10/iterate-timing/)

This post will show how to make a `timing` clause that keeps track of how much
time has elapsed between iterations (and across the entire loop).  I've found it
handy when doing rough benchmarks for things like [maze generation][mazes].

[mazes]: http://www.amazon.com/dp/1680500554/?tag=stelos-20
[iterate]: https://common-lisp.net/project/iterate/

<div id="toc"></div>

## End Result

Before we look at the code, let's look at what we're aiming for:

```lisp
(defun to-seconds (internal-time-units)
  (/ internal-time-units internal-time-units-per-second 1.0))

(iterate
  (repeat 5)
  (sleep (random 3.0))
  (timing real-time :since-start-into total-time
                    :per-iteration-into loop-time)
  (collect loop-time :into loop-times)
  (finally
    (format t "Individual times: ~S~%Total time: ~F seconds~%"
            (mapcar #'to-seconds loop-times)
            (to-seconds total-time))))
; =>
Individual times: (0.948 1.359 2.828 2.891 2.744)
Total time: 10.77 seconds
```

The `timing` clause is what we'll be implementing.

## Code

The code for `timing` is fairly short but does have a few interesting things to
talk about.  I've omitted the docstring so it'll fit on one screen:

```lisp
(defmacro-clause (TIMING time-type &optional
                  SINCE-START-INTO since-var
                  PER-ITERATION-INTO per-var)
  (let ((timing-function (ecase time-type
                           ((real-time) #'get-internal-real-time)
                           ((run-time) #'get-internal-run-time)))
        (since-var (or since-var
                       (when (null per-var)
                         iterate::*result-var*))))
    (with-gensyms (start-time current-time previous-time)
      `(progn
        (with ,start-time = (funcall ,timing-function))
        (for ,current-time = (funcall ,timing-function))
        ,@(when since-var
            `((for ,since-var = (- ,current-time ,start-time))))
        ,@(when per-var
            `((for ,previous-time :previous ,current-time
                                  :initially ,start-time)
              (for ,per-var = (- ,current-time ,previous-time))))))))
```

Times are recorded in "internal time units" to be consistent with other Common
Lisp functions like [get-internal-real-time][].  Some implementations will be
more precise than others.

[get-internal-real-time]: http://www.lispworks.com/documentation/HyperSpec/Body/f_get_in.htm#get-internal-real-time

### Argument Pairing

One thing that I sometimes find annoying about `iterate` clauses is that they
must contain *pairs* of arguments.  You can't have an odd number of arguments,
like:

```
(TIMING &optional SINCE-START since-var PER-ITERATION per-var).
```

Often, however, you can make the best of this restriction and come up with
something at least *moderately* useful to use as an extra argument to pad things
out.  For the `timing` clause I decided to make the second argument define which
type of time to measure (process run time or wall clock time).

### Symbols & Keywords

In this implementation the type of time to record is given as a symbol in this
package's namespace.  This means that you'll want to export the `real-time` and
`run-time` symbols from this package, not just the `timing` symbol.  This is
a little annoying, but I like how it looks when used.

Another option would be to change these to keyword symbols, so you would say
`(timing :real-time ...)` instead.  Or you could just support both (`ecase`
makes this easy).

Note the use of `ecase` when checking the time type to signal an error at
macroexpansion time with a helpful message if the user passed something
incorrect:

```
(iterate (timing foo :since-start-into some-var))
; =>
debugger invoked on a SB-KERNEL:CASE-FAILURE in thread
 #<THREAD "main thread" RUNNING {10028F7473}>:
   FOO fell through ECASE expression. Wanted one of (REAL-TIME RUN-TIME).

...
```

### Variables

To fit with the general pattern of `iterate` clauses, we try to do something
intelligent with the variables depending on what the user says they want.  If
the user doesn't specify any variable names, we return the total loop time as
the result of the loop:

```
(macroexpand-iterate '(timing real-time))
; =>
(PROGN
 (WITH #:START-TIME516 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR #:CURRENT-TIME517 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR ITERATE::*RESULT-VAR* = (- #:CURRENT-TIME517 #:START-TIME516)))

(iterate (repeat 1) (sleep 2.0) (timing real-time))
; =>
2003
```

There's a bit of inefficiency here, because we're recalculating the time on
every iteration even though we really only need to do it once at the very end.
But I suppose someone could be accessing `iterate::*result-var*` in their loop
body like any other variable, so this will do the right thing in that case (even
if that case *is* horrifying).

If the user does specify a variable for either measurement, we assume they know
what they're doing and don't touch the loop result variable:

```
(macroexpand-iterate
  '(timing real-time :since-start-into x))
; =>
(PROGN
 (WITH #:START-TIME525 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR #:CURRENT-TIME526 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR X = (- #:CURRENT-TIME526 #:START-TIME525)))

(macroexpand-iterate
  '(timing real-time :per-iteration-into x))
; =>
(PROGN
 (WITH #:START-TIME528 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR #:CURRENT-TIME529 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR #:PREVIOUS-TIME530 :PREVIOUS #:CURRENT-TIME529 :INITIALLY
  #:START-TIME528)
 (FOR X = (- #:CURRENT-TIME529 #:PREVIOUS-TIME530)))
```

### Splicing

Another thing to note is the use of `,@` to splice the appropriate clauses into
the final `progn`:

```lisp
; ...
        ,@(when since-var
            `((for ,since-var = (- ,current-time ,start-time))))
        ,@(when per-var
            `((for ,previous-time :previous ,current-time
                                  :initially ,start-time)
              (for ,per-var = (- ,current-time ,previous-time))))
; ...
```

For the `since-var` clause we could have gotten away with just using a plain `,`
and returning the `for` clause itself:

```lisp
; ...
        ,(when since-var
           `(for ,since-var = (- ,current-time ,start-time)))
; ...
```

This would work, but when `since-var` is null (when the user wants only the
per-iteration time) the `when` statement would evaluate to `nil`, and would
leave a stray `nil` in the `progn`:

```
(macroexpand-iterate
  '(timing real-time :per-iteration-into x))
; =>
(PROGN
 (WITH #:START-TIME531 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 (FOR #:CURRENT-TIME532 = (FUNCALL #<CLOSURE GET-INTERNAL-REAL-TIME>))
 NIL
 (FOR #:PREVIOUS-TIME533 :PREVIOUS #:CURRENT-TIME532 :INITIALLY
  #:START-TIME531)
 (FOR X = (- #:CURRENT-TIME532 #:PREVIOUS-TIME533)))
```

Any decent compiler will optimize this away, but it's ugly to read during
macroexpansion.  If we're going to write this thing let's do it *right* and not
be lazy.

Also notice that we don't define the `previous-time` clause at all unless the
user wants per-iteration time, because the since-start time doesn't need it at
all.  No reason to waste cycles calculating an extra value on every iteration if
we don't have to.

## Composing

One of the great things about `iterate` is that the extensions you make can use
each other.  For example, we can use last post's `averaging` clause with the one
we just made to calculate the average time per loop iteration:

```lisp
(iterate
  (repeat 20)
  (sleep (random 1.0))
  (timing real-time :per-iteration-into time)
  (averaging (to-seconds time)))
; =>
0.51555
```