--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2022/04/fun-with-macros-do-file.markdown Mon Apr 18 21:22:22 2022 -0400
@@ -0,0 +1,243 @@
+(:title "Fun with Macros: Do-File"
+ :snip "Part 3 in a series of short posts about fun Common Lisp Macros."
+ :date "2022-04-19T13:15:00Z"
+ :draft t)
+
+It's been a while, but it's time to take a look at another fun little Common
+Lisp macro: `do-file`.
+
+<div id="toc"></div>
+
+## Usage
+
+The macro we'll be taking a look at today is called `do-file`. It's used to
+open a file and iterate over the contents using a reader function, saving you
+some tedious boilerplate.
+
+First let's look at some examples of how you could use it. Processing each
+line of a file is the default:
+
+```lisp
+(do-file (line "foo.txt")
+  (unless (string= "" line)
+    (write-line (string-upcase line))))
+```
+
+Using a different reader function and [another
+macro](/blog/2018/05/fun-with-macros-gathering/) to gather data from inside the
+iteration:
+
+```lisp
+(gathering
+  (do-file (n "numbers.txt" :reader #'read-integer)
+    (when (primep n)
+      (gather n))))
+```
+
+Passing along options to the underlying `open`, and returning early:
+
+```lisp
+(do-file (form "foo.lisp" :reader #'read :external-format :EBCDIC-US)
+  (when (eq form :stop)
+    (return :stopped-early))
+  (print form))
+```
+
+All of these could of course be done in other ways. You could have a separate
+function that reads the file into a sequence and then pass that to `mapcar` or
+something else, but it can be wasteful to cons up the entire list if you're only
+going to process items and don't need to retain them (or if you're going to stop
+early).
+
+You could also write a `mapc-file` that takes a function instead of making this
+a macro, but sometimes it's nice to not have to wrap things in a thunk. It's
+probably worth having that function as an additional tool in the toolbox though!
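+
+As a rough sketch of that function-based alternative: the name `mapc-file`
+comes from the paragraph above, but the signature here is an assumption, and
+this simplified version doesn't forward extra options to `open`.
+
+```lisp
+(defun mapc-file (function path &key (reader #'read-line))
+  "Call FUNCTION on each value READER reads from the file at PATH."
+  (let ((eof (gensym "EOF")))          ; unique end-of-file sentinel
+    (with-open-file (stream path :direction :input)
+      (loop :for value = (funcall reader stream nil eof)
+            :until (eq value eof)
+            :do (funcall function value))))
+  nil)
+
+;; Usage: the body now has to be wrapped in a lambda.
+;; (mapc-file (lambda (line) (write-line (string-upcase line))) "foo.txt")
+```
+
+Using `loop` is safe here because the caller's code arrives as a closure, so
+it can't accidentally capture anything from inside this function.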
+
+## Implementation
+
+Here's the full implementation of the macro:
+
+```lisp
+(let ((eof (gensym "EOF")))
+  (defmacro do-file ((symbol path
+                      &rest open-options
+                      &key (reader '#'read-line) &allow-other-keys)
+                     &body body)
+    "Iterate over the contents of `path` using `reader`.
+
+    During iteration, `symbol` will be set to successive values read from the
+    file by `reader`.
+
+    `reader` can be any function that conforms to the usual reading interface,
+    i.e. anything that can handle `(read-foo stream eof-error-p eof-value)`.
+
+    Any keyword arguments other than `:reader` will be passed along to `open`.
+
+    If `nil` is used for one of the `:if-…` options to `open` and this results
+    in `open` returning `nil`, no iteration will take place.
+
+    An implicit block named `nil` surrounds the iteration, so `return` can be
+    used to terminate early.
+
+    Returns `nil`.
+
+    Examples:
+
+      (do-file (line \"foo.txt\")
+        (print line))
+
+      (do-file (form \"foo.lisp\" :reader #'read :external-format :EBCDIC-US)
+        (when (eq form :stop)
+          (return :stopped-early))
+        (print form))
+
+      (do-file (line \"does-not-exist.txt\" :if-does-not-exist nil)
+        (this-will-not-be-executed))
+
+    "
+    (let ((open-options (alexandria:remove-from-plist open-options :reader)))
+      (alexandria:with-gensyms (stream)
+        (alexandria:once-only (path reader)
+          `(when-let ((,stream (open ,path :direction :input ,@open-options)))
+             (unwind-protect
+                 (do ((,symbol
+                       (funcall ,reader ,stream nil ',eof)
+                       (funcall ,reader ,stream nil ',eof)))
+                     ((eq ,symbol ',eof))
+                   ,@body)
+               (close ,stream))))))))
+```
+
+There are a few interesting things to talk about here.
+
+### Let Over Defmacro
+
+The very first line is unusual: instead of the `defmacro` being the top level
+form, we wrap it in a `let` to generate one single unique EOF sentinel object:
+
+```lisp
+(let ((eof (gensym "EOF")))
+  (defmacro do-file (…)
+    …))
+```
+
+We could put the `let` inside the macro, but then we'd be generating a separate
+EOF object for every use of the macro, which is wasteful.
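+
+To see why the sentinel needs to be a fresh object rather than something like
+`nil` or `:eof`, here's a small standalone sketch. It isn't part of the macro,
+and `read-all-forms` is a hypothetical helper name. It reads forms from a
+string in which both of those values are legitimate data:
+
+```lisp
+(defun read-all-forms (string)
+  "Collect every form READ can parse from STRING."
+  (let ((eof (gensym "EOF")))          ; cannot collide with anything READ returns
+    (with-input-from-string (stream string)
+      (loop :for form = (read stream nil eof)
+            :until (eq form eof)
+            :collect form))))
+
+(read-all-forms "nil :eof 42") ; => (NIL :EOF 42)
+```
+
+Had we used `nil` or `:eof` as the end marker, iteration would have stopped at
+the first or second form.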
+
+### &rest and &key
+
+Note how the macro's argument list takes both `&rest` and `&key` arguments, and
+uses `&allow-other-keys` so the macro can accept arbitrary keyword arguments:
+
+```lisp
+(defmacro do-file ((symbol path
+                    &rest open-options
+                    &key (reader '#'read-line) &allow-other-keys)
+                   &body body)
+  (let ((open-options (alexandria:remove-from-plist open-options :reader)))
+    …
+    `(when-let ((,stream (open ,path :direction :input ,@open-options)))
+       …)))
+```
+
+We pass along any keyword arguments we get (aside from the special `:reader`
+argument for this macro) to `open`. Using `&allow-other-keys` means we don't
+need to hardcode all the possible options to `open`, and also allows for
+additional implementation-specific options to be passed to `open` if the user
+wants.
+
+We could have omitted the keyword arguments entirely, taken the arguments as
+a raw `&rest`, and pulled out `:reader` ourselves with `getf`. But doing it
+this way means we don't have to fiddle around doing that, and we can also
+provide slightly nicer documentation in an editor when it shows the macro's
+argument list in the status bar. We'll also get a nicer error if we
+accidentally pass an odd number of keyword arguments.
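+
+For comparison, the fiddling that the raw `&rest` approach would require might
+look roughly like this. `split-reader-option` is a hypothetical helper, not
+something the macro actually uses:
+
+```lisp
+(defun split-reader-option (options)
+  "Return the :READER form and the remaining OPTIONS as two values."
+  (values (getf options :reader '#'read-line)
+          ;; Rebuild the plist without the :READER entry.
+          (loop :for (key value) :on options :by #'cddr
+                :unless (eq key :reader)
+                  :append (list key value))))
+
+(split-reader-option '(:reader #'read :external-format :utf-8))
+;; => the list (FUNCTION READ) and the plist (:EXTERNAL-FORMAT :UTF-8)
+```
+
+It works, but it's exactly the boilerplate that `&key` plus
+`remove-from-plist` lets us skip.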
+
+One more thing before we move on: note the extra level of quoting for the
+`(reader '#'read-line)` default value. It's important to remember that this is
+a *macro*, and so when someone writes `(do-file (… :reader #'foo) …)` the macro
+isn't getting the *function* `foo` because it's not evaluated yet, it's getting
+the *list* `(function foo)`. But the default value is *evaluated* when the
+argument is missing, so we need the extra layer of quoting to make sure the
+result makes sense and matches what we'd be getting normally.
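+
+A tiny throwaway macro makes this visible. `show-reader` is a made-up name
+used only for this illustration. It expands into a quoted copy of whatever
+form it received for `:reader`:
+
+```lisp
+(defmacro show-reader (&key (reader '#'read-line))
+  "Expand into a quoted copy of the form received as READER."
+  `',reader)
+
+(show-reader :reader #'read) ; => the list (FUNCTION READ)
+(show-reader)                ; => the list (FUNCTION READ-LINE)
+```
+
+Both calls produce a list, so the rest of the macro can treat the two cases
+identically. Your implementation's printer may display these lists using the
+`#'` shorthand.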
+
+### Macros Using Macros
+
+We use `with-gensyms` and `once-only` from Alexandria to maintain good hygiene
+in the macro. We also use [`when-let`](/blog/2018/07/fun-with-macros-if-let/)
+to avoid some more boilerplate:
+
+```lisp
+(defmacro do-file (…)
+  (alexandria:with-gensyms (stream)
+    (alexandria:once-only (path reader)
+      `(when-let ((,stream (open ,path :direction :input ,@open-options)))
+         (unwind-protect
+             (do …)
+           (close ,stream))))))
+```
+
+### Don't Loop
+
+Finally we get to the meat of the macro:
+
+```lisp
+(do ((,symbol
+      (funcall ,reader ,stream nil ',eof)
+      (funcall ,reader ,stream nil ',eof)))
+    ((eq ,symbol ',eof))
+  ,@body)
+```
+
+Unfortunately we need to use the tedious `do` instead of `loop` here to avoid an
+annoying bug: if we expanded into a `loop` call, and the user is calling this
+from their *own* loop, and they use `(loop-finish)` in the body code, then it
+would finish *our* loop instead of *their* loop, which would be very confusing.
+
+Imagine the user wrote this very contrived example:
+
+```lisp
+(defun find-the-cat (&rest paths)
+  (loop
+    :with result = nil
+    :for (path . remaining) :on paths
+    :for i :from 1
+    :do (do-file (line path)
+          (when (string= line "meow")
+            (setf result path)
+            (loop-finish))) ;; This should obviously go to the finally below.
+    :finally
+    (when result
+      (format t "Found cat after searching ~D files (did not search ~D other~:P)."
+              i (length remaining))
+      (return result))))
+```
+
+If `do-file` expanded into a `loop` form, then the `(loop-finish)` would only
+terminate *that* loop.
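+
+Here's a file-free sketch of the same problem that you can paste into a REPL.
+All of the names are hypothetical: `do-items-with-loop` stands in for a
+`do-file` that expands into `loop`, and the standard `dolist` stands in for
+the `do`-based version.
+
+```lisp
+;; A hypothetical iteration macro that expands into LOOP.
+(defmacro do-items-with-loop ((var list) &body body)
+  `(loop :for ,var :in ,list :do (progn ,@body)))
+
+(defun first-row-with-zero/loop (rows)
+  (loop :with found = nil
+        :for row :in rows
+        :do (do-items-with-loop (x row)
+              (when (zerop x)
+                (setf found row)
+                (loop-finish)))  ; only finishes the hidden inner LOOP
+        :finally (return found)))
+
+(defun first-row-with-zero/dolist (rows)
+  (loop :with found = nil
+        :for row :in rows
+        :do (dolist (x row)
+              (when (zerop x)
+                (setf found row)
+                (loop-finish)))  ; finishes the LOOP the user actually wrote
+        :finally (return found)))
+
+(first-row-with-zero/loop   '((1 2) (0 3) (4 0))) ; => (4 0)
+(first-row-with-zero/dolist '((1 2) (0 3) (4 0))) ; => (0 3)
+```
+
+The first version keeps searching after the first match, so it silently
+returns the *last* matching row instead of the first.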
+
+A similar issue applies to the implicit block named `nil` around `do`: a
+`return` in the body exits our iteration rather than any enclosing block.
+But this is much less surprising for a macro named `do-…`, and we've documented
+it in the docstring, so that's probably okay.
+
+### Repetition Allergies
+
+Using `do` here is a little annoying because the init form and the step form are
+exactly the same. If you're allergic to repeating yourself you could use `#n=`
+and `#n#` reader macros to get around it:
+
+```lisp
+(do ((,symbol #1=(funcall ,reader ,stream nil ',eof) #1#))
+    ((eq ,symbol ',eof))
+  ,@body)
+```
+
+I find this more confusing than helpful, but to each their own.
+
+## Result
+
+We've got a nice little macro for easily iterating over files piece by piece.
+It can take any reader function that conforms to the usual `(read-foo stream
+eof-error-p eof-value)` interface, which means we can write our own reader
+functions that will compose nicely with the macro.
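+
+As one example, here's a minimal sketch of a custom reader. The name
+`read-integer` appears in an earlier example but isn't defined in this post,
+so this definition (one integer per line) is an assumption:
+
+```lisp
+(defun read-integer (stream &optional (eof-error-p t) eof-value)
+  "Read one line from STREAM and parse it as an integer."
+  (let ((line (read-line stream eof-error-p eof-value)))
+    (if (eq line eof-value)
+        eof-value                 ; pass the caller's sentinel through untouched
+        (parse-integer line))))
+
+(with-input-from-string (stream (format nil "1~%2~%"))
+  (list (read-integer stream nil :eof)
+        (read-integer stream nil :eof)
+        (read-integer stream nil :eof)))
+;; => (1 2 :EOF)
+```
+
+The important part is handing the `eof-value` back unchanged, since that's
+what `do-file` compares against to decide when to stop.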