a9d8b7a86226

Finish post
[view raw] [browse files]
author Steve Losh <steve@stevelosh.com>
date Tue, 16 Mar 2021 23:15:35 -0400
parents c635f15b37c1
children fc26c1c090dd
branches/tags (none)
files content/blog/2021/01/small-common-lisp-cli-programs.markdown content/blog/2021/03/small-common-lisp-cli-programs.markdown static/images/blog/2021/03/batchcolored.png static/images/blog/2021/03/batchcoloredfull.png static/images/blog/2021/03/grepcolored.png static/images/blog/2021/03/uncolored.png

Changes

--- a/content/blog/2021/01/small-common-lisp-cli-programs.markdown	Sun Dec 27 22:08:26 2020 -0500
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,854 +0,0 @@
-(:title "Writing Small CLI Programs in Common Lisp"
- :snip "Somewhere between tiny shell scripts and full projects."
- :date "2021-01-02T15:50:00Z"
- :draft t)
-
-I've found Common Lisp to be a good language for this.  But it can be a little
-intimidating to get started, especially for beginners, because Common Lisp is
-a very flexible language and doesn't lock you into one way of working.  In this
-post I'll describe how I write small, stand-alone command line programs in
-Common Lisp.
-
-<div id="toc"></div>
-
-## Requirements
-
-When you're writing programs in Common Lisp, you've got a lot of options.  For
-this use case, laying out the requirements I have helped me decide on an
-approach.
-
-First, each new program should be one single file.  A few ancillary files for
-all scripts together (e.g. a `Makefile`) are okay, but adding a new program
-should mean adding one single file.  For larger programs a full project
-directory and ASDF system are great, but for small programs having one file per
-program reduces the overhead quite a bit.
-
-The programs need to be able to be developed in the typical Common Lisp
-interactive style (in my case: with Swank and VLIME).  The interactive
-development is one of the best parts of working in Common Lisp, and I won't give
-it up.  In particular, this means that a shell-script style approach, with
-`#!/path/to/sbcl --script` and the top and directly running code at the top
-level in the file does not work for two main reasons:
-
-* Loading that file will fail due to the shebang unless you have some ugly
-  reader macros in your startup file.
-* The program will need to do things like parsing command-line arguments and
-  exiting with an error code, and running `exit` would kill the Swank process.
-
-The programs need to be able to use libraries, so Quicklisp will need to be
-involved.  Common Lisp has a lot of nice things built-in, but there's some
-libraries that are just too useful to pass up.
-
-The programs will need to have proper user interfaces.  Command line arguments
-must be robustly parsed (e.g. collapsing `-a -b -c foo -d` into `-abc foo -d`
-should work as expected), malformed or unknown options must be caught instead of
-dropping them on the floor, error messages should be meaningful, and the
-`--help` should be thoroughly and thoughtfully written.
-
-Relying on some basic conventions (e.g. a command `foo` is always in `foo.lisp`
-and defines a package `foo` with a function called `toplevel`) is okay if it
-makes my life easier.  These programs are just for me, so I don't have to worry
-about people wanting to create executables with spaces in the name or something.
-
-## Solution Skeleton
-
-After trying a number of different approaches, I've settled on a solution that
-I'm pretty happy with.  First I'll describe the general approach, then we'll
-look at one actual example program in its entirety.
-
-### Directory Structure
-
-I keep all my small single-file Common Lisp programs in a `lisp` directory
-inside my dotfiles repository.  Its contents look like this:
-
-```
-…/lisp/
-    bin/
-        foo
-        bar
-    man/
-        man1/
-            foo.1
-            bar.1
-    build-binary.sh
-    build-manual.sh
-    Makefile
-    foo.lisp
-    bar.lisp
-```
-
-The `bin` directory is where the executable files end up.  I've added it to my
-`$PATH` so I don't have to symlink or copy the binaries anywhere.
-
-`man` contains the generated `man` pages.  Because it's adjacent to `bin` (which
-is on my path) the `man` program automatically finds the `man` pages as
-expected.
-
-`build-binary.sh`, `build-manual.sh`, and `Makefile` are some glue to make
-building programs easier.
-
-The `.lisp` files are, of course, the programs.  Each new program I want to add
-only requires adding the `<programname>.lisp` file in this directory and running
-`make`.
-
-### Lisp Files
-
-All my small Common Lisp programs follow a few conventions, which makes building
-them easier.  Let's look at the skeleton of a `foo.lisp` file as an example.
-I'll show the entire file here, and then step through it piece by piece.
-
-```lisp
-(eval-when (:compile-toplevel :load-toplevel :execute)
-  (ql:quickload '(… :with-user-abort) :silent t))
-
-(defpackage :foo
-  (:use :cl)
-  (:export :toplevel …))
-
-(in-package :foo)
-
-;;;; Configuration -----------------------------------------------
-(defparameter *version* "1.0.0")
-(defparameter *some-option* nil)
-
-;;;; Errors ------------------------------------------------------
-(define-condition user-error (error) ())
-
-(define-condition missing-foo (user-error) ()
-  (:report "A foo is required, but none was supplied."))
-
-;;;; Functionality -----------------------------------------------
-(defun foo (string)
-  …)
-
-;;;; Run ---------------------------------------------------------
-(defun run (arguments)
-  (map nil #'foo arguments))
-
-;;;; User Interface ----------------------------------------------
-(defmacro exit-on-ctrl-c (&body body)
-  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
-     (with-user-abort:user-abort () (sb-ext:exit :code 130))))
-
-(defparameter *ui*
-  (adopt:make-interface
-    :name "foo"
-    …))
-
-(defun toplevel ()
-  (sb-ext:disable-debugger)
-  (exit-on-ctrl-c
-    (multiple-value-bind (arguments options)
-        (adopt:parse-options-or-exit *ui*)
-      … ; Handle options.
-      (handler-case (run arguments)
-        (user-error (e)
-          (format *error-output* "error: ~A~%" e)
-          (adopt:exit 1))))))
-```
-
-Let's go through each chunk of this.
-
-```lisp
-(eval-when (:compile-toplevel :load-toplevel :execute)
-  (ql:quickload '(:with-user-abort …) :silent t))
-```
-
-First we `quickload` any necessary libraries.  We always want to do this, even
-when compiling the file, because we need the appropriate packages to be loaded
-when we try to use their symbols later in the file.
-
-[`with-user-abort`](https://github.com/compufox/with-user-abort) is a library
-for portably handling `control-c`, which all of these small programs use.
-
-```lisp
-(defpackage :foo
-  (:use :cl)
-  (:export :toplevel *ui*))
-
-(in-package :foo)
-```
-
-Next we define a package `foo` and switch to it.  The package is always named
-the same as the resulting binary and the basename of the file.  The package
-always exports the symbols `toplevel` and `*ui*`.
-
-```lisp
-;;;; Configuration -----------------------------------------------
-(defparameter *version* "1.0.0")
-(defparameter *some-option* nil)
-```
-
-Next we define any configuration variables.  These will be set later after
-parsing the command line arguments (when we run the command line program) or
-at the REPL (when developing interactively).
-
-```lisp
-;;;; Errors ------------------------------------------------------
-(define-condition user-error (error) ())
-
-(define-condition missing-foo (user-error) ()
-  (:report "A foo is required, but none was supplied."))
-```
-
-We define a `user-error` condition, and any errors the user might make will
-inherit from it.  This will make it easy to treat user errors (e.g. passing
-a mangled regular expression like `(foo+` as an argument) differently from
-programming errors (i.e. bugs).  This makes it easier to treat those errors
-differently:
-
-* Bugs should print a backtrace or enter the debugger.
-* Expected user errors should print a helpful error message with no backtrace or debugger.
-
-```lisp
-;;;; Functionality -----------------------------------------------
-(defun foo (string)
-  …)
-```
-
-Next we have the actual functionality of the program.
-
-```lisp
-;;;; Run ---------------------------------------------------------
-(defun run (arguments)
-  (map nil #'foo arguments))
-```
-
-We now define a function `run` that takes some arguments (as strings) and
-performs the main work of the program.
-
-Importantly, `run` does **not** handle command line argument parsing, and it does
-**not** exit the program with an error code, which means we can safely call it
-to "run the program" when we're developing interactively without worrying about
-it killing our Lisp process.
-
-Finally, we need to define the command line interface.
-
-```lisp
-;;;; User Interface ----------------------------------------------
-(defmacro exit-on-ctrl-c (&body body)
-  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
-     (with-user-abort:user-abort () (sb-ext:exit :code 130))))
-```
-
-We'll make a little macro around `with-user-abort` to make it less wordy.  We'll
-[exit with a status of 130](https://tldp.org/LDP/abs/html/exitcodes.html) if the
-user presses `ctrl-c`.
-
-```lisp
-(defparameter *ui*
-  (adopt:make-interface
-    :name "foo"
-    …))
-```
-
-Here we define the `*ui*` variable whose symbol we exported above.  [Adopt][] is
-a command line argument parsing library I wrote.  If you want to use a different
-library, feel free.
-
-[Adopt]: https://docs.stevelosh.com/adopt
-
-```lisp
-(defun toplevel ()
-  (sb-ext:disable-debugger)
-  (exit-on-ctrl-c
-    (multiple-value-bind (arguments options)
-        (adopt:parse-options-or-exit *ui*)
-      … ; Handle options.
-      (handler-case (run arguments)
-        (user-error (e)
-          (format *error-output* "error: ~A~%" e)
-          (adopt:exit 1))))))
-```
-
-And finally we define the `toplevel` function.  This will only ever be called
-when the program is run as a standalone program, never interactively.  It
-handles all the work beyond the main guts of the program that are handled by the
-`run` function:
-
-* Disabling or enabling the debugger.
-* Exiting the process with the appropriate status code on errors.
-* Parsing command line arguments.
-
-That's it for the structure of the `.lisp` files.
-
-### Building Binaries
-
-`build-binary.sh` is a small script to build the executable binaries from the
-`.lisp` files.  `./build-binary.sh foo.lisp` will build `foo`:
-
-```bash
-#!/usr/bin/env bash
-
-set -euo pipefail
-
-LISP=$1
-NAME=$(basename "$1" .lisp)
-shift
-
-sbcl --load "$LISP" \
-     --eval "(sb-ext:save-lisp-and-die \"$NAME\"
-               :executable t
-               :save-runtime-options t
-               :toplevel '$NAME:toplevel)"
-```
-
-Here we see where the naming conventions have become important — we know that
-the package is named the same as the binary and that it will have the symbol
-`toplevel` exported, which names the entry point for the binary.
-
-### Building Man Pages
-
-`build-manual.sh` is similar and builds the `man` pages using [Adopt][]'s
-built-in `man` page generation.  If you don't care about building `man` pages
-for your personal programs (I admit, it's a little bit silly) you can ignore
-this.
-
-```bash
-#!/usr/bin/env bash
-
-set -euo pipefail
-
-LISP=$1
-NAME=$(basename "$LISP" .lisp)
-OUT="$NAME.1"
-shift
-
-sbcl --load "$LISP" \
-     --eval "(with-open-file (f \"$OUT\" :direction :output :if-exists :supersede)
-               (adopt:print-manual $NAME:*ui* :stream f))" \
-     --quit
-```
-
-This is why we always name the Adopt interface variable `*ui*` and export it
-from the package.
-
-### Makefile
-
-Finally we have a simple `Makefile` so we can run `make` to regenerate any
-out of date binaries and `man` pages:
-
-```make
-files := $(wildcard *.lisp)
-names := $(files:.lisp=)
-
-.PHONY: all clean $(names)
-
-all: $(names)
-
-$(names): %: bin/% man/man1/%.1
-
-bin/%: %.lisp build-binary.sh Makefile
-	mkdir -p bin
-	./build-binary.sh $<
-	mv $(@F) bin/
-
-man/man1/%.1: %.lisp build-manual.sh Makefile
-	mkdir -p man/man1
-	./build-manual.sh $<
-	mv $(@F) man/man1/
-
-clean:
-	rm -rf bin man
-```
-
-We use a `wildcard` to automatically find the `.lisp` files so we don't have to
-do anything other than adding a new file when we want to make a new program.
-
-The most notable line here is `$(names): %: bin/% man/man1/%.1` which uses
-a [static pattern rule](https://www.gnu.org/software/make/manual/html_node/Static-Pattern.html#Static-Pattern)
-to automatically define the phony rules for building each program.  If
-`$(names)` is `foo bar` this line effectively defines two phony rules:
-
-```
-foo: bin/foo man/man1/foo.1
-bar: bin/bar man/man1/bar.1
-```
-
-This lets us run `make foo` to make the binary and `man` page for `foo.lisp`.
-
-## Case Study: A Batch Coloring Utility
-
-Now that we've seen the skeleton, let's look at one of my actual programs that
-I use all the time.  It's called `batchcolor` and it's used to highlight regular
-expression matches in text (usually log files in my case) with a twist: each
-unique match is highlighted in a separate color, which makes it easier to
-visually parse the result.
-
-For example, suppose we have some log files with lines of the form `<timestamp>
-[<request ID>] <level> <message>` where request ID is a UUID, and messages might
-contain other UUIDs for various things.  Such a log file might look something
-like this:
-
-```
-2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8b7543/
-2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543 is not cached, retrieving...
-2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] WARN User 28b2d548-eff1-471c-b807-cc2bcee76b7d does not have access to thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543, denying request.
-2021-01-02 14:01:46 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Returning HTTP 404.
-2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8d7543/
-2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8d7543 is not cached, retrieving...
-2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] INFO Incoming request POST /users/sign-up/
-2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Returning HTTP 200.
-2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Error running SQL query: connection refused.
-2021-01-02 14:01:47 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Returning HTTP 500.
-```
-
-If I try to just read this directly, it's easy for my eyes to glaze over unless
-I laboriously read line-by-line.  I can use `grep` to highlight the UUIDs, but
-that honestly doesn't help too much:
-
-    grep -P '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|$' foo.log
-
-`batchcolor` also highlights matches, but highlights each unique match in its
-own color:
-
-This is *much* easier for me to visually parse.  The interleaving of separate
-request logs is now obvious from the colors of the IDs, and it's easy to match
-up various user IDs and thing IDs at a glance.  Did you even notice that the two
-thing IDs were different before?
-
-`batchcolor` has a few other simple quality of life features, like picking
-explicit colors for specific strings (e.g. red for `ERR`):
-
-I wrap up this `batchcolor` invocation in an alias and use it to `tail` log
-files when developing locally almost every day, and it makes reading the log
-output *much* easier.
-
-Let's step through its code piece by piece.
-
-### Libraries
-
-```lisp
-(eval-when (:compile-toplevel :load-toplevel :execute)
-  (ql:quickload '(:adopt :cl-ppcre :with-user-abort) :silent t))
-```
-
-First we `quickload` libraries.  We'll use [Adopt][] for command line argument
-processing, [cl-ppcre][] for regular expressions, and the previously-mentioned
-[with-user-abort][] to handle `control-c`.
-
-### Package
-
-```lisp
-(defpackage :batchcolor
-  (:use :cl)
-  (:export :toplevel :*ui*))
-
-(in-package :batchcolor)
-```
-
-We define and switch to the appropriately-named package.  Nothing special here.
-
-### Configuration
-
-```lisp
-;;;; Configuration ------------------------------------------------------------
-(defparameter *version* "1.0.0")
-(defparameter *start* 0)
-(defparameter *dark* t)
-```
-
-Next we `defparameter` some variables to hold useful values (like the version)
-and settings.  `*start*` will be used later when randomizing colors, don't worry
-about it for now.
-
-### Errors
-
-```lisp
-;;;; Errors -------------------------------------------------------------------
-(define-condition user-error (error) ())
-
-(define-condition missing-regex (user-error) ()
-  (:report "A regular expression is required."))
-
-(define-condition malformed-regex (user-error)
-  ((underlying-error :initarg :underlying-error))
-  (:report (lambda (c s)
-             (format s "Invalid regex: ~A" (slot-value c 'underlying-error)))))
-
-(define-condition overlapping-groups (user-error) ()
-  (:report "Invalid regex: seems to contain overlapping capturing groups."))
-
-(define-condition malformed-explicit (user-error)
-  ((spec :initarg :spec))
-  (:report (lambda (c s)
-             (format s "Invalid explicit spec ~S, must be of the form \"R,G,B:string\" with colors being 0-5."
-                     (slot-value c 'spec)))))
-```
-
-Here we define the user errors.  Some of these are self-explanatory, while
-others will make more sense later once we see them in action.  The specific
-details aren't as important as the overall idea: for user errors we know might
-happen, display a helpful error message instead of just spewing a backtrace at
-the user.
-
-### Colorization
-
-Next we have the actual meat of the program.  Obviously this is going to be
-completely different for every program, so feel free to skip this if you don't
-care about this specific problem.
-
-```lisp
-;;;; Functionality ------------------------------------------------------------
-(defun rgb-code (r g b)
-  ;; The 256 color mode color values are essentially r/g/b in base 6, but
-  ;; shifted 16 higher to account for the intiial 8+8 colors.
-  (+ (* r 36)
-     (* g 6)
-     (* b 1)
-     16))
-```
-
-We're going to highlight different matches with different colors.  We'll need
-a reasonable amount of colors to make this useful, so using the basic 8/16 ANSI
-colors isn't enough.  Full 24-bit truecolor is overkill, but the 8-bit ANSI
-colors will work nicely.  If we ignore the base colors, we essentially have
-6 x 6 x 6 = 216 colors to work with.  `rgb-code` will take the red, green, and
-blue values from `0` to `5` and return the color code.  See [Wikipedia][8bit]
-for more information.
-
-[8bit]: https://en.wikipedia.org/wiki/ANSI_escape_code#8-bit
-
-```lisp
-(defun make-colors (excludep)
-  (let ((result (make-array 256 :fill-pointer 0)))
-    (dotimes (r 6)
-      (dotimes (g 6)
-        (dotimes (b 6)
-          (unless (funcall excludep (+ r g b))
-            (vector-push-extend (rgb-code r g b) result)))))
-    result))
-
-(defparameter *dark-colors*  (make-colors (lambda (v) (< v 3))))
-(defparameter *light-colors* (make-colors (lambda (v) (> v 11))))
-```
-
-Now we can build some arrays of colors.  We *could* use any of the 216 available
-colors, but in practice we probably don't want to, because the darkest colors
-will be too dark to read on a dark terminal, and vice versa for light terminals.
-In a concession to practicality we'll generate two separate arrays of colors,
-one that excludes colors whose total value is too dark and one excluding those
-that are too light.
-
-(You might notice that `*dark-colors*` is "the array of colors for dark
-terminals" and not "the array of colors which are not light".  Naming things is
-hard.)
-
-Note that these arrays will be generated when the `batchcolor.lisp` file is
-`load`ed, which is *when we build the binary*.  They *won't* be recomputed every
-time you run the resulting binary.  In this case it doesn't really matter (the
-arrays are small) but it's worth remembering in case you ever have some data you
-want (or don't want) to compute at build time instead of run time.
-
-```lisp
-(defparameter *explicits* (make-hash-table :test #'equal))
-```
-
-Here we make a hash table to store the strings and colors for strings we want to
-explicitly color (e.g. `ERR` should be red, `INFO` cyan).  The keys will be the
-strings and values the RGB codes.
-
-```lisp
-(defun djb2 (string)
-  ;; http://www.cse.yorku.ca/~oz/hash.html
-  (reduce (lambda (hash c)
-            (mod (+ (* 33 hash) c) (expt 2 64)))
-          string
-          :initial-value 5381
-          :key #'char-code))
-
-(defun find-color (string)
-  (gethash string *explicits*
-           (let ((colors (if *dark* *dark-colors* *light-colors*)))
-             (aref colors
-                   (mod (+ (djb2 string) *start*)
-                        (length colors))))))
-```
-
-For strings that we want to explicitly color, we just look up the appropriate
-code in `*explicits*` and return it.
-
-Otherwise, we want to highlight unique matches in different colors.  There are
-a number of different ways we could do this.  For example, we could randomly
-pick a color the first time we see a string and store it in a hash table for
-subsequent encounters.  But this would mean we'd grow over time, and one of the
-things I often use this utility for is `tail -f`ing long-running processes when
-developing locally, so the memory usage would grow and grow over time until the
-`batchcolor` process was restarted, which isn't ideal.
-
-Instead, we'll hash each string with a simple [DJB hash][djb] and use it to
-index into the appropriate array of colors.  This ensures that identical matches
-get identical colors, and avoids having to store every match we've ever seen.
-
-We'll talk about `*start*` later, ignore it for now (it's `0` by default).
-
-[djb]: http://www.cse.yorku.ca/~oz/hash.html
-
-```lisp
-(defun ansi-color-start (color)
-  (format nil "~C[38;5;~Dm" #\Escape color))
-
-(defun ansi-color-end ()
-  (format nil "~C[0m" #\Escape))
-
-(defun print-colorized (string)
-  (format *standard-output* "~A~A~A"
-          (ansi-color-start (find-color string))
-          string
-          (ansi-color-end)))
-```
-
-Next we have some functions to output the appropriate ANSI escapes to highlight
-our matches.  We could use a library for this but it's only 2 lines.  [It's not
-worth it](http://xn--rpa.cc/irl/term.html).
-
-And now we have the beating heart of the program:
-
-```lisp
-(defun colorize-line (scanner line &aux (start 0))
-  (ppcre:do-scans (ms me rs re scanner line)
-    ;; If we don't have any register groups, colorize the entire match.
-    ;; Otherwise, colorize each matched capturing group.
-    (let* ((regs? (plusp (length rs)))
-           (starts (if regs? (remove nil rs) (list ms)))
-           (ends   (if regs? (remove nil re) (list me))))
-      (map nil (lambda (word-start word-end)
-                 (unless (<= start word-start)
-                   (error 'overlapping-groups))
-                 (write-string line *standard-output* :start start :end word-start)
-                 (print-colorized (subseq line word-start word-end))
-                 (setf start word-end))
-           starts ends)))
-  (write-line line *standard-output* :start start))
-```
-
-`colorize-line` takes a CL-PPCRE scanner and a line, and outputs the line with
-any of the desired matches colorized appropriately.  There are a few things to
-note here.
-
-First: if the regular expression contains any capturing groups, we will only
-colorize those parts of the match.  For example, if you run `batchcolor
-'^<(\\w+)> '` to colorize the nicks in an IRC log, only the nicknames themselves
-will be highlighted, not the surrounding angle brackets.  Otherwise, if there
-are no capturing groups in the regular expression, we'll highlight the entire
-match (as if there were one big capturing group around the whole thing).
-
-Second: overlapping capturing groups are explicitly disallowed and
-a `user-error` signaled if we notice any.  It's not clear what do to in this
-case — if we match `((f)oo|(b)oo)` against `foo`, what should the output be?
-Highlight `f` and `oo` in the same color?  In different colors?  Should the `oo`
-be a different color than the `oo` in `boo`?  There's too many options with no
-clear winner, so we'll just tell the user to be more clear.
-
-To do the actual work, we iterate over each match and print the non-highlighted
-text before the match, then print the highlighted match.  Finally we print any
-remaining text after the last match.
-
-### Not-Quite-Top-Level Interface
-
-```lisp
-;;;; Run ----------------------------------------------------------------------
-(defun run% (scanner stream)
-  (loop :for line = (read-line stream nil)
-        :while line
-        :do (colorize-line scanner line)))
-
-(defun run (pattern paths)
-  (let ((scanner (handler-case (ppcre:create-scanner pattern)
-                   (ppcre:ppcre-syntax-error (c)
-                     (error 'malformed-regex :underlying-error c))))
-        (paths (or paths '("-"))))
-    (dolist (path paths)
-      (if (string= "-" path)
-        (run% scanner *standard-input*)
-        (with-open-file (stream path :direction :input)
-          (run% scanner stream))))))
-```
-
-Here we have the not-quite-top-level interface to the program.  `run` takes
-a pattern string and a list of paths and runs the colorization on each path.
-This is safe to call interactively from the REPL, e.g. `(run "<(\\w+)>"
-"foo.txt")`, so we can test without worrying about killing the Lisp process.
-
-### User Interface
-
-In the last chunk of the file we have the user interface.  There are a couple of
-things to note here.
-
-I'm using a command line argument parsing library I wrote myself: [Adopt][].
-But if you prefer another library (and there are quite a few around) feel free
-to use it — it should be pretty easy to adapt this setup to a different library.
-The only things you'd need to change would be the `toplevel` function and the
-`build-manual.sh` script (if you even care about building `man` pages at all).
-
-You might also notice that the user interface for the program is almost as much
-code as the entire rest of the program.  This may seem disconcerting at first,
-but I think it makes a certain kind of sense.  When you're writing code to
-interface with an external system, a messier and more complicated external
-system will usually require more code than a cleaner and simpler external
-system.  A human brain is probably the messiest and most complicated external
-system you'll ever have to deal with, so it's worth taking the extra time and
-code to be especially careful when writing an interface to it.
-
-```lisp
-(defparameter *option-help*
-  (adopt:make-option 'help
-    :help "Display help and exit."
-    :long "help"
-    :short #\h
-    :reduce (constantly t)))
-
-(defparameter *option-version*
-  (adopt:make-option 'version
-    :help "Display version information and exit."
-    :long "version"
-    :reduce (constantly t)))
-```
-
-```lisp
-(adopt:defparameters (*option-debug* *option-no-debug*)
-  (adopt:make-boolean-options 'debug
-    :long "debug"
-    :short #\d
-    :help "Enable the Lisp debugger."
-    :help-no "Disable the Lisp debugger (the default)."))
-```
-
-```lisp
-(adopt:defparameters (*option-randomize* *option-no-randomize*)
-  (adopt:make-boolean-options 'randomize
-    :help "Randomize the choice of color each run."
-    :help-no "Do not randomize the choice of color each run (the default)."
-    :long "randomize"
-    :short #\r))
-
-(adopt:defparameters (*option-dark* *option-light*)
-  (adopt:make-boolean-options 'dark
-    :name-no 'light
-    :long "dark"
-    :long-no "light"
-    :help "Optimize for dark terminals (the default)."
-    :help-no "Optimize for light terminals."
-    :initial-value t))
-```
-
-```lisp
-;;;; User Interface -----------------------------------------------------------
-(defun parse-explicit (spec)
-  (ppcre:register-groups-bind
-      ((#'parse-integer r g b) string)
-      ("^([0-5]),([0-5]),([0-5]):(.+)$" spec)
-    (return-from parse-explicit (cons string (rgb-code r g b))))
-  (error 'malformed-explicit :spec spec))
-
-(defparameter *option-explicit*
-  (adopt:make-option 'explicit
-    :parameter "R,G,B:STRING"
-    :help "Highlight STRING in an explicit color.  May be given multiple times."
-    :manual (format nil "~
-      Highlight STRING in an explicit color instead of randomly choosing one.  ~
-      R, G, and B must be 0-5.  STRING is treated as literal string, not a regex.  ~
-      Note that this doesn't automatically add STRING to the overall regex, you ~
-      must do that yourself!  This is a known bug that may be fixed in the future.")
-    :long "explicit"
-    :short #\e
-    :key #'parse-explicit
-    :reduce #'adopt:collect))
-```
-
-```lisp
-(adopt:define-string *help-text*
-  "batchcolor takes a regular expression and matches it against standard ~
-   input one line at a time.  Each unique match is highlighted in its own color.~@
-   ~@
-   If the regular expression contains any capturing groups, only those parts of ~
-   the matches will be highlighted.  Otherwise the entire match will be ~
-   highlighted.  Overlapping capturing groups are not supported.")
-
-(adopt:define-string *extra-manual-text*
-  "If no FILEs are given, standard input will be used.  A file of - stands for ~
-   standard input as well.~@
-   ~@
-   Overlapping capturing groups are not supported because it's not clear what ~
-   the result should be.  For example: what should ((f)oo|(b)oo) highlight when ~
-   matched against 'foo'?  Should it highlight 'foo' in one color?  The 'f' in ~
-   one color and 'oo' in another color?  Should that 'oo' be the same color as ~
-   the 'oo' in 'boo' even though the overall match was different?  There are too ~
-   many possible behaviors and no clear winner, so batchcolor disallows ~
-   overlapping capturing groups entirely.")
-
-(defparameter *examples*
-  '(("Colorize IRC nicknames in a chat log:"
-     . "cat channel.log | batchcolor '<(\\\\w+)>'")
-    ("Colorize UUIDs in a request log:"
-     . "tail -f /var/log/foo | batchcolor '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}'")
-    ("Colorize some keywords explicitly and IPv4 addresses randomly (note that the keywords have to be in the main regex too, not just in the -e options):"
-     . "batchcolor 'WARN|INFO|ERR|(?:[0-9]{1,3}\\\\.){3}[0-9]{1,3}' -e '5,0,0:ERR' -e '5,4,0:WARN' -e '2,2,5:INFO' foo.log")
-    ("Colorize earmuffed symbols in a Lisp file:"
-     . "batchcolor '(?:^|[^*])([*][-a-zA-Z0-9]+[*])(?:$|[^*])' tests/test.lisp")))
-```
-
-```lisp
-(defparameter *ui*
-  (adopt:make-interface
-    :name "batchcolor"
-    :usage "[OPTIONS] REGEX [FILE...]"
-    :summary "colorize regex matches in batches"
-    :help *help-text*
-    :manual (format nil "~A~2%~A" *help-text* *extra-manual-text*)
-    :examples *examples*
-    :contents (list
-                *option-help*
-                *option-version*
-                *option-debug*
-                *option-no-debug*
-                (adopt:make-group 'color-options
-                                  :title "Color Options"
-                                  :options (list *option-randomize*
-                                                 *option-no-randomize*
-                                                 *option-dark*
-                                                 *option-light*
-                                                 *option-explicit*)))))
-```
-
-### Top-Level Interface
-
-```lisp
-(defmacro exit-on-ctrl-c (&body body)
-  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
-     (with-user-abort:user-abort () (adopt:exit 130))))
-
-(defun configure (options)
-  (loop :for (string . rgb) :in (gethash 'explicit options)
-        :do (setf (gethash string *explicits*) rgb))
-  (setf *start* (if (gethash 'randomize options)
-                  (random 256 (make-random-state t))
-                  0)
-        *dark* (gethash 'dark options)))
-
-(defun toplevel ()
-  (sb-ext:disable-debugger)
-  (exit-on-ctrl-c
-    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
-      (when (gethash 'debug options)
-        (sb-ext:enable-debugger))
-      (handler-case
-          (cond
-            ((gethash 'help options) (adopt:print-help-and-exit *ui*))
-            ((gethash 'version options) (write-line *version*) (adopt:exit))
-            ((null arguments) (error 'missing-regex))
-            (t (destructuring-bind (pattern . files) arguments
-                 (configure options)
-                 (run pattern files))))
-        (user-error (e) (adopt:print-error-and-exit e))))))
-```
-
-
-
-## More Information
-
-* ieure link
-* dotfiles repo link
-
-
-
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2021/03/small-common-lisp-cli-programs.markdown	Tue Mar 16 23:15:35 2021 -0400
@@ -0,0 +1,988 @@
+(:title "Writing Small CLI Programs in Common Lisp"
+ :snip "Somewhere between tiny shell scripts and full projects."
+ :date "2021-03-17T16:00:00Z"
+ :draft t)
+
+I write a lot of command-line programs.  For tiny programs I usually go with the
+typical UNIX approach: throw together a half-assed shell script and move on.
+For large programs I make a full Common Lisp project, with an ASDF system
+definition and such.  But there's a middle ground of small*ish* programs that
+don't warrant a full repository on their own, but for which I still want a real
+interface with proper `--help` and error handling.
+
+I've found Common Lisp to be a good language for writing these small command
+line programs.  But it can be a little intimidating to get started (especially
+for beginners) because Common Lisp is a very flexible language and doesn't lock
+you into one way of working.
+
+In this post I'll describe how I write small, stand-alone command line programs
+in Common Lisp.  It might work for you, or you might want to modify things to
+fit your own needs.
+
+<div id="toc"></div>
+
+## Requirements
+
+When you're writing programs in Common Lisp, you've got a lot of options.
+Laying out the requirements I have helped me decide on an approach.
+
+First: each new program should be one single file.  A few other files for the
+collection as a whole (e.g. a `Makefile`) are okay, but once everything is set
+up creating a new program should mean adding one single file.  For larger
+programs a full project directory and ASDF system are great, but for small
+programs having one file per program reduces the mental overhead quite a bit.
+
+The programs need to be able to be developed in the typical Common Lisp
+interactive style (in my case: with Swank and VLIME).  Interactive development
+is one of the best parts of working in Common Lisp, and I'm not willing to give
+it up.  In particular this means that a shell-script style approach, with
+`#!/path/to/sbcl --script` and the top and directly running code at the top
+level in the file, doesn't work for two main reasons:
+
+* `load`ing that file will fail due to the shebang unless you have some ugly
+  reader macros in your startup file.
+* The program will need to do things like parsing command-line arguments and
+  exiting with an error code, and calling `exit` would kill the Swank process.
+
+The programs need to be able to use libraries, so Quicklisp will need to be
+involved.  Common Lisp has a lot of nice things built-in, but there are some
+libraries that are just too useful to pass up.
+
+The programs will need to have proper user interfaces.  Command line arguments
+must be robustly parsed (e.g. collapsing `-a -b -c foo -d` into `-abcfoo -d`
+should work as expected), malformed or unknown options must be caught instead of
+dropping them on the floor, error messages should be meaningful, and the
+`--help` should be thoroughly and thoughtfully written so I can remember how to
+use the program months later.  A `man` page is a nice bonus, but not required.
+
+Relying on some basic conventions (e.g. a command `foo` is always in `foo.lisp`
+and defines a package `foo` with a function called `toplevel`) is okay if it
+makes my life easier.  These programs are just for me, so I don't have to worry
+about people wanting to create executables with spaces in the name or something.
+
+Portability between Common Lisp implementations is nice to have, but not
+required.  If using a bit of SBCL-specific grease will let me avoid a bunch of
+extra dependencies, that's fine for these small personal programs.
+
+## Solution Skeleton
+
+After trying a number of different approaches I've settled on a solution that
+I'm pretty happy with.  First I'll describe the general approach, then we'll
+look at one actual example program in its entirety.
+
+### Directory Structure
+
+I keep all my small single-file Common Lisp programs in a `lisp` directory
+inside my dotfiles repository.  Its contents look like this:
+
+```
+…/dotfiles/lisp/
+    bin/
+        foo
+        bar
+    man/
+        man1/
+            foo.1
+            bar.1
+    build-binary.sh
+    build-manual.sh
+    Makefile
+    foo.lisp
+    bar.lisp
+```
+
+The `bin` directory is where the executable files end up.  I've added it to my
+`$PATH` so I don't have to symlink or copy the binaries anywhere.
+
+`man` contains the generated `man` pages.  Because it's adjacent to `bin` (which
+is on my path) the `man` program automatically finds the `man` pages as
+expected.
+
+`build-binary.sh`, `build-manual.sh`, and `Makefile` are some glue to make
+building programs easier.
+
+The `.lisp` files are the programs.  Each new program I want to add only
+requires adding the `<programname>.lisp` file in this directory and running
+`make`.
+
+### Lisp Files
+
+My small Common Lisp programs follow a few conventions that make building them
+easier.  Let's look at the skeleton of a `foo.lisp` file as an example.  I'll
+show the entire file here, and then step through it piece by piece.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(:with-user-abort …) :silent t))
+
+(defpackage :foo
+  (:use :cl)
+  (:export :toplevel *ui*))
+
+(in-package :foo)
+
+;;;; Configuration -----------------------------------------------
+(defparameter *whatever* 123)
+
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+  (:report "A foo is required, but none was supplied."))
+
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+  …)
+
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+  (map nil #'foo arguments))
+
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (sb-ext:exit :code 130))))
+
+(defparameter *ui*
+  (adopt:make-interface
+    :name "foo"
+    …))
+
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
+      … ; Handle options.
+      (handler-case (run arguments)
+        (user-error (e) (adopt:print-error-and-exit e))))))
+```
+
+Let's go through each chunk of this.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(:with-user-abort …) :silent t))
+```
+
+First we `quickload` any necessary libraries.  We always want to do this, even
+when compiling the file, because we need the appropriate packages to exist when
+we try to use their symbols later in the file.
+
+[with-user-abort][] is a library for easily handling `control-c`, which all of
+these small programs will use.
+
+[with-user-abort]: https://github.com/compufox/with-user-abort
+
+```lisp
+(defpackage :foo
+  (:use :cl)
+  (:export :toplevel *ui*))
+
+(in-package :foo)
+```
+
+Next we define a package `foo` and switch to it.  The package is always named
+the same as the resulting binary and the basename of the file, and always
+exports the symbols `toplevel` and `*ui*`.  These conventions make it easy to
+build everything automatically with `make` later.
+
+```lisp
+;;;; Configuration -----------------------------------------------
+(defparameter *whatever* 123)
+```
+
+Next we define any configuration variables.  These will be set later after
+parsing the command line arguments (when we run the command line program) or
+at the REPL (when developing interactively).
+
+```lisp
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+  (:report "A foo is required, but none was supplied."))
+```
+
+We define a `user-error` condition, and any errors the user might make will
+inherit from it.  This will make it easy to treat user errors (e.g. passing
+a mangled regular expression like `(foo+` as an argument) differently from
+programming errors (i.e. bugs).  This makes it easier to treat those errors
+differently:
+
+* Bugs should print a backtrace or enter the debugger.
+* Expected user errors should print a helpful error message with no backtrace or debugger.
+
+```lisp
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+  …)
+```
+
+Next we have the actual functionality of the program.
+
+```lisp
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+  (map nil #'foo arguments))
+```
+
+We define a function `run` that takes some arguments (as strings) and performs
+the main work of the program.
+
+Importantly, `run` does *not* handle command line argument parsing, and it does
+*not* exit the program with an error code, which means we can safely call it to
+say "run the whole program" when we're developing interactively without worrying
+about it killing our Lisp process.
+
+Now we need to define the command line interface.
+
+```lisp
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (adopt:exit 130))))
+```
+
+We'll make a little macro around `with-user-abort` to make it less wordy.  We'll
+[exit with a status of 130](https://tldp.org/LDP/abs/html/exitcodes.html) if the
+user presses `ctrl-c`.  Maybe some day I'll pull this into Adopt so I don't have
+to copy these three lines everywhere.
+
+```lisp
+(defparameter *ui*
+  (adopt:make-interface
+    :name "foo"
+    …))
+```
+
+Here we define the `*ui*` variable whose symbol we exported above.  [Adopt][] is
+a command line argument parsing library I wrote.  If you want to use a different
+library, feel free.
+
+[Adopt]: https://docs.stevelosh.com/adopt
+
+```lisp
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
+      … ; Handle options.
+      (handler-case (run arguments)
+        (user-error (e) (adopt:print-error-and-exit e))))))
+```
+
+And finally we define the `toplevel` function.  This will only ever be called
+when the program is run as a standalone program, never interactively.  It
+handles all the work beyond the main guts of the program (which are handled by
+the `run` function), including:
+
+* Disabling or enabling the debugger.
+* Exiting the process with an appropriate status code on errors.
+* Parsing command line arguments.
+* Setting the values of the configuration parameters.
+* Calling `run`.
+
+That's it for the structure of the `.lisp` files.
+
+### Building Binaries
+
+`build-binary.sh` is a small script to build the executable binaries from the
+`.lisp` files.  `./build-binary.sh foo.lisp` will build `foo`:
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$1" .lisp)
+shift
+
+sbcl --load "$LISP" \
+     --eval "(sb-ext:save-lisp-and-die \"$NAME\"
+               :executable t
+               :save-runtime-options t
+               :toplevel '$NAME:toplevel)"
+```
+
+Here we see where the naming conventions have become important — we know that
+the package is named the same as the binary and that it will have the symbol
+`toplevel` exported, which always names the entry point for the binary.
+
+### Building Man Pages
+
+`build-manual.sh` is similar and builds the `man` pages using [Adopt][]'s
+built-in `man` page generation.  If you don't care about building `man` pages
+for your personal programs you can ignore this.  I admit that generating `man`
+pages for these programs is a little bit silly because they're only for my own
+personal use, but I get it for free with Adopt, so why not?
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$LISP" .lisp)
+OUT="$NAME.1"
+shift
+
+sbcl --load "$LISP" \
+     --eval "(with-open-file (f \"$OUT\" :direction :output :if-exists :supersede)
+               (adopt:print-manual $NAME:*ui* :stream f))" \
+     --quit
+```
+
+This is why we always name the Adopt interface variable `*ui*` and export it
+from the package.
+
+### Makefile
+
+Finally we have a simple `Makefile` so we can run `make` to regenerate any
+out of date binaries and `man` pages:
+
+```make
+files := $(wildcard *.lisp)
+names := $(files:.lisp=)
+
+.PHONY: all clean $(names)
+
+all: $(names)
+
+$(names): %: bin/% man/man1/%.1
+
+bin/%: %.lisp build-binary.sh Makefile
+	mkdir -p bin
+	./build-binary.sh $<
+	mv $(@F) bin/
+
+man/man1/%.1: %.lisp build-manual.sh Makefile
+	mkdir -p man/man1
+	./build-manual.sh $<
+	mv $(@F) man/man1/
+
+clean:
+	rm -rf bin man
+```
+
+We use a `wildcard` to automatically find the `.lisp` files so we don't have to
+do anything extra after adding a new file when we want to make a new program.
+
+The most notable line here is `$(names): %: bin/% man/man1/%.1` which uses
+a [static pattern rule](https://www.gnu.org/software/make/manual/html_node/Static-Pattern.html#Static-Pattern)
+to automatically define the phony rules for building each program.  If
+`$(names)` is `foo bar` this line effectively defines two phony rules:
+
+```
+foo: bin/foo man/man1/foo.1
+bar: bin/bar man/man1/bar.1
+```
+
+This lets us run `make foo` to make both the binary and `man` page for
+`foo.lisp`.
+
+## Case Study: A Batch Coloring Utility
+
+Now that we've seen the skeleton, let's look at one of my actual programs that
+I use all the time.  It's called `batchcolor` and it's used to highlight regular
+expression matches in text (usually log files) with a twist: each unique match
+is highlighted in a separate color, which makes it easier to visually parse the
+result.
+
+For example: suppose we have some log files with lines of the form `<timestamp>
+[<request ID>] <level> <message>` where request ID is a UUID, and messages might
+contain other UUIDs for various things.  Such a log file might look something
+like this:
+
+```
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8b7543/
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543 is not cached, retrieving...
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] WARN User 28b2d548-eff1-471c-b807-cc2bcee76b7d does not have access to thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543, denying request.
+2021-01-02 14:01:46 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Returning HTTP 404.
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8d7543/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8d7543 is not cached, retrieving...
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] INFO Incoming request POST /users/sign-up/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Returning HTTP 200.
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Error running SQL query: connection refused.
+2021-01-02 14:01:47 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Returning HTTP 500.
+```
+
+If I try to just read this directly, it's easy for my eyes to glaze over unless
+I laboriously walk line-by-line.
+
+[![Screenshot of uncolored log output](/static/images/blog/2021/03/uncolored.png)](/static/images/blog/2021/03/uncolored.png)
+
+I could use `grep` to highlight the UUIDs:
+
+    grep -P \
+        '\b[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}\b' \
+        example.log
+
+Unfortunately that doesn't really help too much because all the UUIDs are
+highlighted the same color:
+
+[![Screenshot of grep-colored log output](/static/images/blog/2021/03/grepcolored.png)](/static/images/blog/2021/03/grepcolored.png)
+
+To get a more readable version of the log, I use `batchcolor`:
+
+    batchcolor \
+        '\b[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}\b' \
+        example.log
+
+`batchcolor` also highlights matches, but it highlights each unique match in its
+own color:
+
+[![Screenshot of batchcolored log output](/static/images/blog/2021/03/batchcolored.png)](/static/images/blog/2021/03/batchcolored.png)
+
+This is *much* easier for me to visually parse.  The interleaving of separate
+request logs is now obvious from the colors of the IDs, and it's easy to match
+up various user IDs and thing IDs at a glance.  Did you even notice that the two
+thing IDs were different before?
+
+`batchcolor` has a few other quality of life features, like picking explicit
+colors for specific strings (e.g. red for `ERR`):
+
+[![Screenshot of fully batchcolored log output](/static/images/blog/2021/03/batchcoloredfull.png)](/static/images/blog/2021/03/batchcoloredfull.png)
+
+I use this particular `batchcolor` invocation so often I've put it in its own
+tiny shell script.  I use it to `tail` log files when developing locally almost
+every day, and it makes visually scanning the log output *much* easier.  It can
+come in handy for other kinds of text too, like highlighting nicknames in an IRC
+log.
+
+Let's step through its code piece by piece.
+
+### Libraries
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(:adopt :cl-ppcre :with-user-abort) :silent t))
+```
+
+First we `quickload` libraries.  We'll use [Adopt][] for command line argument
+processing, [cl-ppcre][] for regular expressions, and the previously-mentioned
+[with-user-abort][] to handle `control-c`.
+
+[cl-ppcre]: http://edicl.github.io/cl-ppcre/
+
+### Package
+
+```lisp
+(defpackage :batchcolor
+  (:use :cl)
+  (:export :toplevel :*ui*))
+
+(in-package :batchcolor)
+```
+
+We define and switch to the appropriately-named package.  Nothing special here.
+
+### Configuration
+
+```lisp
+;;;; Configuration ------------------------------------------------------------
+(defparameter *start* 0)
+(defparameter *dark* t)
+```
+
+Next we `defparameter` some variables to hold some settings.  `*start*` will be
+used later when randomizing colors, don't worry about it for now.
+
+### Errors
+
+```lisp
+;;;; Errors -------------------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-regex (user-error) ()
+  (:report "A regular expression is required."))
+
+(define-condition malformed-regex (user-error)
+  ((underlying-error :initarg :underlying-error))
+  (:report (lambda (c s)
+             (format s "Invalid regex: ~A" (slot-value c 'underlying-error)))))
+
+(define-condition overlapping-groups (user-error) ()
+  (:report "Invalid regex: seems to contain overlapping capturing groups."))
+
+(define-condition malformed-explicit (user-error)
+  ((spec :initarg :spec))
+  (:report
+    (lambda (c s)
+      (format s "Invalid explicit spec ~S, must be of the form \"R,G,B:string\" with colors being 0-5."
+              (slot-value c 'spec)))))
+```
+
+Here we define the user errors.  Some of these are self-explanatory, while
+others will make more sense later once we see them in action.  The specific
+details aren't as important as the overall idea: for user errors we know might
+happen, display a helpful error message instead of just spewing a backtrace at
+the user.
+
+### Colorization
+
+Next we have the actual meat of the program.  Obviously this is going to be
+completely different for every program, so feel free to skip this if you don't
+care about this specific problem.
+
+```lisp
+;;;; Functionality ------------------------------------------------------------
+(defun rgb-code (r g b)
+  ;; The 256 color mode color values are essentially r/g/b in base 6, but
+  ;; shifted 16 higher to account for the intiial 8+8 colors.
+  (+ (* r 36)
+     (* g 6)
+     (* b 1)
+     16))
+```
+
+We're going to highlight different matches with different colors.  We'll need
+a reasonable amount of colors to make this useful, so using the basic 8/16 ANSI
+colors isn't enough.  Full 24-bit truecolor is overkill, but the 8-bit ANSI
+colors will work nicely.  If we ignore the base colors, we essentially have
+6 x 6 x 6 = 216 colors to work with.  `rgb-code` will take the red, green, and
+blue values from `0` to `5` and return the color code.  See [Wikipedia][8bit]
+for more information.
+
+[8bit]: https://en.wikipedia.org/wiki/ANSI_escape_code#8-bit
+
+```lisp
+(defun make-colors (excludep)
+  (let ((result (make-array 256 :fill-pointer 0)))
+    (dotimes (r 6)
+      (dotimes (g 6)
+        (dotimes (b 6)
+          (unless (funcall excludep (+ r g b))
+            (vector-push-extend (rgb-code r g b) result)))))
+    result))
+
+(defparameter *dark-colors*  (make-colors (lambda (v) (< v 3))))
+(defparameter *light-colors* (make-colors (lambda (v) (> v 11))))
+```
+
+Now we can build some arrays of colors.  We *could* use any of the 216 available
+colors, but in practice we probably don't want to, because the darkest colors
+will be too dark to read on a dark terminal, and vice versa for light terminals.
+In a concession to practicality we'll generate two separate arrays of colors,
+one that excludes colors whose total value is too dark and one excluding those
+that are too light.
+
+(Notice that `*dark-colors*` is "the array of colors which are suitable for use
+on dark terminals" and not "the array of colors which are *themselves* dark".
+Naming things is hard.)
+
+Note that these arrays will be generated when the `batchcolor.lisp` file is
+`load`ed, which is *when we build the binary*.  They *won't* be recomputed every
+time you run the resulting binary.  In this case it doesn't really matter (the
+arrays are small) but it's worth remembering in case you ever have some data you
+want (or don't want) to compute at build time instead of run time.
+
+```lisp
+(defparameter *explicits* (make-hash-table :test #'equal))
+```
+
+Here we make a hash table to store the strings and colors for strings we want to
+explicitly color (e.g. `ERR` should be red, `INFO` cyan).  The keys will be the
+strings and values the RGB codes.
+
+```lisp
+(defun djb2 (string)
+  ;; http://www.cse.yorku.ca/~oz/hash.html
+  (reduce (lambda (hash c)
+            (mod (+ (* 33 hash) c) (expt 2 64)))
+          string
+          :initial-value 5381
+          :key #'char-code))
+
+(defun find-color (string)
+  (gethash string *explicits*
+           (let ((colors (if *dark* *dark-colors* *light-colors*)))
+             (aref colors
+                   (mod (+ (djb2 string) *start*)
+                        (length colors))))))
+```
+
+For strings that we want to explicitly color, we just look up the appropriate
+code in `*explicits*` and return it.
+
+Otherwise, we want to highlight unique matches in different colors.  There are
+a number of different ways we could do this, for example: we could randomly pick
+a color the first time we see a string and store it in a hash table for
+subsequent encounters.  But this would mean we'd grow that hash table over time,
+and one of the things I often use this utility for is `tail -f`ing long-running
+processes when developing locally, so the memory usage would grow and grow until
+the `batchcolor` process was restarted, which isn't ideal.
+
+Instead, we'll hash each string with a simple [DJB hash][djb] and use it to
+index into the appropriate array of colors.  This ensures that identical matches
+get identical colors, and avoids having to store every match we've ever seen.
+
+There will be some collisions, but there's not much we can do about that with
+only ~200 colors to work with.  We could have used 16-bit colors like
+I mentioned before, but then we'd have to worry about picking colors different
+enough for humans to easily tell apart, and for this simple utility I didn't
+want to bother.
+
+We'll talk about `*start*` later, ignore it for now (it's `0` by default).
+
+[djb]: http://www.cse.yorku.ca/~oz/hash.html
+
+```lisp
+(defun ansi-color-start (color)
+  (format nil "~C[38;5;~Dm" #\Escape color))
+
+(defun ansi-color-end ()
+  (format nil "~C[0m" #\Escape))
+
+(defun print-colorized (string)
+  (format *standard-output* "~A~A~A"
+          (ansi-color-start (find-color string))
+          string
+          (ansi-color-end)))
+```
+
+Next we have some functions to output the appropriate ANSI escapes to highlight
+our matches.  We could use a library for this but it's only two lines.  [It's
+not worth it](http://xn--rpa.cc/irl/term.html).
+
+And now we have the beating heart of the program:
+
+```lisp
+(defun colorize-line (scanner line &aux (start 0))
+  (ppcre:do-scans (ms me rs re scanner line)
+    ;; If we don't have any register groups, colorize the entire match.
+    ;; Otherwise, colorize each matched capturing group.
+    (let* ((regs? (plusp (length rs)))
+           (starts (if regs? (remove nil rs) (list ms)))
+           (ends   (if regs? (remove nil re) (list me))))
+      (map nil (lambda (word-start word-end)
+                 (unless (<= start word-start)
+                   (error 'overlapping-groups))
+                 (write-string line *standard-output* :start start :end word-start)
+                 (print-colorized (subseq line word-start word-end))
+                 (setf start word-end))
+           starts ends)))
+  (write-line line *standard-output* :start start))
+```
+
+`colorize-line` takes a CL-PPCRE scanner and a line, and outputs the line with
+any of the desired matches colorized appropriately.  There are a few things to
+note here.
+
+First: if the regular expression contains any capturing groups, we'll only
+colorize those parts of the match.  For example: if you run `batchcolor
+'^<(\\w+)> '` to colorize the nicks in an IRC log, only the nicknames themselves
+will be highlighted, not the surrounding angle brackets.  Otherwise, if there
+are no capturing groups in the regular expression, we'll highlight the entire
+match (as if there were one big capturing group around the whole thing).
+
+Second: overlapping capturing groups are explicitly disallowed and
+a `user-error` signaled if we notice any.  It's not clear what do to in this
+case — if we match `((f)oo|(b)oo)` against `foo`, what should the output be?
+Highlight `f` and `oo` in the same color?  In different colors?  Should the `oo`
+be a different color than the `oo` in `boo`?  There's too many options with no
+clear winner, so we'll just tell the user to be more clear.
+
+To do the actual work we iterate over each match and print the non-highlighted
+text before the match, then print the highlighted match.  Finally we print any
+remaining text after the last match.
+
+### Not-Quite-Top-Level Interface
+
+```lisp
+;;;; Run ----------------------------------------------------------------------
+(defun run% (scanner stream)
+  (loop :for line = (read-line stream nil)
+        :while line
+        :do (colorize-line scanner line)))
+
+(defun run (pattern paths)
+  (let ((scanner (handler-case (ppcre:create-scanner pattern)
+                   (ppcre:ppcre-syntax-error (c)
+                     (error 'malformed-regex :underlying-error c))))
+        (paths (or paths '("-"))))
+    (dolist (path paths)
+      (if (string= "-" path)
+        (run% scanner *standard-input*)
+        (with-open-file (stream path :direction :input)
+          (run% scanner stream))))))
+```
+
+Here we have the not-quite-top-level interface to the program.  `run` takes
+a pattern string and a list of paths and runs the colorization on each path.
+This is safe to call interactively from the REPL, e.g. `(run "<(\\w+)>"
+"foo.txt")`, so we can test without worrying about killing the Lisp process.
+
+### User Interface
+
+In the last chunk of the file we have the user interface.  There are a couple of
+things to note here.
+
+I'm using a command line argument parsing library I wrote myself: [Adopt][].
+I won't go over exactly what all the various Adopt functions do.  Most of them
+should be fairly easy to understand, but [check out the Adopt
+documentation][adopt-usage] for the full story if you're curious.
+
+[adopt-usage]: https://docs.stevelosh.com/adopt/usage/
+
+If you prefer another library (and there are quite a few around) feel free
+to use it — it should be pretty easy to adapt this setup to a different library.
+The only things you'd need to change would be the `toplevel` function and the
+`build-manual.sh` script (if you even care about building `man` pages at all).
+
+You might also notice that the user interface for the program is almost as much
+code as the entire rest of the program.  This may seem strange, but I think it
+makes a certain kind of sense.  When you're writing code to interface with an
+external system, a messier and more complicated external system will usually
+require more code than a cleaner and simpler external system.  A human brain is
+probably the messiest and most complicated external system you'll ever have to
+deal with, so it's worth taking the extra time and code to be especially careful
+when writing an interface to it.
+
+First we'll define a typical `-h`/`--help` option:
+
+```lisp
+(defparameter *option-help*
+  (adopt:make-option 'help
+    :help "Display help and exit."
+    :long "help"
+    :short #\h
+    :reduce (constantly t)))
+```
+
+Next we'll define a pair of options for enabling/disabling the Lisp debugger:
+
+```lisp
+(adopt:defparameters (*option-debug* *option-no-debug*)
+  (adopt:make-boolean-options 'debug
+    :long "debug"
+    :short #\d
+    :help "Enable the Lisp debugger."
+    :help-no "Disable the Lisp debugger (the default)."))
+```
+
+By default the debugger will be off, so any unexpected error will print
+a backtrace to standard error and exit with a nonzero exit code.  This is the
+default because if I add a `batchcolor` somewhere in a shell script, I probably
+don't want to suddenly hang the entire script if something breaks.  But we still
+want to be *able* to get into the debugger manually if something goes wrong.
+This is Common Lisp — we don't have to settle for a stack trace or core dump, we
+can have a real interactive debugger in the final binary.
+
+Note how Adopt's `make-boolean-options` function creates *two* options here:
+
+* `-d`/`--debug` will enable the debugger.
+* `-D`/`--no-debug` will disable the debugger.
+
+Even though *disabled* is the default, it's still important to have both
+switches for boolean options like this.  If someone wants the debugger to be
+*enabled* by default instead (along with some other configuration options), they
+might have a shell alias like this:
+
+    alias bcolor='batchcolor --debug --foo --bar'
+
+But sometimes they might want to temporarily *disable* the debugger for a single
+run.  Without a `--no-debug` option, they would have to run the vanilla
+`batchcolor` and retype all the *other* options.  But having the `--no-debug`
+option allows them to just say:
+
+    bcolor --no-debug
+
+This would expand to:
+
+    batchcolor --debug --foo --bar --no-debug
+
+The later option wins, and the user gets the behavior they expect.
+
+Next we'll define some color-related options.  First an option to randomize the
+colors each run, instead of always picking the same color for a particular
+string, and then a toggle for choosing colors that work for dark or light
+terminals:
+
+```lisp
+(adopt:defparameters (*option-randomize* *option-no-randomize*)
+  (adopt:make-boolean-options 'randomize
+    :help "Randomize the choice of color each run."
+    :help-no "Do not randomize the choice of color each run (the default)."
+    :long "randomize"
+    :short #\r))
+
+(adopt:defparameters (*option-dark* *option-light*)
+  (adopt:make-boolean-options 'dark
+    :name-no 'light
+    :long "dark"
+    :long-no "light"
+    :help "Optimize for dark terminals (the default)."
+    :help-no "Optimize for light terminals."
+    :initial-value t))
+```
+
+The last option we'll define is `-e`/`--explicit`, to allow the user to select
+an explicit color for a particular string:
+
+```lisp
+(defun parse-explicit (spec)
+  (ppcre:register-groups-bind
+      ((#'parse-integer r g b) string)
+      ("^([0-5]),([0-5]),([0-5]):(.+)$" spec)
+    (return-from parse-explicit (cons string (rgb-code r g b))))
+  (error 'malformed-explicit :spec spec))
+
+(defparameter *option-explicit*
+  (adopt:make-option 'explicit
+    :parameter "R,G,B:STRING"
+    :help "Highlight STRING in an explicit color.  May be given multiple times."
+    :manual (format nil "~
+      Highlight STRING in an explicit color instead of randomly choosing one.  ~
+      R, G, and B must be 0-5.  STRING is treated as literal string, not a regex.  ~
+      Note that this doesn't automatically add STRING to the overall regex, you ~
+      must do that yourself!  This is a known bug that may be fixed in the future.")
+    :long "explicit"
+    :short #\e
+    :key #'parse-explicit
+    :reduce #'adopt:collect))
+```
+
+Notice how we signal a `malformed-explicit` condition if the user gives us
+mangled text.  This is a subtype of `user-error`, so the program will print the
+error and exit even if the debugger is enabled.  We also include a slightly more
+verbose description in the `man` page than the terse one in the `--help` text.
+
+Next we write the main help and manual text, as well as some real-world
+examples:
+
+```lisp
+(adopt:define-string *help-text*
+  "batchcolor takes a regular expression and matches it against standard ~
+   input one line at a time.  Each unique match is highlighted in its own color.~@
+   ~@
+   If the regular expression contains any capturing groups, only those parts of ~
+   the matches will be highlighted.  Otherwise the entire match will be ~
+   highlighted.  Overlapping capturing groups are not supported.")
+
+(adopt:define-string *extra-manual-text*
+  "If no FILEs are given, standard input will be used.  A file of - stands for ~
+   standard input as well.~@
+   ~@
+   Overlapping capturing groups are not supported because it's not clear what ~
+   the result should be.  For example: what should ((f)oo|(b)oo) highlight when ~
+   matched against 'foo'?  Should it highlight 'foo' in one color?  The 'f' in ~
+   one color and 'oo' in another color?  Should that 'oo' be the same color as ~
+   the 'oo' in 'boo' even though the overall match was different?  There are too ~
+   many possible behaviors and no clear winner, so batchcolor disallows ~
+   overlapping capturing groups entirely.")
+
+(defparameter *examples*
+  '(("Colorize IRC nicknames in a chat log:"
+     . "cat channel.log | batchcolor '<(\\\\w+)>'")
+    ("Colorize UUIDs in a request log:"
+     . "tail -f /var/log/foo | batchcolor '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}'")
+    ("Colorize some keywords explicitly and IPv4 addresses randomly (note that the keywords have to be in the main regex too, not just in the -e options):"
+     . "batchcolor 'WARN|INFO|ERR|(?:[0-9]{1,3}\\\\.){3}[0-9]{1,3}' -e '5,0,0:ERR' -e '5,4,0:WARN' -e '2,2,5:INFO' foo.log")
+    ("Colorize earmuffed symbols in a Lisp file:"
+     . "batchcolor '(?:^|[^*])([*][-a-zA-Z0-9]+[*])(?:$|[^*])' tests/test.lisp")))
+```
+
+Finally we can wire everything together in the main Adopt interface:
+
+```lisp
+(defparameter *ui*
+  (adopt:make-interface
+    :name "batchcolor"
+    :usage "[OPTIONS] REGEX [FILE...]"
+    :summary "colorize regex matches in batches"
+    :help *help-text*
+    :manual (format nil "~A~2%~A" *help-text* *extra-manual-text*)
+    :examples *examples*
+    :contents (list
+                *option-help*
+                *option-debug*
+                *option-no-debug*
+                (adopt:make-group 'color-options
+                                  :title "Color Options"
+                                  :options (list *option-randomize*
+                                                 *option-no-randomize*
+                                                 *option-dark*
+                                                 *option-light*
+                                                 *option-explicit*)))))
+```
+
+All that's left to do is the top-level function that will be called when the
+binary is executed.
+
+### Top-Level Interface
+
+Before we write `toplevel` we've got a couple of helpers:
+
+```lisp
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (adopt:exit 130))))
+
+(defun configure (options)
+  (loop :for (string . rgb) :in (gethash 'explicit options)
+        :do (setf (gethash string *explicits*) rgb))
+  (setf *start* (if (gethash 'randomize options)
+                  (random 256 (make-random-state t))
+                  0)
+        *dark* (gethash 'dark options)))
+```
+
+Our `toplevel` function looks much like the one in the skeleton, but fleshed out
+a bit more:
+
+```lisp
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
+      (when (gethash 'debug options)
+        (sb-ext:enable-debugger))
+      (handler-case
+          (cond
+            ((gethash 'help options) (adopt:print-help-and-exit *ui*))
+            ((null arguments) (error 'missing-regex))
+            (t (destructuring-bind (pattern . files) arguments
+                 (configure options)
+                 (run pattern files))))
+        (user-error (e) (adopt:print-error-and-exit e))))))
+```
+
+This `toplevel` has a few extra bits beyond the skeletal example.
+
+First, we disable the debugger immediately, and then re-enable it later if the
+user asks us to.  We want to keep it disabled until *after* argument parsing
+because we can't know whether the user wants it or not until we parse the
+arguments.
+
+Instead of just blindly running `run`, we check for `--help` and print it if
+desired.  We also validate that the user passes the correct amount of arguments,
+signaling a subtype of `user-error` if they don't.  Assuming everything looks
+good we handle the configuration, call `run`, and that's it!
+
+Running `make` generates `bin/batchcolor` and `man/man1/batchcolor.1`, and we
+can view our log files in beautiful color.
+
+## More Information
+
+I hope this overview was helpful.  This has worked for me, but Common Lisp is
+a flexible language, so if you want to use this layout as a starting point and
+modify it for your own needs, go for it!
+
+If you want to see some more examples you can find them in [my dotfiles
+repository](https://hg.stevelosh.com/dotfiles/file/tip/lisp).  Some of the more
+fun ones include:
+
+* `weather` for displaying the weather over the next few hours so I can tell if
+  I need a jacket or umbrella before I go out for a walk.
+* `retry` to retry shell commands if they fail, with options for how many times
+  to retry, strategies for waiting/backing off on failure, etc.
+* `pick` to interactively filter the output of one command into another
+  (inspired by the `pick` program in "The UNIX Programming Environment" but with
+  more options).
+
+The approach I laid out in this post works well for small, single-file programs.
+If you're creating a larger program you'll probably want to move to a full ASDF
+system in its own directory/repository.  My friend Ian [wrote a post about
+that](http://atomized.org/blog/2020/07/06/common-lisp-in-practice/) which you
+might find interesting.
Binary file static/images/blog/2021/03/batchcolored.png has changed
Binary file static/images/blog/2021/03/batchcoloredfull.png has changed
Binary file static/images/blog/2021/03/grepcolored.png has changed
Binary file static/images/blog/2021/03/uncolored.png has changed