c635f15b37c1

Start draft
author Steve Losh <steve@stevelosh.com>
date Sun, 27 Dec 2020 22:08:26 -0500
parents deaaa26b7266
children a9d8b7a86226
branches/tags (none)
files content/blog/2018/08/types-and-classes-in-common-lisp.markdown content/blog/2021/01/small-common-lisp-cli-programs.markdown

Changes

--- a/content/blog/2018/08/types-and-classes-in-common-lisp.markdown	Sat Apr 18 13:10:31 2020 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,166 +0,0 @@
-+++
-title = "Types and Classes in Common Lisp"
-snip = "They're not the same thing!"
-date = 2018-08-30T16:00:00Z
-draft = true
-
-+++
-
-One thing that often confuses people new to Common Lisp is the difference
-(and interaction) between types and classes in the language.  Types and classes
-are two completely separate things in Common Lisp, but if you're coming from
-modern languages it's easy to get the two blurred and confused.  Hopefully this
-post will make the distinction clearer.
-
-<div id="toc"></div>
-
-## Objects
-
-Before we dive into defining types and classes we should define what an "object"
-is, because the term will come up immediately.  The following definition will be
-good enough for our purposes here:
-
-**An object is a hunk of bits somewhere in memory.**
-
-Objects are the things the garbage collector manages.  They're the things you
-pass to functions, and the things you return from them.  Objects have identity,
-and that identity can be compared with `eq`.
-
-I realize that this is a little handwavey, but I think it's good enough to work
-with for now.
-
-TODO values
-
-(There are a few corner cases (e.g. fixnums), but you can safely ignore them
-while trying to wrap your head around this post.)
-
-## Types
-
-Types in Common Lisp can be summed up in one single line:
-
-**A type is a set of objects.**
-
-We already saw what objects are.  "Set" in this definition is a set in the
-mathematical sense: an unordered collection of elements (possibly *infinitely
-many* elements) with no duplicates.
-
-That's it.  That's all there is to it.  This probably seems like a weird
-definition if you've never thought much about it before, but let's look at some
-examples to see what falls out of it.
-
-### Type Designators
-
-First we need a way to specify types.  Common Lisp has a concept called [type
-designators][TODO] for this purpose.  A type designator is something that
-represents the given type.
-
-Let's look at a common type: the set of all integers.  Obviously it would be
-impractical to talk about this set by listing out all of its members.
-A mathematician would denote this type as Z TODO.  A Common Lisp programmer
-would use the type designator `integer`.
-
-### Being Of a Type
-
-What are some things we might want to do with a type?  One thing might be to ask
-whether a particular object "is of that type", done with `typep` in Common Lisp
-(and `instanceof` in Java, TODO in Python, etc).  But what does this actually
-*mean*?
-
-When you're thinking of types as sets of objects, asking whether an object is of
-a particular type is essentially asking if the object is a member of that set!
-
-Let's look at a couple of examples:
-
-    (typep 42 'integer)
-
-Here we're asking "Is `42` a member of the set of all integers?"
-
-    (typep x 'symbol)
-
-Now we're asking "Is the object that `x` evaluates to a member of the set of all
-symbols?".
-
-    (typecase foo
-      (symbol ...a...)
-      (integer ...b...)
-      (number ...c...))
-
-And now we're saying "Evaluate `foo`.  If the result is a member of the set of
-all symbols, do `a`.  Otherwise if it's a member of the set of all integers, do
-`b`.  Otherwise if it's a member of the set of all numbers, do `c`.  Otherwise
-return `nil`.".
-
-### Subtypes and Supertypes
-
-So checking if an object is of a particular type is simply the set membership
-operation.  It turns out that other set operations also have useful definitions
-when you think this way:
-
-* If `foo` is a supertype of `bar`, that means `foo` is a superset of `bar`.
-* If `bar` is a subtype of `foo`, that means `bar` is a subset of `foo`.
-
-For example:
-
-* The set of all integers is a subset of the set of all real numbers, which
-  makes `integer` a subtype of `real`.
-* The set of all symbols is a superset of the set of all keyword symbols, so
-  `symbol` is a supertype of `keyword`.
-* The set of all floating point numbers is neither a subset nor a superset of
-  the set of all symbols, so neither is a subtype of the other.
-
-Common Lisp's numeric tower consists of nice sequences of types that get more
-and more specific as you take further subsets:
-
-`number` ⊇ `real` ⊇ `rational` ⊇ `integer` ⊇ `fixnum`
-
-### Explicit Designation
-
-There are other ways to designate sets too.  You can use the `member` type
-designator to just list out the members of a set by hand if you want:
-
-    (member 1 2 3) ; => designates the set {1, 2, 3}
-
-### Everything and Nothing
-
-The type `t` is the set of all objects.  The type `nil` is the empty set.
-
-This last one can sometimes cause confusion.  The symbol `nil` is *not* of type
-`nil` because the symbol `nil` is not a member of the empty set (because it's
-empty!).  If you want to talk about the set containing the symbol `nil`, that's
-called `null`:
-
-```lisp
-(typep nil 'nil)  ; => NIL, nil is NOT a member of the empty set
-(typep nil 'null) ; => T,   nil IS a member of { nil }
-```
-
-### Type Combinations
-
-Let's look at some more set operations.  Set complement is probably the
-simplest, and Common Lisp supports this with the `not` compound type specifier
-TODO:
-
-```lisp
-(typep 1/2     '(not integer)) ; => T
-(typep "hello" '(not integer)) ; => T
-(typep 42      '(not integer)) ; => NIL
-```
-
-Set union is covered with `or`:
-
-```lisp
-(typep 1     '(or integer string)) ; => T
-(typep "hi"  '(or integer string)) ; => T
-(typep :what '(or integer string)) ; => NIL
-```
-
-And of course set intersection is done with `and`:
-
-```lisp
-(typep 1 )
-```
-
-
-## Classes
-
-## Blurring the Line
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2021/01/small-common-lisp-cli-programs.markdown	Sun Dec 27 22:08:26 2020 -0500
@@ -0,0 +1,854 @@
+(:title "Writing Small CLI Programs in Common Lisp"
+ :snip "Somewhere between tiny shell scripts and full projects."
+ :date "2021-01-02T15:50:00Z"
+ :draft t)
+
+I've found Common Lisp to be a good language for this.  But it can be a little
+intimidating to get started, especially for beginners, because Common Lisp is
+a very flexible language and doesn't lock you into one way of working.  In this
+post I'll describe how I write small, stand-alone command line programs in
+Common Lisp.
+
+<div id="toc"></div>
+
+## Requirements
+
+When you're writing programs in Common Lisp, you've got a lot of options.
+Laying out my requirements for this use case helped me decide on an approach.
+
+First, each new program should be one single file.  A few ancillary files for
+all scripts together (e.g. a `Makefile`) are okay, but adding a new program
+should mean adding one single file.  For larger programs a full project
+directory and ASDF system are great, but for small programs having one file per
+program reduces the overhead quite a bit.
+
+The programs need to be able to be developed in the typical Common Lisp
+interactive style (in my case: with Swank and VLIME).  The interactive
+development is one of the best parts of working in Common Lisp, and I won't give
+it up.  In particular, this means that a shell-script style approach, with
+`#!/path/to/sbcl --script` at the top and code running directly at the top
+level of the file, does not work for two main reasons:
+
+* Loading that file will fail due to the shebang unless you have some ugly
+  reader macros in your startup file.
+* The program will need to do things like parsing command-line arguments and
+  exiting with an error code, and running `exit` would kill the Swank process.
+
+The programs need to be able to use libraries, so Quicklisp will need to be
+involved.  Common Lisp has a lot of nice things built-in, but there are some
+libraries that are just too useful to pass up.
+
+The programs will need to have proper user interfaces.  Command line arguments
+must be robustly parsed (e.g. collapsing `-a -b -c foo -d` into `-abc foo -d`
+should work as expected), malformed or unknown options must be caught instead of
+dropping them on the floor, error messages should be meaningful, and the
+`--help` should be thoroughly and thoughtfully written.
+
+Relying on some basic conventions (e.g. a command `foo` is always in `foo.lisp`
+and defines a package `foo` with a function called `toplevel`) is okay if it
+makes my life easier.  These programs are just for me, so I don't have to worry
+about people wanting to create executables with spaces in the name or something.
+
+## Solution Skeleton
+
+After trying a number of different approaches, I've settled on a solution that
+I'm pretty happy with.  First I'll describe the general approach, then we'll
+look at one actual example program in its entirety.
+
+### Directory Structure
+
+I keep all my small single-file Common Lisp programs in a `lisp` directory
+inside my dotfiles repository.  Its contents look like this:
+
+```
+…/lisp/
+    bin/
+        foo
+        bar
+    man/
+        man1/
+            foo.1
+            bar.1
+    build-binary.sh
+    build-manual.sh
+    Makefile
+    foo.lisp
+    bar.lisp
+```
+
+The `bin` directory is where the executable files end up.  I've added it to my
+`$PATH` so I don't have to symlink or copy the binaries anywhere.
+
+`man` contains the generated `man` pages.  Because it's adjacent to `bin` (which
+is on my path) the `man` program automatically finds the `man` pages as
+expected.
+
+`build-binary.sh`, `build-manual.sh`, and `Makefile` are some glue to make
+building programs easier.
+
+The `.lisp` files are, of course, the programs.  Each new program I want to add
+only requires adding the `<programname>.lisp` file in this directory and running
+`make`.
+
+### Lisp Files
+
+All my small Common Lisp programs follow a few conventions, which makes building
+them easier.  Let's look at the skeleton of a `foo.lisp` file as an example.
+I'll show the entire file here, and then step through it piece by piece.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(… :with-user-abort) :silent t))
+
+(defpackage :foo
+  (:use :cl)
+  (:export :toplevel …))
+
+(in-package :foo)
+
+;;;; Configuration -----------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *some-option* nil)
+
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+  (:report "A foo is required, but none was supplied."))
+
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+  …)
+
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+  (map nil #'foo arguments))
+
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (sb-ext:exit :code 130))))
+
+(defparameter *ui*
+  (adopt:make-interface
+    :name "foo"
+    …))
+
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options)
+        (adopt:parse-options-or-exit *ui*)
+      … ; Handle options.
+      (handler-case (run arguments)
+        (user-error (e)
+          (format *error-output* "error: ~A~%" e)
+          (adopt:exit 1))))))
+```
+
+Let's go through each chunk of this.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(:with-user-abort …) :silent t))
+```
+
+First we `quickload` any necessary libraries.  We always want to do this, even
+when compiling the file, because we need the appropriate packages to be loaded
+when we try to use their symbols later in the file.
+
+[`with-user-abort`](https://github.com/compufox/with-user-abort) is a library
+for portably handling `control-c`, which all of these small programs use.
+
+```lisp
+(defpackage :foo
+  (:use :cl)
+  (:export :toplevel :*ui*))
+
+(in-package :foo)
+```
+
+Next we define a package `foo` and switch to it.  The package is always named
+the same as the resulting binary and the basename of the file.  The package
+always exports the symbols `toplevel` and `*ui*`.
+
+```lisp
+;;;; Configuration -----------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *some-option* nil)
+```
+
+Next we define any configuration variables.  These will be set later after
+parsing the command line arguments (when we run the command line program) or
+at the REPL (when developing interactively).
+
+```lisp
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+  (:report "A foo is required, but none was supplied."))
+```
+
+We define a `user-error` condition, and any errors the user might make will
+inherit from it.  This lets us treat user errors (e.g. passing a mangled
+regular expression like `(foo+` as an argument) differently from programming
+errors (i.e. bugs):
+
+* Bugs should print a backtrace or enter the debugger.
+* Expected user errors should print a helpful error message with no backtrace or debugger.
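+
+As a rough sketch of how that split plays out (the helper name
+`report-user-errors` is made up for this illustration and is not part of the
+skeleton), a `handler-case` clause for `user-error` catches only conditions that
+inherit from it and lets everything else propagate:
+
+```lisp
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+  (:report "A foo is required, but none was supplied."))
+
+;; Hypothetical helper: call THUNK and turn any USER-ERROR into a message
+;; string.  Other errors (bugs) are not handled here and keep propagating.
+(defun report-user-errors (thunk)
+  (handler-case (funcall thunk)
+    (user-error (e) (format nil "error: ~A" e))))
+
+;; A user error is caught and reported:
+(report-user-errors (lambda () (error 'missing-foo)))
+;; => "error: A foo is required, but none was supplied."
+
+;; A bug is not a USER-ERROR, so it escapes the handler.  IGNORE-ERRORS is used
+;; here only to capture the escaping condition instead of entering the debugger.
+(nth-value 1 (ignore-errors
+               (report-user-errors (lambda () (error "oops, a bug")))))
+;; => a SIMPLE-ERROR condition object
+```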
+
+```lisp
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+  …)
+```
+
+Next we have the actual functionality of the program.
+
+```lisp
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+  (map nil #'foo arguments))
+```
+
+We now define a function `run` that takes some arguments (as strings) and
+performs the main work of the program.
+
+Importantly, `run` does **not** handle command line argument parsing, and it does
+**not** exit the program with an error code, which means we can safely call it
+to "run the program" when we're developing interactively without worrying about
+it killing our Lisp process.
+
+Finally, we need to define the command line interface.
+
+```lisp
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (sb-ext:exit :code 130))))
+```
+
+We'll make a little macro around `with-user-abort` to make it less wordy.  We'll
+[exit with a status of 130](https://tldp.org/LDP/abs/html/exitcodes.html) if the
+user presses `ctrl-c`.
+
+```lisp
+(defparameter *ui*
+  (adopt:make-interface
+    :name "foo"
+    …))
+```
+
+Here we define the `*ui*` variable whose symbol we exported above.  [Adopt][] is
+a command line argument parsing library I wrote.  If you want to use a different
+library, feel free.
+
+[Adopt]: https://docs.stevelosh.com/adopt
+
+```lisp
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options)
+        (adopt:parse-options-or-exit *ui*)
+      … ; Handle options.
+      (handler-case (run arguments)
+        (user-error (e)
+          (format *error-output* "error: ~A~%" e)
+          (adopt:exit 1))))))
+```
+
+And finally we define the `toplevel` function.  This will only ever be called
+when the program is run as a standalone program, never interactively.  It
+handles everything outside the main guts of the program (which live in the
+`run` function):
+
+* Disabling or enabling the debugger.
+* Exiting the process with the appropriate status code on errors.
+* Parsing command line arguments.
+
+That's it for the structure of the `.lisp` files.
+
+### Building Binaries
+
+`build-binary.sh` is a small script to build the executable binaries from the
+`.lisp` files.  `./build-binary.sh foo.lisp` will build `foo`:
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$1" .lisp)
+shift
+
+sbcl --load "$LISP" \
+     --eval "(sb-ext:save-lisp-and-die \"$NAME\"
+               :executable t
+               :save-runtime-options t
+               :toplevel '$NAME:toplevel)"
+```
+
+Here we see where the naming conventions have become important — we know that
+the package is named the same as the binary and that it will have the symbol
+`toplevel` exported, which names the entry point for the binary.
+
+### Building Man Pages
+
+`build-manual.sh` is similar and builds the `man` pages using [Adopt][]'s
+built-in `man` page generation.  If you don't care about building `man` pages
+for your personal programs (I admit, it's a little bit silly) you can ignore
+this.
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$LISP" .lisp)
+OUT="$NAME.1"
+shift
+
+sbcl --load "$LISP" \
+     --eval "(with-open-file (f \"$OUT\" :direction :output :if-exists :supersede)
+               (adopt:print-manual $NAME:*ui* :stream f))" \
+     --quit
+```
+
+This is why we always name the Adopt interface variable `*ui*` and export it
+from the package.
+
+### Makefile
+
+Finally we have a simple `Makefile` so we can run `make` to regenerate any
+out of date binaries and `man` pages:
+
+```make
+files := $(wildcard *.lisp)
+names := $(files:.lisp=)
+
+.PHONY: all clean $(names)
+
+all: $(names)
+
+$(names): %: bin/% man/man1/%.1
+
+bin/%: %.lisp build-binary.sh Makefile
+	mkdir -p bin
+	./build-binary.sh $<
+	mv $(@F) bin/
+
+man/man1/%.1: %.lisp build-manual.sh Makefile
+	mkdir -p man/man1
+	./build-manual.sh $<
+	mv $(@F) man/man1/
+
+clean:
+	rm -rf bin man
+```
+
+We use a `wildcard` to automatically find the `.lisp` files so we don't have to
+do anything other than adding a new file when we want to make a new program.
+
+The most notable line here is `$(names): %: bin/% man/man1/%.1` which uses
+a [static pattern rule](https://www.gnu.org/software/make/manual/html_node/Static-Pattern.html#Static-Pattern)
+to automatically define the phony rules for building each program.  If
+`$(names)` is `foo bar` this line effectively defines two phony rules:
+
+```
+foo: bin/foo man/man1/foo.1
+bar: bin/bar man/man1/bar.1
+```
+
+This lets us run `make foo` to make the binary and `man` page for `foo.lisp`.
+
+## Case Study: A Batch Coloring Utility
+
+Now that we've seen the skeleton, let's look at one of my actual programs that
+I use all the time.  It's called `batchcolor` and it's used to highlight regular
+expression matches in text (usually log files in my case) with a twist: each
+unique match is highlighted in a separate color, which makes it easier to
+visually parse the result.
+
+For example, suppose we have some log files with lines of the form `<timestamp>
+[<request ID>] <level> <message>` where request ID is a UUID, and messages might
+contain other UUIDs for various things.  Such a log file might look something
+like this:
+
+```
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8b7543/
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543 is not cached, retrieving...
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] WARN User 28b2d548-eff1-471c-b807-cc2bcee76b7d does not have access to thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543, denying request.
+2021-01-02 14:01:46 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Returning HTTP 404.
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8d7543/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8d7543 is not cached, retrieving...
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] INFO Incoming request POST /users/sign-up/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Returning HTTP 200.
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Error running SQL query: connection refused.
+2021-01-02 14:01:47 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Returning HTTP 500.
+```
+
+If I try to just read this directly, it's easy for my eyes to glaze over unless
+I laboriously read line-by-line.  I can use `grep` to highlight the UUIDs, but
+that honestly doesn't help too much:
+
+    grep -P '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|$' foo.log
+
+`batchcolor` also highlights matches, but highlights each unique match in its
+own color:
+
+This is *much* easier for me to visually parse.  The interleaving of separate
+request logs is now obvious from the colors of the IDs, and it's easy to match
+up various user IDs and thing IDs at a glance.  Did you even notice that the two
+thing IDs were different before?
+
+`batchcolor` has a few other simple quality of life features, like picking
+explicit colors for specific strings (e.g. red for `ERR`):
+
+I wrap up this `batchcolor` invocation in an alias and use it to `tail` log
+files when developing locally almost every day, and it makes reading the log
+output *much* easier.
+
+Let's step through its code piece by piece.
+
+### Libraries
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (ql:quickload '(:adopt :cl-ppcre :with-user-abort) :silent t))
+```
+
+First we `quickload` libraries.  We'll use [Adopt][] for command line argument
+processing, [cl-ppcre][] for regular expressions, and the previously-mentioned
+[with-user-abort][] to handle `control-c`.
+
+### Package
+
+```lisp
+(defpackage :batchcolor
+  (:use :cl)
+  (:export :toplevel :*ui*))
+
+(in-package :batchcolor)
+```
+
+We define and switch to the appropriately-named package.  Nothing special here.
+
+### Configuration
+
+```lisp
+;;;; Configuration ------------------------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *start* 0)
+(defparameter *dark* t)
+```
+
+Next we `defparameter` some variables to hold useful values (like the version)
+and settings.  `*start*` will be used later when randomizing colors; don't worry
+about it for now.
+
+### Errors
+
+```lisp
+;;;; Errors -------------------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-regex (user-error) ()
+  (:report "A regular expression is required."))
+
+(define-condition malformed-regex (user-error)
+  ((underlying-error :initarg :underlying-error))
+  (:report (lambda (c s)
+             (format s "Invalid regex: ~A" (slot-value c 'underlying-error)))))
+
+(define-condition overlapping-groups (user-error) ()
+  (:report "Invalid regex: seems to contain overlapping capturing groups."))
+
+(define-condition malformed-explicit (user-error)
+  ((spec :initarg :spec))
+  (:report (lambda (c s)
+             (format s "Invalid explicit spec ~S, must be of the form \"R,G,B:string\" with colors being 0-5."
+                     (slot-value c 'spec)))))
+```
+
+Here we define the user errors.  Some of these are self-explanatory, while
+others will make more sense later once we see them in action.  The specific
+details aren't as important as the overall idea: for user errors we know might
+happen, display a helpful error message instead of just spewing a backtrace at
+the user.
+
+### Colorization
+
+Next we have the actual meat of the program.  Obviously this is going to be
+completely different for every program, so feel free to skip this if you don't
+care about this specific problem.
+
+```lisp
+;;;; Functionality ------------------------------------------------------------
+(defun rgb-code (r g b)
+  ;; The 256 color mode color values are essentially r/g/b in base 6, but
+  ;; shifted 16 higher to account for the initial 8+8 colors.
+  (+ (* r 36)
+     (* g 6)
+     (* b 1)
+     16))
+```
+
+We're going to highlight different matches with different colors.  We'll need
+a reasonable amount of colors to make this useful, so using the basic 8/16 ANSI
+colors isn't enough.  Full 24-bit truecolor is overkill, but the 8-bit ANSI
+colors will work nicely.  If we ignore the base colors, we essentially have
+6 x 6 x 6 = 216 colors to work with.  `rgb-code` will take the red, green, and
+blue values from `0` to `5` and return the color code.  See [Wikipedia][8bit]
+for more information.
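+
+A few spot checks make the mapping concrete.  The snippet repeats `rgb-code`
+from above so it can be evaluated on its own:
+
+```lisp
+(defun rgb-code (r g b)
+  (+ (* r 36) (* g 6) (* b 1) 16))
+
+(rgb-code 0 0 0) ; => 16  (first cube entry, black)
+(rgb-code 5 0 0) ; => 196 (brightest pure red)
+(rgb-code 0 0 5) ; => 21  (brightest pure blue)
+(rgb-code 5 5 5) ; => 231 (last cube entry, white)
+```
+
+Codes 232 through 255 are a separate grayscale ramp, which this scheme never
+uses.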
+
+[8bit]: https://en.wikipedia.org/wiki/ANSI_escape_code#8-bit
+
+```lisp
+(defun make-colors (excludep)
+  (let ((result (make-array 256 :fill-pointer 0)))
+    (dotimes (r 6)
+      (dotimes (g 6)
+        (dotimes (b 6)
+          (unless (funcall excludep (+ r g b))
+            (vector-push-extend (rgb-code r g b) result)))))
+    result))
+
+(defparameter *dark-colors*  (make-colors (lambda (v) (< v 3))))
+(defparameter *light-colors* (make-colors (lambda (v) (> v 11))))
+```
+
+Now we can build some arrays of colors.  We *could* use any of the 216 available
+colors, but in practice we probably don't want to, because the darkest colors
+will be too dark to read on a dark terminal, and vice versa for light terminals.
+In a concession to practicality we'll generate two separate arrays of colors,
+one that excludes colors whose total value is too dark and one excluding those
+that are too light.
+
+(You might notice that `*dark-colors*` is "the array of colors for dark
+terminals" and not "the array of colors which are not light".  Naming things is
+hard.)
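+
+To get a feel for how much each filter removes, the survivors can be counted
+directly.  This standalone sketch (`count-colors` is a name invented here, not
+part of `batchcolor`) applies the same two predicates without building the
+arrays:
+
+```lisp
+;; Count the (r, g, b) triples in the 6x6x6 cube that EXCLUDEP does not reject.
+(defun count-colors (excludep)
+  (loop :for r :below 6
+        :sum (loop :for g :below 6
+                   :sum (loop :for b :below 6
+                              :count (not (funcall excludep (+ r g b)))))))
+
+(count-colors (lambda (v) (< v 3)))  ; => 206 colors for dark terminals
+(count-colors (lambda (v) (> v 11))) ; => 196 colors for light terminals
+```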
+
+Note that these arrays will be generated when the `batchcolor.lisp` file is
+`load`ed, which is *when we build the binary*.  They *won't* be recomputed every
+time you run the resulting binary.  In this case it doesn't really matter (the
+arrays are small) but it's worth remembering in case you ever have some data you
+want (or don't want) to compute at build time instead of run time.
+
+```lisp
+(defparameter *explicits* (make-hash-table :test #'equal))
+```
+
+Here we make a hash table to store the strings and colors for strings we want to
+explicitly color (e.g. `ERR` should be red, `INFO` cyan).  The keys will be the
+strings and values the RGB codes.
+
+```lisp
+(defun djb2 (string)
+  ;; http://www.cse.yorku.ca/~oz/hash.html
+  (reduce (lambda (hash c)
+            (mod (+ (* 33 hash) c) (expt 2 64)))
+          string
+          :initial-value 5381
+          :key #'char-code))
+
+(defun find-color (string)
+  (gethash string *explicits*
+           (let ((colors (if *dark* *dark-colors* *light-colors*)))
+             (aref colors
+                   (mod (+ (djb2 string) *start*)
+                        (length colors))))))
+```
+
+For strings that we want to explicitly color, we just look up the appropriate
+code in `*explicits*` and return it.
+
+Otherwise, we want to highlight unique matches in different colors.  There are
+a number of different ways we could do this.  For example, we could randomly
+pick a color the first time we see a string and store it in a hash table for
+subsequent encounters.  But that table would grow with every new match, and one
+of the things I often use this utility for is `tail -f`ing long-running
+processes when developing locally, so memory usage would keep growing until the
+`batchcolor` process was restarted, which isn't ideal.
+
+Instead, we'll hash each string with a simple [DJB hash][djb] and use it to
+index into the appropriate array of colors.  This ensures that identical matches
+get identical colors, and avoids having to store every match we've ever seen.
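+
+A couple of values worked by hand show what the hash does.  The snippet repeats
+`djb2` so it can be evaluated on its own, and the palette size of 10 in the last
+form is an arbitrary number picked for illustration:
+
+```lisp
+(defun djb2 (string)
+  (reduce (lambda (hash c)
+            (mod (+ (* 33 hash) c) (expt 2 64)))
+          string
+          :initial-value 5381
+          :key #'char-code))
+
+(djb2 "")  ; => 5381   (just the initial value)
+(djb2 "a") ; => 177670 (5381 * 33 + 97, on implementations with ASCII codes)
+
+;; The same string always hashes to the same number, so it always lands on
+;; the same index into a palette of a given size.
+(mod (djb2 "a") 10) ; => 0
+```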
+
+We'll talk about `*start*` later; ignore it for now (it's `0` by default).
+
+[djb]: http://www.cse.yorku.ca/~oz/hash.html
+
+```lisp
+(defun ansi-color-start (color)
+  (format nil "~C[38;5;~Dm" #\Escape color))
+
+(defun ansi-color-end ()
+  (format nil "~C[0m" #\Escape))
+
+(defun print-colorized (string)
+  (format *standard-output* "~A~A~A"
+          (ansi-color-start (find-color string))
+          string
+          (ansi-color-end)))
+```
+
+Next we have some functions to output the appropriate ANSI escapes to highlight
+our matches.  We could use a library for this but it's only 2 lines.  [It's not
+worth it](http://xn--rpa.cc/irl/term.html).
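+
+For reference, this is the kind of string those helpers produce.  The sketch
+below calls `format` directly, the same way `ansi-color-start` does, and uses
+color code 196 purely as an example:
+
+```lisp
+;; The sequence is ESC followed by "[38;5;<code>m" (set the foreground color).
+(let ((start (format nil "~C[38;5;~Dm" #\Escape 196)))
+  (list (length start)                  ; 11 characters in total
+        (char= (char start 0) #\Escape) ; the first one is ESC
+        (subseq start 1)))              ; the rest is printable
+;; => (11 T "[38;5;196m")
+```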
+
+And now we have the beating heart of the program:
+
+```lisp
+(defun colorize-line (scanner line &aux (start 0))
+  (ppcre:do-scans (ms me rs re scanner line)
+    ;; If we don't have any register groups, colorize the entire match.
+    ;; Otherwise, colorize each matched capturing group.
+    (let* ((regs? (plusp (length rs)))
+           (starts (if regs? (remove nil rs) (list ms)))
+           (ends   (if regs? (remove nil re) (list me))))
+      (map nil (lambda (word-start word-end)
+                 (unless (<= start word-start)
+                   (error 'overlapping-groups))
+                 (write-string line *standard-output* :start start :end word-start)
+                 (print-colorized (subseq line word-start word-end))
+                 (setf start word-end))
+           starts ends)))
+  (write-line line *standard-output* :start start))
+```
+
+`colorize-line` takes a CL-PPCRE scanner and a line, and outputs the line with
+any of the desired matches colorized appropriately.  There are a few things to
+note here.
+
+First: if the regular expression contains any capturing groups, we will only
+colorize those parts of the match.  For example, if you run `batchcolor
+'^<(\w+)> '` to colorize the nicks in an IRC log, only the nicknames themselves
+will be highlighted, not the surrounding angle brackets.  Otherwise, if there
+are no capturing groups in the regular expression, we'll highlight the entire
+match (as if there were one big capturing group around the whole thing).
+
+Second: overlapping capturing groups are explicitly disallowed, and
+a `user-error` is signaled if we notice any.  It's not clear what to do in this
+case: if we match `((f)oo|(b)oo)` against `foo`, what should the output be?
+Highlight `f` and `oo` in the same color?  In different colors?  Should the `oo`
+be a different color than the `oo` in `boo`?  There are too many options with no
+clear winner, so we'll just tell the user to be more clear.
+
+To do the actual work, we iterate over each match and print the non-highlighted
+text before the match, then print the highlighted match.  Finally we print any
+remaining text after the last match.
+
+### Not-Quite-Top-Level Interface
+
+```lisp
+;;;; Run ----------------------------------------------------------------------
+(defun run% (scanner stream)
+  (loop :for line = (read-line stream nil)
+        :while line
+        :do (colorize-line scanner line)))
+
+(defun run (pattern paths)
+  (let ((scanner (handler-case (ppcre:create-scanner pattern)
+                   (ppcre:ppcre-syntax-error (c)
+                     (error 'malformed-regex :underlying-error c))))
+        (paths (or paths '("-"))))
+    (dolist (path paths)
+      (if (string= "-" path)
+        (run% scanner *standard-input*)
+        (with-open-file (stream path :direction :input)
+          (run% scanner stream))))))
+```
+
+Here we have the not-quite-top-level interface to the program.  `run` takes
+a pattern string and a list of paths and runs the colorization on each path.
+This is safe to call interactively from the REPL, e.g. `(run "<(\\w+)>"
+'("foo.txt"))`, so we can test without worrying about killing the Lisp process.
+
+### User Interface
+
+In the last chunk of the file we have the user interface.  There are a couple of
+things to note here.
+
+I'm using a command line argument parsing library I wrote myself: [Adopt][].
+But if you prefer another library (and there are quite a few around) feel free
+to use it — it should be pretty easy to adapt this setup to a different library.
+The only things you'd need to change would be the `toplevel` function and the
+`build-manual.sh` script (if you even care about building `man` pages at all).
+
+You might also notice that the user interface for the program is almost as much
+code as the entire rest of the program.  This may seem disconcerting at first,
+but I think it makes a certain kind of sense.  When you're writing code to
+interface with an external system, a messier and more complicated external
+system will usually require more code than a cleaner and simpler external
+system.  A human brain is probably the messiest and most complicated external
+system you'll ever have to deal with, so it's worth taking the extra time and
+code to be especially careful when writing an interface to it.
+
+```lisp
+(defparameter *option-help*
+  (adopt:make-option 'help
+    :help "Display help and exit."
+    :long "help"
+    :short #\h
+    :reduce (constantly t)))
+
+(defparameter *option-version*
+  (adopt:make-option 'version
+    :help "Display version information and exit."
+    :long "version"
+    :reduce (constantly t)))
+```
+
+```lisp
+(adopt:defparameters (*option-debug* *option-no-debug*)
+  (adopt:make-boolean-options 'debug
+    :long "debug"
+    :short #\d
+    :help "Enable the Lisp debugger."
+    :help-no "Disable the Lisp debugger (the default)."))
+```
+
+```lisp
+(adopt:defparameters (*option-randomize* *option-no-randomize*)
+  (adopt:make-boolean-options 'randomize
+    :help "Randomize the choice of color each run."
+    :help-no "Do not randomize the choice of color each run (the default)."
+    :long "randomize"
+    :short #\r))
+
+(adopt:defparameters (*option-dark* *option-light*)
+  (adopt:make-boolean-options 'dark
+    :name-no 'light
+    :long "dark"
+    :long-no "light"
+    :help "Optimize for dark terminals (the default)."
+    :help-no "Optimize for light terminals."
+    :initial-value t))
+```
+
+```lisp
+;;;; User Interface -----------------------------------------------------------
+(defun parse-explicit (spec)
+  (ppcre:register-groups-bind
+      ((#'parse-integer r g b) string)
+      ("^([0-5]),([0-5]),([0-5]):(.+)$" spec)
+    (return-from parse-explicit (cons string (rgb-code r g b))))
+  (error 'malformed-explicit :spec spec))
+
+(defparameter *option-explicit*
+  (adopt:make-option 'explicit
+    :parameter "R,G,B:STRING"
+    :help "Highlight STRING in an explicit color.  May be given multiple times."
+    :manual (format nil "~
+      Highlight STRING in an explicit color instead of randomly choosing one.  ~
+      R, G, and B must be 0-5.  STRING is treated as a literal string, not a regex.  ~
+      Note that this doesn't automatically add STRING to the overall regex; you ~
+      must do that yourself!  This is a known bug that may be fixed in the future.")
+    :long "explicit"
+    :short #\e
+    :key #'parse-explicit
+    :reduce #'adopt:collect))
+```
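+
+The `R,G,B:STRING` format is simple enough to check by hand, which makes it
+easy to see what the option produces.  Here's a rough, dependency-free sketch.
+`parse-spec` and `cube-code` are hypothetical names, and `cube-code` assumes the
+standard xterm 256-color cube layout (`16 + 36r + 6g + b`); the real `rgb-code`
+used above may be defined differently:
+
+```lisp
+(defun cube-code (r g b)
+  "Assumed mapping from 0-5 components to an xterm 256-color code."
+  (+ 16 (* 36 r) (* 6 g) b))
+
+(defun parse-spec (spec)
+  "Parse \"R,G,B:STRING\" into (STRING . CODE), or return NIL if malformed."
+  (when (and (> (length spec) 6)                       ; STRING must be non-empty
+             (char= #\, (char spec 1) (char spec 3))
+             (char= #\: (char spec 5)))
+    ;; Radix 6 means only the characters 0 through 5 count as digits.
+    (let ((r (digit-char-p (char spec 0) 6))
+          (g (digit-char-p (char spec 2) 6))
+          (b (digit-char-p (char spec 4) 6)))
+      (when (and r g b)
+        (cons (subseq spec 6) (cube-code r g b))))))
+
+(parse-spec "5,0,0:ERR")
+```
+
+Unlike `parse-explicit`, this sketch returns `nil` for bad input instead of
+signaling `malformed-explicit`, just to keep it short.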
+
+```lisp
+(adopt:define-string *help-text*
+  "batchcolor takes a regular expression and matches it against standard ~
+   input one line at a time.  Each unique match is highlighted in its own color.~@
+   ~@
+   If the regular expression contains any capturing groups, only those parts of ~
+   the matches will be highlighted.  Otherwise the entire match will be ~
+   highlighted.  Overlapping capturing groups are not supported.")
+
+(adopt:define-string *extra-manual-text*
+  "If no FILEs are given, standard input will be used.  A file of - stands for ~
+   standard input as well.~@
+   ~@
+   Overlapping capturing groups are not supported because it's not clear what ~
+   the result should be.  For example: what should ((f)oo|(b)oo) highlight when ~
+   matched against 'foo'?  Should it highlight 'foo' in one color?  The 'f' in ~
+   one color and 'oo' in another color?  Should that 'oo' be the same color as ~
+   the 'oo' in 'boo' even though the overall match was different?  There are too ~
+   many possible behaviors and no clear winner, so batchcolor disallows ~
+   overlapping capturing groups entirely.")
+
+(defparameter *examples*
+  '(("Colorize IRC nicknames in a chat log:"
+     . "cat channel.log | batchcolor '<(\\\\w+)>'")
+    ("Colorize UUIDs in a request log:"
+     . "tail -f /var/log/foo | batchcolor '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}'")
+    ("Colorize some keywords explicitly and IPv4 addresses randomly (note that the keywords have to be in the main regex too, not just in the -e options):"
+     . "batchcolor 'WARN|INFO|ERR|(?:[0-9]{1,3}\\\\.){3}[0-9]{1,3}' -e '5,0,0:ERR' -e '5,4,0:WARN' -e '2,2,5:INFO' foo.log")
+    ("Colorize earmuffed symbols in a Lisp file:"
+     . "batchcolor '(?:^|[^*])([*][-a-zA-Z0-9]+[*])(?:$|[^*])' tests/test.lisp")))
+```
+
+```lisp
+(defparameter *ui*
+  (adopt:make-interface
+    :name "batchcolor"
+    :usage "[OPTIONS] REGEX [FILE...]"
+    :summary "colorize regex matches in batches"
+    :help *help-text*
+    :manual (format nil "~A~2%~A" *help-text* *extra-manual-text*)
+    :examples *examples*
+    :contents (list
+                *option-help*
+                *option-version*
+                *option-debug*
+                *option-no-debug*
+                (adopt:make-group 'color-options
+                                  :title "Color Options"
+                                  :options (list *option-randomize*
+                                                 *option-no-randomize*
+                                                 *option-dark*
+                                                 *option-light*
+                                                 *option-explicit*)))))
+```
+
+### Top-Level Interface
+
+```lisp
+(defmacro exit-on-ctrl-c (&body body)
+  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+     (with-user-abort:user-abort () (adopt:exit 130))))
+
+(defun configure (options)
+  (loop :for (string . rgb) :in (gethash 'explicit options)
+        :do (setf (gethash string *explicits*) rgb))
+  (setf *start* (if (gethash 'randomize options)
+                  (random 256 (make-random-state t))
+                  0)
+        *dark* (gethash 'dark options)))
+
+(defun toplevel ()
+  (sb-ext:disable-debugger)
+  (exit-on-ctrl-c
+    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
+      (when (gethash 'debug options)
+        (sb-ext:enable-debugger))
+      (handler-case
+          (cond
+            ((gethash 'help options) (adopt:print-help-and-exit *ui*))
+            ((gethash 'version options) (write-line *version*) (adopt:exit))
+            ((null arguments) (error 'missing-regex))
+            (t (destructuring-bind (pattern . files) arguments
+                 (configure options)
+                 (run pattern files))))
+        (user-error (e) (adopt:print-error-and-exit e))))))
+```
+
+
+
+## More Information
+
+* ieure link
+* dotfiles repo link
+
+
+