--- a/content/blog/2018/08/types-and-classes-in-common-lisp.markdown Sat Apr 18 13:10:31 2020 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,166 +0,0 @@
-+++
-title = "Types and Classes in Common Lisp"
-snip = "They're not the same thing!"
-date = 2018-08-30T16:00:00Z
-draft = true
-
-+++
-
-One thing that often confuses people new to Common Lisp is the differences
-(and interactions) between types and classes in the language. Type and classes
-are two completely separate things in Common Lisp, but if you're coming from
-modern languages it's easy to get the two blurred and confused. Hopefully this
-post will make the distinction more clear.
-
-<div id="toc"></div>
-
-## Objects
-
-Before we dive into defining types and classes we should define what an "object"
-is, because the term will come up immediately. The following definition will be
-good enough for our purposes here:
-
-**An object is a hunk of bits somewhere in memory.**
-
-Objects are the things the garbage collector manages. They're the things you
-pass to functions, and the things you return from them. Objects have identity,
-and that identity can be compared with `eq`.
-
-I realize that this is a little handwavey, but I think it's good enough to work
-with for now.
-
-TODO values
-
-(There are a few corner cases (e.g. fixnums), but you can safely ignore them
-while trying to wrap your head around this post.)
-
-## Types
-
-Types in Common Lisp can be summed up in one single line:
-
-**A type is a set of objects.**
-
-We already saw what objects are. "Set" in this definition is a set in the
-mathematical sense: an unordered collection of elements (possibly *infinitely
-many* elements) with no duplicates.
-
-That's it. That's all there is to it. This probably seems like a weird
-definition if you've never thought much about it before, but let's look at some
-examples to see what falls out of it.
-
-### Type Designators
-
-First we need a way to specify types. Common Lisp has a concept called [type
-designators][TODO] for this purpose. A type designator is something that
-represents the given type.
-
-Let's look at a common type: the set of all integers. Obviously it would be
-impractical to talk about this set by listing out all of its members.
-A mathematician would denote this type as Z TODO. A Common Lisp programmer
-would use the type designator `integer`.
-
-### Being Of a Type
-
-What are some things we might want to do with a type? One thing might be to ask
-whether a particular object "is of that type", done with `typep` in Common Lisp
-(and `instanceOf` in Java, TODO in Python, etc). But what does this actually
-*mean*?
-
-When you're thinking of types as sets of objects, asking whether an object is of
-a particular type essentially asking if the object is a member of that set!
-
-Let's look at couple of examples:
-
- (typep 42 'integer)
-
-Here we're asking "Is `42` a member of the set of all integers?"
-
- (typep x 'symbol)
-
-Now we're asking "Is the object that `x` evaluates to a member of the set of all
-symbols?".
-
- (typecase foo
- (symbol ...a...)
- (integer ...b...)
- (number ...c...))
-
-And now we're saying "Evaluate `foo`. If the result is a member of the set of
-all symbols, do `a`. Otherwise if it's a member of the set of all integers, do
-`b`. Otherwise if it's a member of the set of all numbers, do `c`. Otherwise
-return `nil`.".
-
-### Subtypes and Supertypes
-
-So checking if an object is of a particular type is simply the set membership
-operation. It turns out that other set operations also have useful definitions
-when you think this way:
-
-* If `foo` is a supertype of `bar`, that means `foo` is a superset of `bar`.
-* If `bar` is a subtype of `foo`, that means `bar` is a subtype of `foo`.
-
-For example:
-
-* The set of all integers is a subset of the set of all real numbers, which
- makes `integer` a subtype of `real`.
-* The set of all symbols is a superset of the set of all keyword symbols, so
- `symbol` is a supertype of `keyword`.
-* The set of all floating point numbers is neither a subset nor a superset of
- the set of all symbols, so neither is a subtype of the other.
-
-Common Lisp's numeric tower consists of nice sequences of types that get more
-and more specific as you take further subsets:
-
-`number` ⊆ `real` ⊆ `rational` ⊆ `integer` ⊆ `fixnum`
-
-### Explicit Designation
-
-There are other ways to designate sets too. You can use the `member` type
-designator to just list out the members of a set by hand if you want:
-
- (member 1 2 3) ; => designates the set {1, 2, 3}
-
-### Everything and Nothing
-
-The type `t` is the set of all objects. The type `nil` is the empty set.
-
-This last one can sometimes cause confusion. The symbol `nil` is *not* of type
-`nil` because the symbol `nil` is not a member of the empty set (because it's
-empty!). If you want to talk about the set containing the symbol `nil`, that's
-called `null`:
-
-```lisp
-(typep nil nil) ; => NIL, nil is NOT a member of the empty set
-(typep nil null) ; => T, nil IS a member of { nil }
-```
-
-### Type Combinations
-
-Let's look at some more set operations. Set complement is probably the
-simplest, and Common Lisp supports this with the `not` compound type specifier
-TODO:
-
-```lisp
-(typep 1/2 '(not integer)) ; => T
-(typep "hello" '(not integer)) ; => T
-(typep 42 '(not integer)) ; => NIL
-```
-
-Set union is covered with `and`:
-
-```lisp
-(typep 1 '(or integer string)) ; => T
-(typep "hi" '(or integer string)) ; => T
-(typep :what '(or integer string)) ; => NIL
-```
-
-And of course set intersection is done with `and`:
-
-```lisp
-(typep 1 )
-```
-
-
-## Classes
-
-## Blurring the Line
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2021/01/small-common-lisp-cli-programs.markdown Sun Dec 27 22:08:26 2020 -0500
@@ -0,0 +1,854 @@
+(:title "Writing Small CLI Programs in Common Lisp"
+ :snip "Somewhere between tiny shell scripts and full projects."
+ :date "2021-01-02T15:50:00Z"
+ :draft t)
+
+I've found Common Lisp to be a good language for this. But it can be a little
+intimidating to get started, especially for beginners, because Common Lisp is
+a very flexible language and doesn't lock you into one way of working. In this
+post I'll describe how I write small, stand-alone command line programs in
+Common Lisp.
+
+<div id="toc"></div>
+
+## Requirements
+
+When you're writing programs in Common Lisp, you've got a lot of options. For
+this use case, laying out the requirements I have helped me decide on an
+approach.
+
+First, each new program should be one single file. A few ancillary files for
+all scripts together (e.g. a `Makefile`) are okay, but adding a new program
+should mean adding one single file. For larger programs a full project
+directory and ASDF system are great, but for small programs having one file per
+program reduces the overhead quite a bit.
+
+The programs need to be able to be developed in the typical Common Lisp
+interactive style (in my case: with Swank and VLIME). The interactive
+development is one of the best parts of working in Common Lisp, and I won't give
+it up. In particular, this means that a shell-script style approach, with
+`#!/path/to/sbcl --script` and the top and directly running code at the top
+level in the file does not work for two main reasons:
+
+* Loading that file will fail due to the shebang unless you have some ugly
+ reader macros in your startup file.
+* The program will need to do things like parsing command-line arguments and
+ exiting with an error code, and running `exit` would kill the Swank process.
+
+The programs need to be able to use libraries, so Quicklisp will need to be
+involved. Common Lisp has a lot of nice things built-in, but there's some
+libraries that are just too useful to pass up.
+
+The programs will need to have proper user interfaces. Command line arguments
+must be robustly parsed (e.g. collapsing `-a -b -c foo -d` into `-abc foo -d`
+should work as expected), malformed or unknown options must be caught instead of
+dropping them on the floor, error messages should be meaningful, and the
+`--help` should be thoroughly and thoughtfully written.
+
+Relying on some basic conventions (e.g. a command `foo` is always in `foo.lisp`
+and defines a package `foo` with a function called `toplevel`) is okay if it
+makes my life easier. These programs are just for me, so I don't have to worry
+about people wanting to create executables with spaces in the name or something.
+
+## Solution Skeleton
+
+After trying a number of different approaches, I've settled on a solution that
+I'm pretty happy with. First I'll describe the general approach, then we'll
+look at one actual example program in its entirety.
+
+### Directory Structure
+
+I keep all my small single-file Common Lisp programs in a `lisp` directory
+inside my dotfiles repository. Its contents look like this:
+
+```
+…/lisp/
+ bin/
+ foo
+ bar
+ man/
+ man1/
+ foo.1
+ bar.1
+ build-binary.sh
+ build-manual.sh
+ Makefile
+ foo.lisp
+ bar.lisp
+```
+
+The `bin` directory is where the executable files end up. I've added it to my
+`$PATH` so I don't have to symlink or copy the binaries anywhere.
+
+`man` contains the generated `man` pages. Because it's adjacent to `bin` (which
+is on my path) the `man` program automatically finds the `man` pages as
+expected.
+
+`build-binary.sh`, `build-manual.sh`, and `Makefile` are some glue to make
+building programs easier.
+
+The `.lisp` files are, of course, the programs. Each new program I want to add
+only requires adding the `<programname>.lisp` file in this directory and running
+`make`.
+
+### Lisp Files
+
+All my small Common Lisp programs follow a few conventions, which makes building
+them easier. Let's look at the skeleton of a `foo.lisp` file as an example.
+I'll show the entire file here, and then step through it piece by piece.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+ (ql:quickload '(… :with-user-abort) :silent t))
+
+(defpackage :foo
+ (:use :cl)
+ (:export :toplevel …))
+
+(in-package :foo)
+
+;;;; Configuration -----------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *some-option* nil)
+
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+ (:report "A foo is required, but none was supplied."))
+
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+ …)
+
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+ (map nil #'foo arguments))
+
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+ `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+ (with-user-abort:user-abort () (sb-ext:exit :code 130))))
+
+(defparameter *ui*
+ (adopt:make-interface
+ :name "foo"
+ …))
+
+(defun toplevel ()
+ (sb-ext:disable-debugger)
+ (exit-on-ctrl-c
+ (multiple-value-bind (arguments options)
+ (adopt:parse-options-or-exit *ui*)
+ … ; Handle options.
+ (handler-case (run arguments)
+ (user-error (e)
+ (format *error-output* "error: ~A~%" e)
+ (adopt:exit 1))))))
+```
+
+Let's go through each chunk of this.
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+ (ql:quickload '(:with-user-abort …) :silent t))
+```
+
+First we `quickload` any necessary libraries. We always want to do this, even
+when compiling the file, because we need the appropriate packages to be loaded
+when we try to use their symbols later in the file.
+
+[`with-user-abort`](https://github.com/compufox/with-user-abort) is a library
+for portably handling `control-c`, which all of these small programs use.
+
+```lisp
+(defpackage :foo
+ (:use :cl)
+ (:export :toplevel *ui*))
+
+(in-package :foo)
+```
+
+Next we define a package `foo` and switch to it. The package is always named
+the same as the resulting binary and the basename of the file. The package
+always exports the symbols `toplevel` and `*ui*`.
+
+```lisp
+;;;; Configuration -----------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *some-option* nil)
+```
+
+Next we define any configuration variables. These will be set later after
+parsing the command line arguments (when we run the command line program) or
+at the REPL (when developing interactively).
+
+```lisp
+;;;; Errors ------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-foo (user-error) ()
+ (:report "A foo is required, but none was supplied."))
+```
+
+We define a `user-error` condition, and any errors the user might make will
+inherit from it. This will make it easy to treat user errors (e.g. passing
+a mangled regular expression like `(foo+` as an argument) differently from
+programming errors (i.e. bugs). This makes it easier to treat those errors
+differently:
+
+* Bugs should print a backtrace or enter the debugger.
+* Expected user errors should print a helpful error message with no backtrace or debugger.
+
+```lisp
+;;;; Functionality -----------------------------------------------
+(defun foo (string)
+ …)
+```
+
+Next we have the actual functionality of the program.
+
+```lisp
+;;;; Run ---------------------------------------------------------
+(defun run (arguments)
+ (map nil #'foo arguments))
+```
+
+We now define a function `run` that takes some arguments (as strings) and
+performs the main work of the program.
+
+Importantly, `run` does **not** handle command line argument parsing, and it does
+**not** exit the program with an error code, which means we can safely call it
+to "run the program" when we're developing interactively without worrying about
+it killing our Lisp process.
+
+Finally, we need to define the command line interface.
+
+```lisp
+;;;; User Interface ----------------------------------------------
+(defmacro exit-on-ctrl-c (&body body)
+ `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+ (with-user-abort:user-abort () (sb-ext:exit :code 130))))
+```
+
+We'll make a little macro around `with-user-abort` to make it less wordy. We'll
+[exit with a status of 130](https://tldp.org/LDP/abs/html/exitcodes.html) if the
+user presses `ctrl-c`.
+
+```lisp
+(defparameter *ui*
+ (adopt:make-interface
+ :name "foo"
+ …))
+```
+
+Here we define the `*ui*` variable whose symbol we exported above. [Adopt][] is
+a command line argument parsing library I wrote. If you want to use a different
+library, feel free.
+
+[Adopt]: https://docs.stevelosh.com/adopt
+
+```lisp
+(defun toplevel ()
+ (sb-ext:disable-debugger)
+ (exit-on-ctrl-c
+ (multiple-value-bind (arguments options)
+ (adopt:parse-options-or-exit *ui*)
+ … ; Handle options.
+ (handler-case (run arguments)
+ (user-error (e)
+ (format *error-output* "error: ~A~%" e)
+ (adopt:exit 1))))))
+```
+
+And finally we define the `toplevel` function. This will only ever be called
+when the program is run as a standalone program, never interactively. It
+handles all the work beyond the main guts of the program that are handled by the
+`run` function:
+
+* Disabling or enabling the debugger.
+* Exiting the process with the appropriate status code on errors.
+* Parsing command line arguments.
+
+That's it for the structure of the `.lisp` files.
+
+### Building Binaries
+
+`build-binary.sh` is a small script to build the executable binaries from the
+`.lisp` files. `./build-binary.sh foo.lisp` will build `foo`:
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$1" .lisp)
+shift
+
+sbcl --load "$LISP" \
+ --eval "(sb-ext:save-lisp-and-die \"$NAME\"
+ :executable t
+ :save-runtime-options t
+ :toplevel '$NAME:toplevel)"
+```
+
+Here we see where the naming conventions have become important — we know that
+the package is named the same as the binary and that it will have the symbol
+`toplevel` exported, which names the entry point for the binary.
+
+### Building Man Pages
+
+`build-manual.sh` is similar and builds the `man` pages using [Adopt][]'s
+built-in `man` page generation. If you don't care about building `man` pages
+for your personal programs (I admit, it's a little bit silly) you can ignore
+this.
+
+```bash
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+LISP=$1
+NAME=$(basename "$LISP" .lisp)
+OUT="$NAME.1"
+shift
+
+sbcl --load "$LISP" \
+ --eval "(with-open-file (f \"$OUT\" :direction :output :if-exists :supersede)
+ (adopt:print-manual $NAME:*ui* :stream f))" \
+ --quit
+```
+
+This is why we always name the Adopt interface variable `*ui*` and export it
+from the package.
+
+### Makefile
+
+Finally we have a simple `Makefile` so we can run `make` to regenerate any
+out of date binaries and `man` pages:
+
+```make
+files := $(wildcard *.lisp)
+names := $(files:.lisp=)
+
+.PHONY: all clean $(names)
+
+all: $(names)
+
+$(names): %: bin/% man/man1/%.1
+
+bin/%: %.lisp build-binary.sh Makefile
+ mkdir -p bin
+ ./build-binary.sh $<
+ mv $(@F) bin/
+
+man/man1/%.1: %.lisp build-manual.sh Makefile
+ mkdir -p man/man1
+ ./build-manual.sh $<
+ mv $(@F) man/man1/
+
+clean:
+ rm -rf bin man
+```
+
+We use a `wildcard` to automatically find the `.lisp` files so we don't have to
+do anything other than adding a new file when we want to make a new program.
+
+The most notable line here is `$(names): %: bin/% man/man1/%.1` which uses
+a [static pattern rule](https://www.gnu.org/software/make/manual/html_node/Static-Pattern.html#Static-Pattern)
+to automatically define the phony rules for building each program. If
+`$(names)` is `foo bar` this line effectively defines two phony rules:
+
+```
+foo: bin/foo man/man1/foo.1
+bar: bin/bar man/man1/bar.1
+```
+
+This lets us run `make foo` to make the binary and `man` page for `foo.lisp`.
+
+## Case Study: A Batch Coloring Utility
+
+Now that we've seen the skeleton, let's look at one of my actual programs that
+I use all the time. It's called `batchcolor` and it's used to highlight regular
+expression matches in text (usually log files in my case) with a twist: each
+unique match is highlighted in a separate color, which makes it easier to
+visually parse the result.
+
+For example, suppose we have some log files with lines of the form `<timestamp>
+[<request ID>] <level> <message>` where request ID is a UUID, and messages might
+contain other UUIDs for various things. Such a log file might look something
+like this:
+
+```
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8b7543/
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543 is not cached, retrieving...
+2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] WARN User 28b2d548-eff1-471c-b807-cc2bcee76b7d does not have access to thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543, denying request.
+2021-01-02 14:01:46 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Returning HTTP 404.
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8d7543/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8d7543 is not cached, retrieving...
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] INFO Incoming request POST /users/sign-up/
+2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Returning HTTP 200.
+2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Error running SQL query: connection refused.
+2021-01-02 14:01:47 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Returning HTTP 500.
+```
+
+If I try to just read this directly, it's easy for my eyes to glaze over unless
+I laboriously read line-by-line. I can use `grep` to highlight the UUIDs, but
+that honestly doesn't help too much:
+
+ grep -P '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|$' foo.log
+
+`batchcolor` also highlights matches, but highlights each unique match in its
+own color:
+
+This is *much* easier for me to visually parse. The interleaving of separate
+request logs is now obvious from the colors of the IDs, and it's easy to match
+up various user IDs and thing IDs at a glance. Did you even notice that the two
+thing IDs were different before?
+
+`batchcolor` has a few other simple quality of life features, like picking
+explicit colors for specific strings (e.g. red for `ERR`):
+
+I wrap up this `batchcolor` invocation in an alias and use it to `tail` log
+files when developing locally almost every day, and it makes reading the log
+output *much* easier.
+
+Let's step through its code piece by piece.
+
+### Libraries
+
+```lisp
+(eval-when (:compile-toplevel :load-toplevel :execute)
+ (ql:quickload '(:adopt :cl-ppcre :with-user-abort) :silent t))
+```
+
+First we `quickload` libraries. We'll use [Adopt][] for command line argument
+processing, [cl-ppcre][] for regular expressions, and the previously-mentioned
+[with-user-abort][] to handle `control-c`.
+
+### Package
+
+```lisp
+(defpackage :batchcolor
+ (:use :cl)
+ (:export :toplevel :*ui*))
+
+(in-package :batchcolor)
+```
+
+We define and switch to the appropriately-named package. Nothing special here.
+
+### Configuration
+
+```lisp
+;;;; Configuration ------------------------------------------------------------
+(defparameter *version* "1.0.0")
+(defparameter *start* 0)
+(defparameter *dark* t)
+```
+
+Next we `defparameter` some variables to hold useful values (like the version)
+and settings. `*start*` will be used later when randomizing colors, don't worry
+about it for now.
+
+### Errors
+
+```lisp
+;;;; Errors -------------------------------------------------------------------
+(define-condition user-error (error) ())
+
+(define-condition missing-regex (user-error) ()
+ (:report "A regular expression is required."))
+
+(define-condition malformed-regex (user-error)
+ ((underlying-error :initarg :underlying-error))
+ (:report (lambda (c s)
+ (format s "Invalid regex: ~A" (slot-value c 'underlying-error)))))
+
+(define-condition overlapping-groups (user-error) ()
+ (:report "Invalid regex: seems to contain overlapping capturing groups."))
+
+(define-condition malformed-explicit (user-error)
+ ((spec :initarg :spec))
+ (:report (lambda (c s)
+ (format s "Invalid explicit spec ~S, must be of the form \"R,G,B:string\" with colors being 0-5."
+ (slot-value c 'spec)))))
+```
+
+Here we define the user errors. Some of these are self-explanatory, while
+others will make more sense later once we see them in action. The specific
+details aren't as important as the overall idea: for user errors we know might
+happen, display a helpful error message instead of just spewing a backtrace at
+the user.
+
+### Colorization
+
+Next we have the actual meat of the program. Obviously this is going to be
+completely different for every program, so feel free to skip this if you don't
+care about this specific problem.
+
+```lisp
+;;;; Functionality ------------------------------------------------------------
+(defun rgb-code (r g b)
+ ;; The 256 color mode color values are essentially r/g/b in base 6, but
+ ;; shifted 16 higher to account for the intiial 8+8 colors.
+ (+ (* r 36)
+ (* g 6)
+ (* b 1)
+ 16))
+```
+
+We're going to highlight different matches with different colors. We'll need
+a reasonable amount of colors to make this useful, so using the basic 8/16 ANSI
+colors isn't enough. Full 24-bit truecolor is overkill, but the 8-bit ANSI
+colors will work nicely. If we ignore the base colors, we essentially have
+6 x 6 x 6 = 216 colors to work with. `rgb-code` will take the red, green, and
+blue values from `0` to `5` and return the color code. See [Wikipedia][8bit]
+for more information.
+
+[8bit]: https://en.wikipedia.org/wiki/ANSI_escape_code#8-bit
+
+```lisp
+(defun make-colors (excludep)
+ (let ((result (make-array 256 :fill-pointer 0)))
+ (dotimes (r 6)
+ (dotimes (g 6)
+ (dotimes (b 6)
+ (unless (funcall excludep (+ r g b))
+ (vector-push-extend (rgb-code r g b) result)))))
+ result))
+
+(defparameter *dark-colors* (make-colors (lambda (v) (< v 3))))
+(defparameter *light-colors* (make-colors (lambda (v) (> v 11))))
+```
+
+Now we can build some arrays of colors. We *could* use any of the 216 available
+colors, but in practice we probably don't want to, because the darkest colors
+will be too dark to read on a dark terminal, and vice versa for light terminals.
+In a concession to practicality we'll generate two separate arrays of colors,
+one that excludes colors whose total value is too dark and one excluding those
+that are too light.
+
+(You might notice that `*dark-colors*` is "the array of colors for dark
+terminals" and not "the array of colors which are not light". Naming things is
+hard.)
+
+Note that these arrays will be generated when the `batchcolor.lisp` file is
+`load`ed, which is *when we build the binary*. They *won't* be recomputed every
+time you run the resulting binary. In this case it doesn't really matter (the
+arrays are small) but it's worth remembering in case you ever have some data you
+want (or don't want) to compute at build time instead of run time.
+
+```lisp
+(defparameter *explicits* (make-hash-table :test #'equal))
+```
+
+Here we make a hash table to store the strings and colors for strings we want to
+explicitly color (e.g. `ERR` should be red, `INFO` cyan). The keys will be the
+strings and values the RGB codes.
+
+```lisp
+(defun djb2 (string)
+ ;; http://www.cse.yorku.ca/~oz/hash.html
+ (reduce (lambda (hash c)
+ (mod (+ (* 33 hash) c) (expt 2 64)))
+ string
+ :initial-value 5381
+ :key #'char-code))
+
+(defun find-color (string)
+ (gethash string *explicits*
+ (let ((colors (if *dark* *dark-colors* *light-colors*)))
+ (aref colors
+ (mod (+ (djb2 string) *start*)
+ (length colors))))))
+```
+
+For strings that we want to explicitly color, we just look up the appropriate
+code in `*explicits*` and return it.
+
+Otherwise, we want to highlight unique matches in different colors. There are
+a number of different ways we could do this. For example, we could randomly
+pick a color the first time we see a string and store it in a hash table for
+subsequent encounters. But this would mean we'd grow over time, and one of the
+things I often use this utility for is `tail -f`ing long-running processes when
+developing locally, so the memory usage would grow and grow over time until the
+`batchcolor` process was restarted, which isn't ideal.
+
+Instead, we'll hash each string with a simple [DJB hash][djb] and use it to
+index into the appropriate array of colors. This ensures that identical matches
+get identical colors, and avoids having to store every match we've ever seen.
+
+We'll talk about `*start*` later, ignore it for now (it's `0` by default).
+
+[djb]: http://www.cse.yorku.ca/~oz/hash.html
+
+```lisp
+(defun ansi-color-start (color)
+ (format nil "~C[38;5;~Dm" #\Escape color))
+
+(defun ansi-color-end ()
+ (format nil "~C[0m" #\Escape))
+
+(defun print-colorized (string)
+ (format *standard-output* "~A~A~A"
+ (ansi-color-start (find-color string))
+ string
+ (ansi-color-end)))
+```
+
+Next we have some functions to output the appropriate ANSI escapes to highlight
+our matches. We could use a library for this but it's only 2 lines. [It's not
+worth it](http://xn--rpa.cc/irl/term.html).
+
+And now we have the beating heart of the program:
+
+```lisp
+(defun colorize-line (scanner line &aux (start 0))
+ (ppcre:do-scans (ms me rs re scanner line)
+ ;; If we don't have any register groups, colorize the entire match.
+ ;; Otherwise, colorize each matched capturing group.
+ (let* ((regs? (plusp (length rs)))
+ (starts (if regs? (remove nil rs) (list ms)))
+ (ends (if regs? (remove nil re) (list me))))
+ (map nil (lambda (word-start word-end)
+ (unless (<= start word-start)
+ (error 'overlapping-groups))
+ (write-string line *standard-output* :start start :end word-start)
+ (print-colorized (subseq line word-start word-end))
+ (setf start word-end))
+ starts ends)))
+ (write-line line *standard-output* :start start))
+```
+
+`colorize-line` takes a CL-PPCRE scanner and a line, and outputs the line with
+any of the desired matches colorized appropriately. There are a few things to
+note here.
+
+First: if the regular expression contains any capturing groups, we will only
+colorize those parts of the match. For example, if you run `batchcolor
+'^<(\\w+)> '` to colorize the nicks in an IRC log, only the nicknames themselves
+will be highlighted, not the surrounding angle brackets. Otherwise, if there
+are no capturing groups in the regular expression, we'll highlight the entire
+match (as if there were one big capturing group around the whole thing).
+
+Second: overlapping capturing groups are explicitly disallowed and
+a `user-error` signaled if we notice any. It's not clear what do to in this
+case — if we match `((f)oo|(b)oo)` against `foo`, what should the output be?
+Highlight `f` and `oo` in the same color? In different colors? Should the `oo`
+be a different color than the `oo` in `boo`? There's too many options with no
+clear winner, so we'll just tell the user to be more clear.
+
+To do the actual work, we iterate over each match and print the non-highlighted
+text before the match, then print the highlighted match. Finally we print any
+remaining text after the last match.
+
+### Not-Quite-Top-Level Interface
+
+```lisp
+;;;; Run ----------------------------------------------------------------------
+(defun run% (scanner stream)
+ (loop :for line = (read-line stream nil)
+ :while line
+ :do (colorize-line scanner line)))
+
+(defun run (pattern paths)
+ (let ((scanner (handler-case (ppcre:create-scanner pattern)
+ (ppcre:ppcre-syntax-error (c)
+ (error 'malformed-regex :underlying-error c))))
+ (paths (or paths '("-"))))
+ (dolist (path paths)
+ (if (string= "-" path)
+ (run% scanner *standard-input*)
+ (with-open-file (stream path :direction :input)
+ (run% scanner stream))))))
+```
+
+Here we have the not-quite-top-level interface to the program. `run` takes
+a pattern string and a list of paths and runs the colorization on each path.
+This is safe to call interactively from the REPL, e.g. `(run "<(\\w+)>"
+"foo.txt")`, so we can test without worrying about killing the Lisp process.
+
+### User Interface
+
+In the last chunk of the file we have the user interface. There are a couple of
+things to note here.
+
+I'm using a command line argument parsing library I wrote myself: [Adopt][].
+But if you prefer another library (and there are quite a few around) feel free
+to use it — it should be pretty easy to adapt this setup to a different library.
+The only things you'd need to change would be the `toplevel` function and the
+`build-manual.sh` script (if you even care about building `man` pages at all).
+
+You might also notice that the user interface for the program is almost as much
+code as the entire rest of the program. This may seem disconcerting at first,
+but I think it makes a certain kind of sense. When you're writing code to
+interface with an external system, a messier and more complicated external
+system will usually require more code than a cleaner and simpler external
+system. A human brain is probably the messiest and most complicated external
+system you'll ever have to deal with, so it's worth taking the extra time and
+code to be especially careful when writing an interface to it.
+
+```lisp
+(defparameter *option-help*
+ (adopt:make-option 'help
+ :help "Display help and exit."
+ :long "help"
+ :short #\h
+ :reduce (constantly t)))
+
+(defparameter *option-version*
+ (adopt:make-option 'version
+ :help "Display version information and exit."
+ :long "version"
+ :reduce (constantly t)))
+```
+
+```lisp
+(adopt:defparameters (*option-debug* *option-no-debug*)
+ (adopt:make-boolean-options 'debug
+ :long "debug"
+ :short #\d
+ :help "Enable the Lisp debugger."
+ :help-no "Disable the Lisp debugger (the default)."))
+```
+
+```lisp
+(adopt:defparameters (*option-randomize* *option-no-randomize*)
+ (adopt:make-boolean-options 'randomize
+ :help "Randomize the choice of color each run."
+ :help-no "Do not randomize the choice of color each run (the default)."
+ :long "randomize"
+ :short #\r))
+
+(adopt:defparameters (*option-dark* *option-light*)
+ (adopt:make-boolean-options 'dark
+ :name-no 'light
+ :long "dark"
+ :long-no "light"
+ :help "Optimize for dark terminals (the default)."
+ :help-no "Optimize for light terminals."
+ :initial-value t))
+```
+
+```lisp
+;;;; User Interface -----------------------------------------------------------
+(defun parse-explicit (spec)
+ (ppcre:register-groups-bind
+ ((#'parse-integer r g b) string)
+ ("^([0-5]),([0-5]),([0-5]):(.+)$" spec)
+ (return-from parse-explicit (cons string (rgb-code r g b))))
+ (error 'malformed-explicit :spec spec))
+
+(defparameter *option-explicit*
+ (adopt:make-option 'explicit
+ :parameter "R,G,B:STRING"
+ :help "Highlight STRING in an explicit color. May be given multiple times."
+ :manual (format nil "~
+ Highlight STRING in an explicit color instead of randomly choosing one. ~
+ R, G, and B must be 0-5. STRING is treated as literal string, not a regex. ~
+ Note that this doesn't automatically add STRING to the overall regex, you ~
+ must do that yourself! This is a known bug that may be fixed in the future.")
+ :long "explicit"
+ :short #\e
+ :key #'parse-explicit
+ :reduce #'adopt:collect))
+```
+
+```lisp
+(adopt:define-string *help-text*
+ "batchcolor takes a regular expression and matches it against standard ~
+ input one line at a time. Each unique match is highlighted in its own color.~@
+ ~@
+ If the regular expression contains any capturing groups, only those parts of ~
+ the matches will be highlighted. Otherwise the entire match will be ~
+ highlighted. Overlapping capturing groups are not supported.")
+
+(adopt:define-string *extra-manual-text*
+ "If no FILEs are given, standard input will be used. A file of - stands for ~
+ standard input as well.~@
+ ~@
+ Overlapping capturing groups are not supported because it's not clear what ~
+ the result should be. For example: what should ((f)oo|(b)oo) highlight when ~
+ matched against 'foo'? Should it highlight 'foo' in one color? The 'f' in ~
+ one color and 'oo' in another color? Should that 'oo' be the same color as ~
+ the 'oo' in 'boo' even though the overall match was different? There are too ~
+ many possible behaviors and no clear winner, so batchcolor disallows ~
+ overlapping capturing groups entirely.")
+
+(defparameter *examples*
+ '(("Colorize IRC nicknames in a chat log:"
+ . "cat channel.log | batchcolor '<(\\\\w+)>'")
+ ("Colorize UUIDs in a request log:"
+ . "tail -f /var/log/foo | batchcolor '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}'")
+ ("Colorize some keywords explicitly and IPv4 addresses randomly (note that the keywords have to be in the main regex too, not just in the -e options):"
+ . "batchcolor 'WARN|INFO|ERR|(?:[0-9]{1,3}\\\\.){3}[0-9]{1,3}' -e '5,0,0:ERR' -e '5,4,0:WARN' -e '2,2,5:INFO' foo.log")
+ ("Colorize earmuffed symbols in a Lisp file:"
+ . "batchcolor '(?:^|[^*])([*][-a-zA-Z0-9]+[*])(?:$|[^*])' tests/test.lisp")))
+```
+
+```lisp
+(defparameter *ui*
+ (adopt:make-interface
+ :name "batchcolor"
+ :usage "[OPTIONS] REGEX [FILE...]"
+ :summary "colorize regex matches in batches"
+ :help *help-text*
+ :manual (format nil "~A~2%~A" *help-text* *extra-manual-text*)
+ :examples *examples*
+ :contents (list
+ *option-help*
+ *option-version*
+ *option-debug*
+ *option-no-debug*
+ (adopt:make-group 'color-options
+ :title "Color Options"
+ :options (list *option-randomize*
+ *option-no-randomize*
+ *option-dark*
+ *option-light*
+ *option-explicit*)))))
+```
+
+### Top-Level Interface
+
+```lisp
+(defmacro exit-on-ctrl-c (&body body)
+ `(handler-case (with-user-abort:with-user-abort (progn ,@body))
+ (with-user-abort:user-abort () (adopt:exit 130))))
+
+(defun configure (options)
+ (loop :for (string . rgb) :in (gethash 'explicit options)
+ :do (setf (gethash string *explicits*) rgb))
+ (setf *start* (if (gethash 'randomize options)
+ (random 256 (make-random-state t))
+ 0)
+ *dark* (gethash 'dark options)))
+
+(defun toplevel ()
+ (sb-ext:disable-debugger)
+ (exit-on-ctrl-c
+ (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
+ (when (gethash 'debug options)
+ (sb-ext:enable-debugger))
+ (handler-case
+ (cond
+ ((gethash 'help options) (adopt:print-help-and-exit *ui*))
+ ((gethash 'version options) (write-line *version*) (adopt:exit))
+ ((null arguments) (error 'missing-regex))
+ (t (destructuring-bind (pattern . files) arguments
+ (configure options)
+ (run pattern files))))
+ (user-error (e) (adopt:print-error-and-exit e))))))
+```
+
+
+
+## More Information
+
+* ieure link
+* dotfiles repo link
+
+
+