e448a6cc762a

Start CHIP-8 entry
[view raw] [browse files]
author Steve Losh <steve@stevelosh.com>
date Sat, 17 Dec 2016 21:37:05 -0500
parents b65cd309d469
children 58287c034c3c
branches/tags (none)
files content/blog/2016/12/chip8-cpu.markdown static/media/css/sjl.css static/media/css/sjl.less

Changes

--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2016/12/chip8-cpu.markdown	Sat Dec 17 21:37:05 2016 -0500
@@ -0,0 +1,1000 @@
++++
+title = "CHIP-8 in Common Lisp: The CPU"
+snip = "Let's write an emulator."
+date = 2016-12-19T14:50:00Z
+draft = true
+
++++
+
+A while back I decided to try to write a Game Boy emulator in Common Lisp based
+on [this series of articles][imran].  I made some good progress but eventually
+got bogged down because I was trying to learn a bunch of complex new things at
+once:
+
+* How to write an emulator
+* How to use QT with Common Lisp
+* How the Game Boy works internally
+
+Instead of dragging on, I decided to take a break and try something simpler:
+a [CHIP-8][] emulator/interpreter.  The CHIP-8 is much simpler than the Game
+Boy, which made it easier to experiment with the rest of the infrastructure.
+
+In this post and a couple of future ones I'll walk through all of my CHIP-8
+emulator implementation.  To give you a rough idea of the size of the project,
+`cloc` reports:
+
+* Basic emulator: 415 lines
+* Debugging infrastructure: 142 lines
+* Screen GUI: 153 lines
+* Graphical debugger: 295 lines
+
+This first post will deal with emulating the CHIP-8's CPU.
+
+The full emulator source is on [BitBucket][] and [GitHub][].
+
+[imran]: http://imrannazar.com/GameBoy-Emulation-in-JavaScript
+[CHIP-8]: https://en.wikipedia.org/wiki/CHIP-8
+[BitBucket]: https://bitbucket.org/sjl/cl-chip8
+[GitHub]: https://github.com/sjl/cl-chip8
+
+<div id="toc"></div>
+
+## Libraries
+
+The emulator uses a few Common Lisp libraries to make things easier:
+
+* [bordeaux-threads][] to handle threading.
+* [cl-arrows][] for an implementation of Clojure's `-<>` threading macro.
+* [cl-losh][] is my own personal utility library.
+* [cl-portaudio][] for audio.
+* [iterate][] for a much nicer version of `loop` called `iterate`.  My utility
+  library contains several `iterate` drivers (some of which I've written about
+  before).
+* [qtools][] to handle creating a GUI with QT.
+* [quickutil][] for some miscellaneous utility functions from Alexandria and
+  elsewhere.
+
+I'll try to remember to mention whenever I use a function that's not built-in to
+Common Lisp, but I might forget, in which case it's probably in one of these.
+
+[bordeaux-threads]: https://common-lisp.net/project/bordeaux-threads/
+[cl-arrows]: https://github.com/nightfly19/cl-arrows
+[cl-losh]: https://github.com/sjl/cl-losh
+[cl-portaudio]: https://filonenko-mikhail.github.io/cl-portaudio/
+[iterate]: https://common-lisp.net/project/iterate/
+[qtools]: https://shinmera.github.io/qtools/
+[quickutil]: https://github.com/tarballs-are-good/quickutil
+
+## CHIP-8 References
+
+There's a good amount of information available about the CHIP-8 online.  The
+references I used most often were:
+
+* [Cowgod's "CHIP-8 Technical Reference"](http://devernay.free.fr/hacks/chip8/C8TECH10.HTM)
+* [Laurence Muller's "How to write an emulator (CHIP-8 interpreter)"](http://www.multigesture.net/articles/how-to-write-an-emulator-CHIP-8-interpreter/)
+* [Matthew Mikolay's "Mastering CHIP-8"](http://mattmik.com/files/chip8/mastering/chip8.html)
+* [(Super)CHIP 8 Secrets](https://github.com/AfBu/haxe-CHIP-8-emulator/wiki/\(Super\)CHIP-8-Secrets)
+
+## The Main Data Structure
+
+Let's dive into the code.  We'll start with the main data structure that will
+hold an instance of a CHIP-8 for emulation:
+
+
+```lisp
+(defstruct chip
+  (running t :type boolean)
+  ; ...more to come later
+  )
+```
+
+We're using a Lisp struct instead of a CLOS class because we're going to be
+accessing the fields of this thing *a lot* and CLOS accessors can be
+comparatively slow.  This is one of the very few concessions we'll make to
+performance.
+
+The first field is `running`, which is just a boolean that represents whether
+the emulator is currently running.
+
+We'll eventually have multiple threads poking at this struct, and they'll
+generally be doing something like `(loop :while (chip-running chip) :do ...)`.
+That way when we want to quit the emulator we can just set `running` to `nil`
+and everything will (eventually) stop.
+### Registers
+
+The CHIP-8 has sixteen main registers, as well as a few other special ones:
+
+
+```lisp
+(defstruct chip
+  ; ...
+  (registers (make-array 16 :element-type 'int8)
+             :type (simple-array int8 (16))
+             :read-only t)
+  (index 0 :type int16)
+  (program-counter #x200 :type int12)
+  ; ...
+  )
+```
+
+We'll keep the main registers in an array, and the others will just be separate
+slots.  I've added type declarations in the struct for two reasons:
+
+* With a high `safety` declaration many implementations (including SBCL, the one
+  I'm using) will type check at runtime to make sure we're setting things
+  appropriately.
+* With low `safety` and high `speed` declarations, SBCL can generate much faster
+  code if it knows the types of the struct slots.
+
+To make the integer types a bit less wordy I've defined a few simple synonyms:
+
+```lisp
+(deftype int4 () '(unsigned-byte 4))
+(deftype int8 () '(unsigned-byte 8))
+(deftype int12 () '(unsigned-byte 12))
+(deftype int16 () '(unsigned-byte 16))
+```
+
+So `registers` is a 16-element simple array of `(unsigned-byte 8)`s.  It's
+`read-only` because we'll be changing the *elements* of the array, but we should
+never be swapping out the entire array itself.
+
+The index register is 16 bits, and the program counter can store up to 12 bits
+(we'll never need more than that because of the size of the CHIP-8's memory).
+
+Also note that the program counter starts at address `#x200`, because that's
+where the ROM data eventually gets loaded into the CHIP-8 memory.
+
+### The Stack
+
+The CHIP-8 also has an internal stack to store return addresses when calling
+procedures.  We'll just use a Lisp vector with a fill pointer for this:
+
+```lisp
+(defstruct chip
+  ; ...
+  (stack (make-array 16 :element-type 'int12 :fill-pointer 0)
+         :type (vector int12 16)
+         :read-only t)
+  ; ...
+  )
+```
+
+### Memory
+
+The CHIP-8 has 4 kilobytes of main memory:
+
+```lisp
+(defconstant +memory-size+ (* 1024 4))
+
+(defstruct chip
+  ; ...
+  (memory (make-array +memory-size+ :element-type 'int8)
+          :type (simple-array int8 (#.+memory-size+))
+          :read-only t)
+  ; ...
+  )
+```
+
+Pretty simple.  Note the `#.` reader macro trick/hack to be able to use
+a variable where we normally need a raw type specifier.
+
+### Currently-Loaded ROM
+
+Finally we'll add a slot for keeping track of the path to the currently-loaded
+ROM, for easy resetting later:
+
+```lisp
+(defstruct chip
+  ; ...
+  (loaded-rom nil :type (or null string))
+  ; ...
+  )
+```
+
+That's it for now.  We'll need a few more slots once we get to things like
+graphics and sound, but I'll introduce them when we need them.
+
+### The Flag Register
+
+Register number 15, or `#xF` in hex, is special.  It's nicknamed the "flag"
+register, and gets set by some instructions.  We could just access it like the
+rest of the registers in our code:
+
+```lisp
+(setf (aref (chip-registers chip) 15) 1) ; set flag to 1
+(print (aref (chip-registers chip) #xF)) ; print the flag
+```
+
+But even with the `#xF` hex index to get that mnemonic "F" this is a bit too
+hard to read.  Let's define some extra reader and writer functions to clean
+things up:
+
+```lisp
+(defun-inline chip-flag (chip)
+  (aref (chip-registers chip) #xF))
+
+(defun-inline (setf chip-flag) (new-value chip)
+  (setf (aref (chip-registers chip) #xF) new-value))
+```
+
+[`defun-inline`][defun-inline] is from my utility library — it just `defun`s the
+function and `declaim`s it inline all in one step.  Now things are much nicer:
+
+```lisp
+(setf (chip-flag chip) 1) ; set flag to 1
+(print (chip-flag chip)) ; print the flag
+```
+
+[defun-inline]: https://github.com/sjl/cl-losh/blob/master/DOCUMENTATION.markdown#defun-inline-macro
+
+## Removing Tedium
+
+What we've got so far *works*, but I want to add one more piece of syntactic
+sugar before moving on.
+
+We're going to be accessing the slots of the `chip` struct a *lot*, and it's
+going to get tedious to write `(chip-SLOT chip)` over and over again.  For
+example, when we're resetting the emulator we'll have to do something like this:
+
+```lisp
+(defun reset (chip)
+  (fill (chip-memory chip) 0)
+  (fill (chip-registers chip) 0)
+  (setf (chip-running chip) t
+        (chip-program-counter chip) #x200
+        (fill-pointer (chip-stack chip)) 0))
+```
+
+This is annoying.  Languages like Javascript and Python use `.` for slot
+access, so it ends up being a bit more concise: `chip.memory()` instead of
+`(chip-memory chip)`.
+
+We could use Lisp's `with-accessors` to clean up the actual usage a bit:
+
+```lisp
+(defun reset (chip)
+  (with-accessors ((memory chip-memory)
+                   (registers chip-registers)
+                   (running chip-running)
+                   (program-counter chip-program-counter)
+                   (stack chip-stack))
+      chip
+    (fill memory 0)
+    (fill registers 0)
+    (setf running t
+          program-counter #x200
+          (fill-pointer stack) 0)))
+```
+
+The *usage* looks much nicer now (`(fill memory 0)` is wonderfully readable) but
+we haven't actually fixed anything.  We've just shifted all the typing a few
+lines up.  But this is Lisp, we can do better!
+
+Ideally what we'd like is to be able to do say something like `(with-chip (chip)
+...)` to mean the giant `with-accessors` form above:
+
+```lisp
+(defun reset (chip)
+  (with-chip (chip)
+    (fill memory 0)
+    (fill registers 0)
+    (setf running t
+          program-counter #x200
+          (fill-pointer stack) 0)))
+```
+
+This is nice and readable.  Some folks will dislike the fact that it introduces
+new variable bindings that are "hidden" in the macro definition, but I like the
+concision you get from it, especially for a small project like this.
+
+We could write `with-chip` ourselves, if we wanted to.  It would look something
+like this:
+
+```lisp
+(defmacro with-chip ((chip) &body body)
+  `(with-accessors ((memory chip-memory)
+                    (registers chip-registers)
+                    ; ...
+                    )
+       ,chip
+     ,@body)`
+)
+```
+
+It's not hard to write, just tedious.  But we're using Lisp, so anything tedious
+calls out for abstraction.  Another macro from my utility library is
+[`define-with-macro`][define-with-macro].  This is a macro-defining macro that
+we can use to define `with-chip` for us:
+
+```lisp
+(define-with-macro chip
+  running memory stack registers index program-counter flag)
+```
+
+If the hairiness of a macro-defining macro scares you, don't worry about it.  We
+could have just written it by hand as shown above.  I've just needed `with-FOO`
+macros like these often enough that it's been worth it to abstract away the
+tedium of writing them.
+
+So one way or another we've got a nice `with-chip` macro that will let us access
+the fields of our struct with bare names.
+
+[define-with-macro]: https://github.com/sjl/cl-losh/blob/master/DOCUMENTATION.markdown#define-with-macro-macro
+
+## Infrastructure
+
+We'll need to define a bit of infrastructure for running things before we jump
+into implementing the CHIP-8 CPU instructions.
+
+### Resetting
+
+Before we can emulate a ROM, we need to read it into our memory array.  I've
+chosen to do this in the `reset` function so we can just `(reset my-chip)` to
+reload everything at once — it wouldn't make much sense to load a new ROM
+*without* resetting the rest of the emulation state (though the results could
+be... "interesting").
+
+`reset` looks like this:
+
+```lisp
+(defun reset (chip)
+  (with-chip (chip)
+    (fill memory 0)
+    (fill registers 0)
+    (replace memory (read-file-into-byte-vector loaded-rom)
+             :start1 #x200)
+    (setf running t
+          program-counter #x200
+          (fill-pointer stack) 0))
+  (values))
+```
+
+This is pretty self explanatory, except for the actual ROM-loading bit:
+
+```lisp
+(replace memory (read-file-into-byte-vector loaded-rom)
+         :start1 #x200)
+```
+
+`replace` is just the standard Common Lisp `replace` function that copies the
+contents of one sequence into another.  `read-file-into-byte-vector` is from
+Alexandria, and will just read the file at `loaded-rom` into a byte vector.
+Then all we need to do is say that we want to start the copying at index `#x200`
+in the destination, because that's where the CHIP-8 ROM data is supposed start
+(there's other internal data (like font sprites) before it, which we'll see
+later).
+
+One last thing to note is that `reset` just returns `(values)`, i.e. it returns
+nothing at all.  This is something I sometimes do for functions I call at the
+REPL for side effects, to avoid returning a meaningless result.  I think of it
+as an application of [The Rule of Silence][] for Lisp.
+
+[The Rule of Silence]: http://www.linfo.org/rule_of_silence.html
+
+### Loading ROMs
+
+Now that we've got a reset function, it's trivial to define a function to load
+a new ROM into our emulator:
+
+```lisp
+(defun load-rom (chip filename)
+  (setf (chip-loaded-rom chip) filename)
+  (reset chip))
+```
+
+I didn't bother with the `with-chip` macro here because we're only accessing
+a single field.
+
+### The Main Loop(s)
+
+Now let's wrap things up into a nice interface.  We'll define a top-level `run`
+function that will be what we call to fire up the emulator:
+
+
+```lisp
+(defparameter *c* nil)
+
+(defun run (rom-filename)
+  (let ((chip (make-chip)))
+    (setf *c* chip)
+    (load-rom chip rom-filename)
+    (run-cpu chip)))
+```
+
+This will get more complicated in the future, but for now it's pretty simple.
+Make a new `chip` object, load the specified ROM, and start emulating.
+
+I've added a `*c*` global variable that's bound to the currently-running
+emulator so we can poke at it in NREPL or SLIME as it's running.
+
+Now to write `run-cpu`:
+
+```lisp
+(defconstant +cycles-per-second+ 500)
+(defconstant +cycles-before-sleep+ 10)
+
+(defun run-cpu (chip)
+  (iterate
+    (while (chip-running chip))
+    (emulate-cycle chip)
+    (for tick :every-nth +cycles-before-sleep+ :do
+         (sleep (/ +cycles-before-sleep+ +cycles-per-second+)))))
+```
+
+CHIP-8 was designed to run on much weaker hardware than we have these days.
+Even though we're emulating it, if we just run as fast as possible we'll be
+running *far* too fast to be playable.
+
+`+cycles-per-second+` is a constant describing our ideal emulation speed.
+Anything from 300 to 800 or so is playable, depending on how fast you want games
+to run.  We'll just use 500 for now as a happy medium.
+
+It would be wasteful to call `(sleep)` after every single cycle, so instead
+we'll batch them together: run 10 instructions, sleep for a bit, repeat.  This
+is the job of the `(for ... every-nth N)` iterate driver, which is also in my
+utility library.  Every 10 iterations through the loop it will sleep for the
+appropriate amount of time.
+
+The size of the cycle batches is arbitrary — larger batches will result in fewer
+`(sleep)` calls, but if you go too large you'll start noticing the emulator
+getting "jumpy".  10-cycle batches at 500 cycles per second means that the
+emulator will sleep for about 1/50 of a second each time, which isn't too
+noticeable.
+
+Now we've got the main loop all set up and just need to emulate each individual
+cycle.
+
+### Individual Cycles
+
+CHIP-8 instructions are each two bytes long, and are stored
+[big-endian](https://en.wikipedia.org/wiki/Endianness) in memory.  So to emulate
+a single cycle we:
+
+* Read the two bytes starting at the program counter and concatenate them to get the instruction.
+* Advance the program counter (to avoid having to do it inside every single instruction).
+* Dispatch to the appropriate instruction code.
+
+We'll define `emulate-cycle` and a couple of helper functions for this:
+
+```lisp
+(defun-inline chop (size integer)
+  (ldb (byte size 0) integer))
+
+(defun-inline cat-bytes (high-order low-order)
+  (dpb high-order (byte 8 8) low-order))
+
+(defun emulate-cycle (chip)
+  (with-chip (chip)
+   (let ((instruction (cat-bytes (aref memory program-counter)
+                                 (aref memory (1+ program-counter)))))
+     (zapf program-counter (chop 12 (+ % 2)))
+     (dispatch-instruction chip instruction))))
+```
+
+`chop` truncates an integer to the given number of bits.  `cat-bytes`
+concatenates two bytes.  This are pretty simple, but I prefer the more
+descriptive names over writing `dpb` (deposit byte) and `ldb` (load byte)
+everywhere.
+
+`zapf` is from my utility library.  I've written [an entire post][zapf] about
+it.
+
+What happens when the program counter is at the final index into memory?  None
+of the CHIP-8 references I found specify what should happen.  We could signal an
+error, or just wrap around to 0.  I've chosen the latter by chopping `(+
+program-counter 2)` to 12 bits, but signaling an error would be easy too.
+
+[zapf]: http://stevelosh.com/blog/2016/08/playing-with-syntax/
+
+### Instruction Dispatch
+
+The CHIP-8's instruction scheme is a bit different than most others that I've
+seen.
+
+For systems like the Game Boy instructions have a one- or two-byte opcode, with
+arguments following.  For systems like this you can dispatch on opcode by doing
+a giant `case` statement:
+
+```lisp
+(case opcode
+  (#x00 (op-foo ...))
+  (#x01 (op-bar ...))
+  ; ...
+  )
+```
+
+But a more efficient way to do it is often to shove all the `op-` functions into
+an array, and then just use the opcode itself as the index into the array to
+find the function:
+
+```lisp
+(funcall (aref opcodes opcode) ...)
+```
+
+But the CHIP-8 differentiates instructions a bit strangely.  All instructions
+are two bytes long, *including* their arguments.  Instead of just using the
+first N bits as the opcode, some instructions use a combination of high- and
+low-order bits.  For example, the logical `AND`, `OR`, and `XOR` are specified
+by starting the instruction with an `8` nibble and ending it with `1`, `2`, or
+`3`, with the "arguments" being the two nibbles in the middle:
+
+    8xy1 - OR Vx, Vy
+    8xy2 - AND Vx, Vy
+    8xy3 - XOR Vx, Vy
+
+This makes it rather annoying to use some kind of function table approach.
+There are only a few dozen opcodes, so instead of trying to hack something
+together we'll just use a big old `case` instead:
+
+```lisp
+(defun dispatch-instruction (chip instruction)
+  (macrolet ((call (name) `(,name chip instruction)))
+    (ecase (logand #xF000 instruction)
+      (#x0000 (ecase instruction
+                (#x00E0 (call op-cls))
+                (#x00EE (call op-ret))))
+      (#x1000 (call op-jp-imm))
+      (#x2000 (call op-call))
+      (#x3000 (call op-se-reg-imm))
+      (#x4000 (call op-sne-reg-imm))
+      (#x5000 (ecase (logand #x000F instruction)
+                (#x0 (call op-se-reg-reg))))
+      (#x6000 (call op-ld-reg<imm))
+      (#x7000 (call op-add-reg<imm))
+      (#x8000 (ecase (logand #x000F instruction)
+                (#x0 (call op-ld-reg<reg))
+                (#x1 (call op-or))
+                (#x2 (call op-and))
+                (#x3 (call op-xor))
+                (#x4 (call op-add-reg<reg))
+                (#x5 (call op-sub-reg<reg))
+                (#x6 (call op-shr))
+                (#x7 (call op-subn-reg<reg))
+                (#xE (call op-shl))))
+      (#x9000 (ecase (logand #x000F instruction)
+                (#x0 (call op-sne-reg-reg))))
+      (#xA000 (call op-ld-i<imm))
+      (#xB000 (call op-jp-imm+reg))
+      (#xC000 (call op-rand))
+      (#xD000 (call op-draw))
+      (#xE000 (ecase (logand #x00FF instruction)
+                (#x9E (call op-skp))
+                (#xA1 (call op-sknp))))
+      (#xF000 (ecase (logand #x00FF instruction)
+                (#x07 (call op-ld-reg<dt))
+                (#x0A (call op-ld-reg<key))
+                (#x15 (call op-ld-dt<reg))
+                (#x18 (call op-ld-st<reg))
+                (#x1E (call op-add-index<reg))
+                (#x29 (call op-ld-font<vx))
+                (#x33 (call op-ld-bcd<vx))
+                (#x55 (call op-ld-mem<regs))
+                (#x65 (call op-ld-regs<mem)))))))
+```
+
+The `macrolet` at the beginning saves us a bit of typing, but otherwise this is
+just a big boring table to figure out which instruction to call.
+
+That's all the boring infrastructure work we need — on to implementing the
+actual instructions.
+
+## Instructions
+
+The CHIP-8 support thirty six instructions, all of which we'll need to
+implement.  We'll start with the simpler ones, and we'll be leaving some of the
+others (the graphics/sound related ones) for later articles.
+
+(In contrast, the Game Boy has roughly five hundred instructions, so now you
+might see why I thought CHIP-8 would be simpler!)
+
+### Random Numbers
+
+We'll start with the `RND` instruction, which generates random numbers.  This
+may seem like an odd place to start, but I want to describe some extra syntactic
+sugar and `RND` is a nice standalone instruction to use as an example.
+
+Cowgod's reference describes `RND` like so:
+
+    Cxkk - RND Vx, byte
+    Set Vx = random byte AND kk.
+
+    The interpreter generates a random number from 0 to 255,
+    which is then ANDed with the value kk. The results are
+    stored in Vx.
+
+A first stab at the implementation in Lisp might look like this:
+
+
+```lisp
+(defun op-rand (chip instruction)
+  (let ((reg  (logand #x0F00 instruction))
+        (mask (logand #x00FF instruction)))
+    (with-chip (chip)
+      (setf (aref registers reg)
+            (logand (random 256) mask)))))
+```
+
+This would work, but it's an awful lot of code for something that just takes
+a sentence or two to describe.  We're going to be defining a lot of
+instructions, so let's take a few minutes and make our lives easier.
+
+### Define-Instruction
+
+We'll create a `define-instruction` macro that will abstract away some of the
+boring bits.  The goal is to end up with instruction definitions that read as
+close to the documentation as possible.
+
+We'll start by removing the need to use the `with-chip` macro.  Every
+instruction needs to deal with the `chip` struct, so let's not repeat ourselves
+thirty six times.  Every instruction will also take `chip` and `instruction`
+arguments, so we can remove those too:
+
+```lisp
+(defmacro define-instruction (name &body body)
+  `(defun ,name (chip instruction)
+      (declare (ignorable instruction))
+      (with-chip (chip)
+        ,@body)
+      nil))
+
+(define-instruction op-rand
+  (let ((reg  (logand #x0F00 instruction))
+        (mask (logand #x00FF instruction)))
+    (setf (aref registers reg)
+          (logand (random 256) mask))))
+```
+
+That's a little better.  We're going to be calling these instructions a *lot*,
+so it wouldn't hurt to add a type declaration for each function.  This is easy
+because they all take and return the same types:
+
+```lisp
+(defmacro define-instruction (name &body body)
+  `(progn
+    (declaim (ftype (function (chip int16) null) ,name)) ; NEW
+    (defun ,name (chip instruction)
+      (declare (ignorable instruction))
+      (with-chip (chip)
+        ,@body)
+      nil)))
+```
+
+Now all of our instruction functions will have type hints, and SBCL can check
+them (if `safety` is high) or use them to generate faster code (if `speed` is
+high).
+
+We'll be accessing the values of registers quite often (`(aref registers ...)`)
+so let's add a little `macrolet` to make that read nicer:
+
+```lisp
+(defmacro define-instruction (name &body body)
+  `(progn
+    (declaim (ftype (function (chip int16) null) ,name))
+    (defun ,name (chip instruction)
+      (declare (ignorable instruction))
+      (with-chip (chip)
+        (macrolet ((register (index)             ; NEW
+                     `(aref registers ,index)))  ; NEW
+          ,@body))
+      nil)))
+
+(define-instruction op-rand
+  (let ((reg  (logand #x0F00 instruction))
+        (mask (logand #x00FF instruction)))
+    (setf (register reg)                         ; NEW
+          (logand (random 256) mask))))
+```
+
+Now instead of `(aref registers ...)` we can just say `(register ...)`.  Cool.
+
+```lisp
+(defmacro define-instruction (name &body body)
+  `(progn
+    (declaim (ftype (function (chip int16) null) ,name))
+    (defun ,name (chip instruction)
+      (declare (ignorable instruction))
+      (with-chip (chip)
+        (macrolet ((register (index)
+                     `(aref registers ,index)))
+          (let ,(parse-instruction-argument-bindings argument-list)
+            ,@body))
+        nil))))
+
+(define-instruction op-rand
+  (let ((reg  (logand #x0F00 instruction))
+        (mask (logand #x00FF instruction)))
+    (setf (aref registers reg)
+          (logand (random 256) mask))))
+```
+
+The one thing that still bothers me is having to manually pull the instruction
+arguments out of the instruction with bitmasking.  It would be much nicer if we
+could just declare what the arguments look like and have the computer generate
+the appropriate masking code to deal with them.  The goal is to be able to write
+something like this:
+
+```lisp
+(define-instruction op-rand (_ r (mask 2))              ;; RND
+  (setf (register r)
+        (logand (random 256) mask)))
+```
+
+The `(_ r (mask 2))` is the "argument list" of the instruction.  It says:
+
+* A single-nibble value which we don't care about: `_`
+* A single-nibble value should be bound to `r`: `r`
+* A two-nibble value should be bound to `mask`: `(mask 2)`
+
+Compare the Lisp code and Cowgod's documentation:
+
+```
+Cxkk - RND Vx, byte
+Set Vx = random byte AND kk.
+
+(define-instruction op-rand (_ r (mask 2))              ;; RND
+  (setf (register r)
+        (logand (random 256) mask)))
+```
+
+Cowgod uses `V` as a shorthand for `register`, and calls the argument `x`
+instead of `r`, but otherwise I think this is pretty damn close to the
+documentation for actual running code.
+
+So let's implement the argument-parsing macro.  Feel free to skip this if you're
+not really comfortable with macros.  But it's a good demonstration of how you
+can use them to get a nice clean language for a simple project.
+
+First we'll wrap the `body` in a `let`:
+
+```lisp
+                                   ; NEW
+(defmacro define-instruction (name argument-list &body body)
+  `(progn
+    (declaim (ftype (function (chip int16) null) ,name))
+    (defun ,name (chip instruction)
+      (declare (ignorable instruction))
+      (with-chip (chip)
+        (macrolet ((register (index)
+                     `(aref registers ,index)))
+          (let ,(parse-instruction-argument-bindings argument-list) ; NEW
+            ,@body))
+        nil))))
+```
+
+We'll use a helper function to parse the argument list into a list of `let`
+bindings, instead of trying to cram it all into this one macro.  It might look
+scary, but it's not actually too bad:
+
+```lisp
+(defun parse-instruction-argument-bindings (argument-list)
+  (flet ((normalize-arg (arg)
+           (destructuring-bind (symbol &optional (nibbles 1))
+               (ensure-list arg)
+             (list symbol nibbles))))
+    (iterate
+      (for (symbol nibbles) :in (mapcar #'normalize-arg argument-list))
+      (for position :first 3 :then (- position nibbles))
+      (when (not (eql symbol '_))
+        (collect `(,symbol (ldb (byte ,(* nibbles 4)
+                                      ,(* position 4))
+                                instruction)))))))
+```
+
+`normalize-arg` takes each argument from the list and turns it into `(symbol
+nibbles)`, so we don't have to write the lengths for single-nibble arguments
+like this: `((_ 1) (r 1) (mask 2))`.
+
+Then we loop through each `(symbol nibbles)` pair in the argument list.  We
+start at position `3` in the byte (the highest-order nibble) because we want to
+write the argument list from left to right, high-order to low-order, like the
+documentation.
+
+For each pair, if the symbol is anything other than `_` we collect a `let`
+binding with the appropriate `ldb` call for it.  So for our example of `(_
+r (mask 2))` we end up with two `let` bindings:
+
+* `(r    (ldb (byte 4 8) instruction))`
+* `(mask (ldb (byte 8 0) instruction))`
+
+Notice that we only bother parsing out the bindings we *need* from the
+instruction.  For instructions that don't need any arguments we'll end up with
+an empty `(let () ...)` which the compiler will optimize away.
+
+This might seem a bit hairy.  The nice thing about using a helper function like
+this is that we can call it on its own with a variety of argument lists to see
+what gets generated:
+
+```
+(map nil #'print
+  (parse-instruction-argument-bindings '(_ r (mask 2))))
+
+(R (LDB (BYTE 4 8) INSTRUCTION))
+(MASK (LDB (BYTE 8 0) INSTRUCTION))
+
+(map nil #'print
+  (parse-instruction-argument-bindings '(_ x y _)))
+
+(X (LDB (BYTE 4 8) INSTRUCTION))
+(Y (LDB (BYTE 4 4) INSTRUCTION))
+
+(map nil #'print
+  (parse-instruction-argument-bindings '(_ (foo 3))))
+
+(FOO (LDB (BYTE 12 0) INSTRUCTION))
+```
+
+And that wraps up `define-instruction`.  So instead of writing `RND` as
+a vanilla function:
+
+```lisp
+(defun op-rand (chip instruction)
+  (let ((reg  (logand #x0F00 instruction))
+        (mask (logand #x00FF instruction)))
+    (with-chip (chip)
+      (setf (aref registers reg)
+            (logand (random 256) mask)))))
+```
+
+We can do something much cleaner:
+
+```lisp
+(define-instruction op-rand (_ r (mask 2))              ;; RND
+  (setf (register r)
+        (logand (random 256) mask)))
+```
+
+This is going to pay off nicely as we implement the other thirty five
+instructions.
+
+### Jumps and Calls
+
+The CHIP-8 has two instructions for jumping directly to an address.  The first
+jumps to an immediate, literal address.  The second adds an immediate value to
+whatever is in register `#x0` and jumps to the result:
+
+```lisp
+(define-instruction op-jp-imm (_ (target 3))            ;; JP addr
+  (setf program-counter target))
+
+(define-instruction op-jp-imm+reg (_ (target 3))        ;; JP V0 + addr
+  (setf program-counter (chop 12 (+ target (register 0)))))
+```
+
+Our `define-instruction` macro is already paying off.  The immediate values of
+these arguments are three nibbles each, but our macro doesn't care.
+
+There are also two instructions for implementing traditional function calls and
+returns using the CHIP-8's stack:
+
+```lisp
+(define-instruction op-call (_ (target 3))              ;; CALL addr
+  (vector-push program-counter stack)
+  (setf program-counter target))
+
+(define-instruction op-ret ()                           ;; RET
+  (setf program-counter (vector-pop stack)))
+```
+
+We just use the built in `vector-push` and `vector-pop` functions to manage the
+stack's fill pointer trivially.
+
+Note that we're NOT using `vector-push-extend` because the CHIP-8 stack is
+defined by the spec to only hold at most sixteen addresses — any ROM that tries
+to push more is broken.
+
+### Binary-Coded Decimal
+
+Let's take another self-contained instruction next: `BCD`, which stands for
+[binary-coded decimal][BCD] (some references call this instruction `LD B, Vx`
+but I think it's different enough to deserve its own name).
+
+The problem this instruction is designed to solve goes something like this.
+Let's say you've got a game where the player has a score that changes over time.
+You can store the player's score in a register or memory somewhere, and
+add/subtract to/from it with the normal arithmetic opcodes no problem.
+
+But now you want to *display* the score to the player, and you probably want to
+do that in base 10 because humans are generally bad at reading binary or hex.
+So how do you take a byte like `#xA5` and determine the hundreds, tens, and ones
+sprites to draw on the screen to read "165 points"?
+
+This is what the `BCD` instruction does.  It takes a single argument (the
+register) and stores the hundreds, tens, and ones digits of that register's
+value into three separate bytes of memory, starting at wherever the `index`
+register is currently pointing.
+
+<pre class="lineart">
+    Register V3   Index Register
+    ┌───┐         ┌───┐
+    │135│         │...│
+    └───┘         └─┬─┘
+      ┌─────────────┘
+      │
+      ▼
+    ┌───┬───┬───┬───┬───┐
+    │   │   │   │   │   │   Memory
+    └───┴───┴───┴───┴───┘
+
+    Run BCD V3 instruction =================
+
+    Register V3   Index Register
+    ┌───┐         ┌───┐
+    │135│         │...│
+    └───┘         └─┬─┘
+      ┌─────────────┘
+      │
+      ▼
+    ┌───┬───┬───┬───┬───┐
+    │ 1 │ 3 │ 5 │   │   │   Memory
+    └───┴───┴───┴───┴───┘
+</pre>
+
+We'll split this into two parts to make it easier to read.  First we'll make
+a `bcd` function to actually get the digits:
+
+```lisp
+(defun-inline bcd (integer)
+  (values (-<> integer (floor <> 100) (mod <> 10))
+          (-<> integer (floor <> 10) (mod <> 10))
+          (-<> integer (floor <> 1) (mod <> 10))))
+
+```
+
+And make sure it works on its own:
+
+```
+[SBCL] CHIP8> (bcd 135)
+
+1
+3
+5
+```
+
+And then we can define the actual operation:
+
+```lisp
+(define-instruction op-ld-bcd<vx (_ r _ _)              ;; LD B, Vx
+  (setf (values (aref memory (+ index 0)) ; hundreds
+                (aref memory (+ index 1)) ; tens
+                (aref memory (+ index 2))) ; ones
+        (bcd (register r))))
+```
+
+This takes advantage of the fact that you can use `(setf (values ...) ...)` to
+assign all the multiple values returned by `bcd` at once, without binding them
+to local variables (though because `bcd` itself is inline, there won't actually
+be any *returning* at all...).
+
+[BCD]: https://en.wikipedia.org/wiki/Binary-coded_decimal
+
+
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+```lisp
+```
+
+
--- a/static/media/css/sjl.css	Sat Dec 17 13:54:40 2016 -0500
+++ b/static/media/css/sjl.css	Sat Dec 17 21:37:05 2016 -0500
@@ -19,7 +19,7 @@
 body h4,
 body h5,
 body h6 {
-  font-family: 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font-family: HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   font-weight: normal;
 }
 body h1 a,
@@ -134,7 +134,7 @@
 }
 body .wrap .top header {
   float: left;
-  font-family: 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font-family: HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   font-size: 23px;
   line-height: 50px;
   text-transform: lowercase;
@@ -151,7 +151,7 @@
   color: #e50053;
 }
 body .wrap .top nav {
-  font: normal 18px/50px 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font: normal 18px/50px HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   text-align: right;
   text-transform: lowercase;
   padding-right: 2px;
@@ -195,7 +195,7 @@
 .splash {
   color: #454545;
   text-align: center;
-  font: normal 27px/32px 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font: normal 27px/32px HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   margin-bottom: -1px;
   padding-top: 0;
 }
@@ -226,7 +226,7 @@
   margin: 0 0 25px 0;
 }
 .section-listing ol li a {
-  font: normal 23px/32px 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font: normal 23px/32px HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   color: #222222;
   display: block;
 }
@@ -348,7 +348,7 @@
   color: #999;
   font-size: 23px;
   font-style: italic;
-  font: italic 23px 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
+  font: italic 23px HoeflerText-Regular, 'Hoefler Text', 'Goudy Old Style', 'Palatino', 'Palatino Linotype', serif;
   line-height: 30px;
   position: fixed;
   top: 75px;
--- a/static/media/css/sjl.less	Sat Dec 17 13:54:40 2016 -0500
+++ b/static/media/css/sjl.less	Sat Dec 17 21:37:05 2016 -0500
@@ -1,7 +1,7 @@
 @c-link: #e50053;
 @c-body: #222;
 @w-body: 612px;
-@font-fancy: 'Hoefler Text','Goudy Old Style','Palatino','Palatino Linotype',serif;
+@font-fancy: HoeflerText-Regular,'Hoefler Text','Goudy Old Style','Palatino','Palatino Linotype',serif;
 
 html {
     overflow-y: scroll;