# HG changeset patch # User Steve Losh # Date 1482028625 18000 # Node ID e448a6cc762a918584d5dcf87ae6db9ac3454464 # Parent b65cd309d469e4e745a1acfda63993d6232e9e3f Start CHIP-8 entry diff -r b65cd309d469 -r e448a6cc762a content/blog/2016/12/chip8-cpu.markdown --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/blog/2016/12/chip8-cpu.markdown Sat Dec 17 21:37:05 2016 -0500 @@ -0,0 +1,1000 @@ ++++ +title = "CHIP-8 in Common Lisp: The CPU" +snip = "Let's write an emulator." +date = 2016-12-19T14:50:00Z +draft = true + ++++ + +A while back I decided to try to write a Game Boy emulator in Common Lisp based +on [this series of articles][imran]. I made some good progress but eventually +got bogged down because I was trying to learn a bunch of complex new things at +once: + +* How to write an emulator +* How to use QT with Common Lisp +* How the Game Boy works internally + +Instead of dragging on, I decided to take a break and try something simpler: +a [CHIP-8][] emulator/interpreter. The CHIP-8 is much simpler than the Game +Boy, which made it easier to experiment with the rest of the infrastructure. + +In this post and a couple of future ones I'll walk through all of my CHIP-8 +emulator implementation. To give you a rough idea of the size of the project, +`cloc` reports: + +* Basic emulator: 415 lines +* Debugging infrastructure: 142 lines +* Screen GUI: 153 lines +* Graphical debugger: 295 lines + +This first post will deal with emulating the CHIP-8's CPU. + +The full emulator source is on [BitBucket][] and [GitHub][]. + +[imran]: http://imrannazar.com/GameBoy-Emulation-in-JavaScript +[CHIP-8]: https://en.wikipedia.org/wiki/CHIP-8 +[BitBucket]: https://bitbucket.org/sjl/cl-chip8 +[GitHub]: https://github.com/sjl/cl-chip8 + +
+ +## Libraries + +The emulator uses a few Common Lisp libraries to make things easier: + +* [bordeaux-threads][] to handle threading. +* [cl-arrows][] for an implementation of Clojure's `-<>` threading macro. +* [cl-losh][] is my own personal utility library. +* [cl-portaudio][] for audio. +* [iterate][] for a much nicer version of `loop` called `iterate`. My utility + library contains several `iterate` drivers (some of which I've written about + before). +* [qtools][] to handle creating a GUI with QT. +* [quickutil][] for some miscellaneous utility functions from Alexandria and + elsewhere. + +I'll try to remember to mention whenever I use a function that's not built-in to +Common Lisp, but I might forget, in which case it's probably in one of these. + +[bordeaux-threads]: https://common-lisp.net/project/bordeaux-threads/ +[cl-arrows]: https://github.com/nightfly19/cl-arrows +[cl-losh]: https://github.com/sjl/cl-losh +[cl-portaudio]: https://filonenko-mikhail.github.io/cl-portaudio/ +[iterate]: https://common-lisp.net/project/iterate/ +[qtools]: https://shinmera.github.io/qtools/ +[quickutil]: https://github.com/tarballs-are-good/quickutil + +## CHIP-8 References + +There's a good amount of information available about the CHIP-8 online. The +references I used most often were: + +* [Cowgod's "CHIP-8 Technical Reference"](http://devernay.free.fr/hacks/chip8/C8TECH10.HTM) +* [Laurence Muller's "How to write an emulator (CHIP-8 interpreter)"](http://www.multigesture.net/articles/how-to-write-an-emulator-CHIP-8-interpreter/) +* [Matthew Mikolay's "Mastering CHIP-8"](http://mattmik.com/files/chip8/mastering/chip8.html) +* [(Super)CHIP 8 Secrets](https://github.com/AfBu/haxe-CHIP-8-emulator/wiki/\(Super\)CHIP-8-Secrets) + +## The Main Data Structure + +Let's dive into the code. We'll start with the main data structure that will +hold an instance of a CHIP-8 for emulation: + + +```lisp +(defstruct chip + (running t :type boolean) + ; ...more to come later + ) +``` + +We're using a Lisp struct instead of a CLOS class because we're going to be +accessing the fields of this thing *a lot* and CLOS accessors can be +comparatively slow. This is one of the very few concessions we'll make to +performance. + +The first field is `running`, which is just a boolean that represents whether +the emulator is currently running. + +We'll eventually have multiple threads poking at this struct, and they'll +generally be doing something like `(loop :while (chip-running chip) :do ...)`. +That way when we want to quit the emulator we can just set `running` to `nil` +and everything will (eventually) stop. +### Registers + +The CHIP-8 has sixteen main registers, as well as a few other special ones: + + +```lisp +(defstruct chip + ; ... + (registers (make-array 16 :element-type 'int8) + :type (simple-array int8 (16)) + :read-only t) + (index 0 :type int16) + (program-counter #x200 :type int12) + ; ... + ) +``` + +We'll keep the main registers in an array, and the others will just be separate +slots. I've added type declarations in the struct for two reasons: + +* With a high `safety` declaration many implementations (including SBCL, the one + I'm using) will type check at runtime to make sure we're setting things + appropriately. +* With low `safety` and high `speed` declarations, SBCL can generate much faster + code if it knows the types of the struct slots. + +To make the integer types a bit less wordy I've defined a few simple synonyms: + +```lisp +(deftype int4 () '(unsigned-byte 4)) +(deftype int8 () '(unsigned-byte 8)) +(deftype int12 () '(unsigned-byte 12)) +(deftype int16 () '(unsigned-byte 16)) +``` + +So `registers` is a 16-element simple array of `(unsigned-byte 8)`s. It's +`read-only` because we'll be changing the *elements* of the array, but we should +never be swapping out the entire array itself. + +The index register is 16 bits, and the program counter can store up to 12 bits +(we'll never need more than that because of the size of the CHIP-8's memory). + +Also note that the program counter starts at address `#x200`, because that's +where the ROM data eventually gets loaded into the CHIP-8 memory. + +### The Stack + +The CHIP-8 also has an internal stack to store return addresses when calling +procedures. We'll just use a Lisp vector with a fill pointer for this: + +```lisp +(defstruct chip + ; ... + (stack (make-array 16 :element-type 'int12 :fill-pointer 0) + :type (vector int12 16) + :read-only t) + ; ... + ) +``` + +### Memory + +The CHIP-8 has 4 kilobytes of main memory: + +```lisp +(defconstant +memory-size+ (* 1024 4)) + +(defstruct chip + ; ... + (memory (make-array +memory-size+ :element-type 'int8) + :type (simple-array int8 (#.+memory-size+)) + :read-only t) + ; ... + ) +``` + +Pretty simple. Note the `#.` reader macro trick/hack to be able to use +a variable where we normally need a raw type specifier. + +### Currently-Loaded ROM + +Finally we'll add a slot for keeping track of the path to the currently-loaded +ROM, for easy resetting later: + +```lisp +(defstruct chip + ; ... + (loaded-rom nil :type (or null string)) + ; ... + ) +``` + +That's it for now. We'll need a few more slots once we get to things like +graphics and sound, but I'll introduce them when we need them. + +### The Flag Register + +Register number 15, or `#xF` in hex, is special. It's nicknamed the "flag" +register, and gets set by some instructions. We could just access it like the +rest of the registers in our code: + +```lisp +(setf (aref (chip-registers chip) 15) 1) ; set flag to 1 +(print (aref (chip-registers chip) #xF)) ; print the flag +``` + +But even with the `#xF` hex index to get that mnemonic "F" this is a bit too +hard to read. Let's define some extra reader and writer functions to clean +things up: + +```lisp +(defun-inline chip-flag (chip) + (aref (chip-registers chip) #xF)) + +(defun-inline (setf chip-flag) (new-value chip) + (setf (aref (chip-registers chip) #xF) new-value)) +``` + +[`defun-inline`][defun-inline] is from my utility library — it just `defun`s the +function and `declaim`s it inline all in one step. Now things are much nicer: + +```lisp +(setf (chip-flag chip) 1) ; set flag to 1 +(print (chip-flag chip)) ; print the flag +``` + +[defun-inline]: https://github.com/sjl/cl-losh/blob/master/DOCUMENTATION.markdown#defun-inline-macro + +## Removing Tedium + +What we've got so far *works*, but I want to add one more piece of syntactic +sugar before moving on. + +We're going to be accessing the slots of the `chip` struct a *lot*, and it's +going to get tedious to write `(chip-SLOT chip)` over and over again. For +example, when we're resetting the emulator we'll have to do something like this: + +```lisp +(defun reset (chip) + (fill (chip-memory chip) 0) + (fill (chip-registers chip) 0) + (setf (chip-running chip) t + (chip-program-counter chip) #x200 + (fill-pointer (chip-stack chip)) 0)) +``` + +This is annoying. Languages like Javascript and Python use `.` for slot +access, so it ends up being a bit more concise: `chip.memory()` instead of +`(chip-memory chip)`. + +We could use Lisp's `with-accessors` to clean up the actual usage a bit: + +```lisp +(defun reset (chip) + (with-accessors ((memory chip-memory) + (registers chip-registers) + (running chip-running) + (program-counter chip-program-counter) + (stack chip-stack)) + chip + (fill memory 0) + (fill registers 0) + (setf running t + program-counter #x200 + (fill-pointer stack) 0))) +``` + +The *usage* looks much nicer now (`(fill memory 0)` is wonderfully readable) but +we haven't actually fixed anything. We've just shifted all the typing a few +lines up. But this is Lisp, we can do better! + +Ideally what we'd like is to be able to do say something like `(with-chip (chip) +...)` to mean the giant `with-accessors` form above: + +```lisp +(defun reset (chip) + (with-chip (chip) + (fill memory 0) + (fill registers 0) + (setf running t + program-counter #x200 + (fill-pointer stack) 0))) +``` + +This is nice and readable. Some folks will dislike the fact that it introduces +new variable bindings that are "hidden" in the macro definition, but I like the +concision you get from it, especially for a small project like this. + +We could write `with-chip` ourselves, if we wanted to. It would look something +like this: + +```lisp +(defmacro with-chip ((chip) &body body) + `(with-accessors ((memory chip-memory) + (registers chip-registers) + ; ... + ) + ,chip + ,@body)` +) +``` + +It's not hard to write, just tedious. But we're using Lisp, so anything tedious +calls out for abstraction. Another macro from my utility library is +[`define-with-macro`][define-with-macro]. This is a macro-defining macro that +we can use to define `with-chip` for us: + +```lisp +(define-with-macro chip + running memory stack registers index program-counter flag) +``` + +If the hairiness of a macro-defining macro scares you, don't worry about it. We +could have just written it by hand as shown above. I've just needed `with-FOO` +macros like these often enough that it's been worth it to abstract away the +tedium of writing them. + +So one way or another we've got a nice `with-chip` macro that will let us access +the fields of our struct with bare names. + +[define-with-macro]: https://github.com/sjl/cl-losh/blob/master/DOCUMENTATION.markdown#define-with-macro-macro + +## Infrastructure + +We'll need to define a bit of infrastructure for running things before we jump +into implementing the CHIP-8 CPU instructions. + +### Resetting + +Before we can emulate a ROM, we need to read it into our memory array. I've +chosen to do this in the `reset` function so we can just `(reset my-chip)` to +reload everything at once — it wouldn't make much sense to load a new ROM +*without* resetting the rest of the emulation state (though the results could +be... "interesting"). + +`reset` looks like this: + +```lisp +(defun reset (chip) + (with-chip (chip) + (fill memory 0) + (fill registers 0) + (replace memory (read-file-into-byte-vector loaded-rom) + :start1 #x200) + (setf running t + program-counter #x200 + (fill-pointer stack) 0)) + (values)) +``` + +This is pretty self explanatory, except for the actual ROM-loading bit: + +```lisp +(replace memory (read-file-into-byte-vector loaded-rom) + :start1 #x200) +``` + +`replace` is just the standard Common Lisp `replace` function that copies the +contents of one sequence into another. `read-file-into-byte-vector` is from +Alexandria, and will just read the file at `loaded-rom` into a byte vector. +Then all we need to do is say that we want to start the copying at index `#x200` +in the destination, because that's where the CHIP-8 ROM data is supposed start +(there's other internal data (like font sprites) before it, which we'll see +later). + +One last thing to note is that `reset` just returns `(values)`, i.e. it returns +nothing at all. This is something I sometimes do for functions I call at the +REPL for side effects, to avoid returning a meaningless result. I think of it +as an application of [The Rule of Silence][] for Lisp. + +[The Rule of Silence]: http://www.linfo.org/rule_of_silence.html + +### Loading ROMs + +Now that we've got a reset function, it's trivial to define a function to load +a new ROM into our emulator: + +```lisp +(defun load-rom (chip filename) + (setf (chip-loaded-rom chip) filename) + (reset chip)) +``` + +I didn't bother with the `with-chip` macro here because we're only accessing +a single field. + +### The Main Loop(s) + +Now let's wrap things up into a nice interface. We'll define a top-level `run` +function that will be what we call to fire up the emulator: + + +```lisp +(defparameter *c* nil) + +(defun run (rom-filename) + (let ((chip (make-chip))) + (setf *c* chip) + (load-rom chip rom-filename) + (run-cpu chip))) +``` + +This will get more complicated in the future, but for now it's pretty simple. +Make a new `chip` object, load the specified ROM, and start emulating. + +I've added a `*c*` global variable that's bound to the currently-running +emulator so we can poke at it in NREPL or SLIME as it's running. + +Now to write `run-cpu`: + +```lisp +(defconstant +cycles-per-second+ 500) +(defconstant +cycles-before-sleep+ 10) + +(defun run-cpu (chip) + (iterate + (while (chip-running chip)) + (emulate-cycle chip) + (for tick :every-nth +cycles-before-sleep+ :do + (sleep (/ +cycles-before-sleep+ +cycles-per-second+))))) +``` + +CHIP-8 was designed to run on much weaker hardware than we have these days. +Even though we're emulating it, if we just run as fast as possible we'll be +running *far* too fast to be playable. + +`+cycles-per-second+` is a constant describing our ideal emulation speed. +Anything from 300 to 800 or so is playable, depending on how fast you want games +to run. We'll just use 500 for now as a happy medium. + +It would be wasteful to call `(sleep)` after every single cycle, so instead +we'll batch them together: run 10 instructions, sleep for a bit, repeat. This +is the job of the `(for ... every-nth N)` iterate driver, which is also in my +utility library. Every 10 iterations through the loop it will sleep for the +appropriate amount of time. + +The size of the cycle batches is arbitrary — larger batches will result in fewer +`(sleep)` calls, but if you go too large you'll start noticing the emulator +getting "jumpy". 10-cycle batches at 500 cycles per second means that the +emulator will sleep for about 1/50 of a second each time, which isn't too +noticeable. + +Now we've got the main loop all set up and just need to emulate each individual +cycle. + +### Individual Cycles + +CHIP-8 instructions are each two bytes long, and are stored +[big-endian](https://en.wikipedia.org/wiki/Endianness) in memory. So to emulate +a single cycle we: + +* Read the two bytes starting at the program counter and concatenate them to get the instruction. +* Advance the program counter (to avoid having to do it inside every single instruction). +* Dispatch to the appropriate instruction code. + +We'll define `emulate-cycle` and a couple of helper functions for this: + +```lisp +(defun-inline chop (size integer) + (ldb (byte size 0) integer)) + +(defun-inline cat-bytes (high-order low-order) + (dpb high-order (byte 8 8) low-order)) + +(defun emulate-cycle (chip) + (with-chip (chip) + (let ((instruction (cat-bytes (aref memory program-counter) + (aref memory (1+ program-counter))))) + (zapf program-counter (chop 12 (+ % 2))) + (dispatch-instruction chip instruction)))) +``` + +`chop` truncates an integer to the given number of bits. `cat-bytes` +concatenates two bytes. This are pretty simple, but I prefer the more +descriptive names over writing `dpb` (deposit byte) and `ldb` (load byte) +everywhere. + +`zapf` is from my utility library. I've written [an entire post][zapf] about +it. + +What happens when the program counter is at the final index into memory? None +of the CHIP-8 references I found specify what should happen. We could signal an +error, or just wrap around to 0. I've chosen the latter by chopping `(+ +program-counter 2)` to 12 bits, but signaling an error would be easy too. + +[zapf]: http://stevelosh.com/blog/2016/08/playing-with-syntax/ + +### Instruction Dispatch + +The CHIP-8's instruction scheme is a bit different than most others that I've +seen. + +For systems like the Game Boy instructions have a one- or two-byte opcode, with +arguments following. For systems like this you can dispatch on opcode by doing +a giant `case` statement: + +```lisp +(case opcode + (#x00 (op-foo ...)) + (#x01 (op-bar ...)) + ; ... + ) +``` + +But a more efficient way to do it is often to shove all the `op-` functions into +an array, and then just use the opcode itself as the index into the array to +find the function: + +```lisp +(funcall (aref opcodes opcode) ...) +``` + +But the CHIP-8 differentiates instructions a bit strangely. All instructions +are two bytes long, *including* their arguments. Instead of just using the +first N bits as the opcode, some instructions use a combination of high- and +low-order bits. For example, the logical `AND`, `OR`, and `XOR` are specified +by starting the instruction with an `8` nibble and ending it with `1`, `2`, or +`3`, with the "arguments" being the two nibbles in the middle: + + 8xy1 - OR Vx, Vy + 8xy2 - AND Vx, Vy + 8xy3 - XOR Vx, Vy + +This makes it rather annoying to use some kind of function table approach. +There are only a few dozen opcodes, so instead of trying to hack something +together we'll just use a big old `case` instead: + +```lisp +(defun dispatch-instruction (chip instruction) + (macrolet ((call (name) `(,name chip instruction))) + (ecase (logand #xF000 instruction) + (#x0000 (ecase instruction + (#x00E0 (call op-cls)) + (#x00EE (call op-ret)))) + (#x1000 (call op-jp-imm)) + (#x2000 (call op-call)) + (#x3000 (call op-se-reg-imm)) + (#x4000 (call op-sne-reg-imm)) + (#x5000 (ecase (logand #x000F instruction) + (#x0 (call op-se-reg-reg)))) + (#x6000 (call op-ld-reg + Register V3 Index Register + ┌───┐ ┌───┐ + │135│ │...│ + └───┘ └─┬─┘ + ┌─────────────┘ + │ + ▼ + ┌───┬───┬───┬───┬───┐ + │ │ │ │ │ │ Memory + └───┴───┴───┴───┴───┘ + + Run BCD V3 instruction ================= + + Register V3 Index Register + ┌───┐ ┌───┐ + │135│ │...│ + └───┘ └─┬─┘ + ┌─────────────┘ + │ + ▼ + ┌───┬───┬───┬───┬───┐ + │ 1 │ 3 │ 5 │ │ │ Memory + └───┴───┴───┴───┴───┘ + + +We'll split this into two parts to make it easier to read. First we'll make +a `bcd` function to actually get the digits: + +```lisp +(defun-inline bcd (integer) + (values (-<> integer (floor <> 100) (mod <> 10)) + (-<> integer (floor <> 10) (mod <> 10)) + (-<> integer (floor <> 1) (mod <> 10)))) + +``` + +And make sure it works on its own: + +``` +[SBCL] CHIP8> (bcd 135) + +1 +3 +5 +``` + +And then we can define the actual operation: + +```lisp +(define-instruction op-ld-bcd