content/blog/2016/12/chip8-debugging-infrastructure.markdown @ ae7bfb3acac3

Merge.
author Steve Losh <steve@stevelosh.com>
date Sun, 25 Dec 2016 11:34:06 -0500
parents ef37b9f3e398
children (none)
+++
title = "CHIP-8 in Common Lisp: Debugging Infrastructure"
snip = "Let's figure out what the hell is going on."
date = 2016-12-31T14:50:00Z
draft = true

+++

In the previous posts we looked at how to emulate a [CHIP-8][] CPU with Common
Lisp.  After adding a screen, input, and sound, the core of the emulator is
essentially complete.

I've been guiding you through the code step by step and it might look pretty
simple, but that's only because I went down all the dead ends myself first.  In
practice, when you're writing an emulator for a system you'll need a way to
debug the execution of code, so let's look at how to add some debugging
capabilities to our simple CHIP-8 emulator.

The full series of posts so far:

1. [CHIP-8 in Common Lisp: The CPU](http://stevelosh.com/blog/2016/12/chip8-cpu/)
2. [CHIP-8 in Common Lisp: Graphics](http://stevelosh.com/blog/2016/12/chip8-graphics/)
3. [CHIP-8 in Common Lisp: Input](http://stevelosh.com/blog/2016/12/chip8-input/)

The full emulator source is on [BitBucket][] and [GitHub][].

[CHIP-8]: https://en.wikipedia.org/wiki/CHIP-8
[BitBucket]: https://bitbucket.org/sjl/cl-chip8
[GitHub]: https://github.com/sjl/cl-chip8

<div id="toc"></div>

## Disassembling

The first thing we'll need is a way to take an instruction like `#x8055` and
turn it into something we can read.  The easiest way to do this seemed to be to
copy the dispatch loop from the CPU emulator and turn it into a disassembly
function:

```lisp
(defun disassemble-instruction (instruction)
  (flet ((v (n) (symb 'v (format nil "~X" n))))
    (let ((_x__ (ldb (byte 4 8) instruction))
          (__x_ (ldb (byte 4 4) instruction))
          (___x (ldb (byte 4 0) instruction))
          (__xx (ldb (byte 8 0) instruction))
          (_xxx (ldb (byte 12 0) instruction)))
      (case (logand #xF000 instruction)
        (#x0000 (case instruction
                  (#x00E0 '(cls))
                  (#x00EE '(ret))))
        (#x1000 `(jp ,_xxx))
        (#x2000 `(call ,_xxx))
        (#x3000 `(se ,(v _x__) ,__xx))
        (#x4000 `(sne ,(v _x__) ,__xx))
        (#x5000 (case (logand #x000F instruction)
                  (#x0 `(se ,(v _x__) ,(v __x_)))))
        (#x6000 `(ld ,(v _x__) ,__xx))
        (#x7000 `(add ,(v _x__) ,__xx))
        (#x8000 (case (logand #x000F instruction)
                  (#x0 `(ld ,(v _x__) ,(v __x_)))
                  (#x1 `(or ,(v _x__) ,(v __x_)))
                  (#x2 `(and ,(v _x__) ,(v __x_)))
                  (#x3 `(xor ,(v _x__) ,(v __x_)))
                  (#x4 `(add ,(v _x__) ,(v __x_)))
                  (#x5 `(sub ,(v _x__) ,(v __x_)))
                  (#x6 `(shr ,(v _x__) ,(v __x_)))
                  (#x7 `(subn ,(v _x__) ,(v __x_)))
                  (#xE `(shl ,(v _x__) ,(v __x_)))))
        (#x9000 (case (logand #x000F instruction)
                  (#x0 `(sne ,(v _x__) ,(v __x_)))))
        (#xA000 `(ld i ,_xxx))
        (#xB000 `(jp ,(v 0) ,_xxx))
        (#xC000 `(rnd ,(v _x__) ,__xx))
        (#xD000 `(drw ,(v _x__) ,(v __x_) ,___x))
        (#xE000 (case (logand #x00FF instruction)
                  (#x9E `(skp ,(v _x__)))
                  (#xA1 `(sknp ,(v _x__)))))
        (#xF000 (case (logand #x00FF instruction)
                  (#x07 `(ld ,(v _x__) dt))
                  (#x0A `(ld ,(v _x__) k))
                  (#x15 `(ld dt ,(v _x__)))
                  (#x18 `(ld st ,(v _x__)))
                  (#x1E `(add i ,(v _x__)))
                  (#x29 `(ld f ,(v _x__)))
                  (#x33 `(ld b ,(v _x__)))
                  (#x55 `(ld (mem i) ,(v _x__)))
                  (#x65 `(ld ,(v _x__) (mem i)))))))))
```

There are a lot of other ways we could have done this, like making a proper
parser or adding functionality to `define-opcode`, but since there aren't that
many instructions I think this is pretty reasonable.  Now we can pass in a raw,
two-byte instruction and get out something readable:

```
[SBCL] CHIP8> (disassemble-instruction #x8055)
(SUB V0 V5)

[SBCL] CHIP8> (disassemble-instruction #x4077)
(SNE V0 119)
```
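
The variable names in the `let` are a visual mnemonic: each `x` marks which
nibbles of the 16-bit instruction the variable holds.  Here's a minimal sketch
of how that extraction and the register naming work, using only standard
Common Lisp.  Note that `nibbles` is a hypothetical helper made up for this
illustration, and the `mkstr`/`symb` definitions are an assumption modeled on
the classic *On Lisp* utilities, since `symb` isn't defined in this post:

```lisp
;; Hypothetical helper, not part of the emulator: return the same fields
;; that DISASSEMBLE-INSTRUCTION extracts, as a property list.
(defun nibbles (instruction)
  (list :op   (ldb (byte 4 12) instruction)   ; top nibble, picks the family
        :_x__ (ldb (byte 4 8) instruction)    ; usually the first register
        :__x_ (ldb (byte 4 4) instruction)    ; usually the second register
        :___x (ldb (byte 4 0) instruction)    ; low nibble
        :__xx (ldb (byte 8 0) instruction)    ; 8-bit immediate
        :_xxx (ldb (byte 12 0) instruction))) ; 12-bit address

;; Assumed definitions, modeled on the MKSTR/SYMB utilities from On Lisp.
(defun mkstr (&rest args)
  (with-output-to-string (s)
    (dolist (a args) (princ a s))))

(defun symb (&rest args)
  (values (intern (apply #'mkstr args))))

;; The local V function from DISASSEMBLE-INSTRUCTION, pulled out to top level.
(defun v (n)
  (symb 'v (format nil "~X" n)))

(nibbles #x8055) ; => (:OP 8 :_X__ 0 :__X_ 5 :___X 5 :__XX 85 :_XXX 85)
(v 5)            ; => V5
(v 15)           ; => VF
```

Formatting the register number with `~X` is what gives us `VF` instead of
`V15` for the flag register.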

Disassembling a single instruction will be useful, but it would also be nice to
disassemble an entire ROM at once to see what its code looks like.  Let's make
a little helper function to handle that:

```lisp
(defun dump-disassembly (array &optional (start 0) (end (length array)))
  (iterate
    (for i :from start :below end :by 2)
    (print-disassembled-instruction array i)
    (sleep 0.001)))
```

The `sleep` is there because Neovim's terminal seems to shit the bed if you dump
too much text at it at once.  Computers are garbage.

Other than that, `dump-disassembly` is pretty straightforward: just iterate
through the array of instructions two bytes at a time and print the information.
Let's look at the printing function now:

```lisp
(defun print-disassembled-instruction (array index)
  (destructuring-bind (address instruction disassembly)
      (instruction-information array index)
    (let ((*print-base* 16))
      (format t "~3,'0X: ~4,'0X ~24A~%"
              address
              instruction
              (or disassembly "")))))
```

Once again we'll delegate to a helper function.
`print-disassembled-instruction` just handles the string formatting to dump an
instruction to the screen.  Running it for a single instruction would print
something like:

```
Address    Disassembly
 |          |
 v          v
200: 8055 (SUB V0 V5)
      ^
      |
      Raw instruction
```
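
To see what each piece of the `format` string contributes, here's a small
sketch that returns the line as a string instead of printing it.  `format-line`
is a hypothetical stand-in for the formatting step, not a function from the
emulator.  It also shows why `*print-base*` gets bound: the `~X` directives
always print in hex, but `~A` prints the disassembly list with the current
base, so without the binding any numbers inside it would come out in decimal.

```lisp
;; Hypothetical stand-in for the formatting done by
;; PRINT-DISASSEMBLED-INSTRUCTION.  Uses only standard FORMAT.
(defun format-line (address instruction disassembly)
  (let ((*print-base* 16)) ; applies to numbers printed by ~A, e.g. in lists
    (format nil "~3,'0X: ~4,'0X ~24A"
            address                   ; ~3,'0X  hex, zero-padded to 3 digits
            instruction               ; ~4,'0X  hex, zero-padded to 4 digits
            (or disassembly ""))))    ; ~24A    padded on the right to 24 columns

(format-line #x200 #x8055 '(sub v0 v5))  ; starts with "200: 8055 (SUB V0 V5)"
(format-line #x202 #x4077 '(sne v0 119)) ; the 119 is printed as 77
(format-line #x204 #x00E0 '(cls))        ; raw instruction is padded: 00E0
(format-line #x206 #xFFFF nil)           ; unknown instruction, blank column
```

The fixed-width columns don't matter much when dumping to a terminal, but they
keep everything lined up when many lines are printed in a row.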

The helper function `instruction-information` is simple, but we'll be using it
in the future for something else, so it's nice to have:

```lisp
(defun instruction-information (array index)
  (let ((instruction (retrieve-instruction array index)))
    (list index
          instruction
          (disassemble-instruction instruction))))
```

`retrieve-instruction` is simple (for now):

```lisp
(defun retrieve-instruction (array index)
  (cat-bytes (aref array index)
             (aref array (1+ index))))
```
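
CHIP-8 instructions are stored big-endian, so the byte at `index` is the high
half of the instruction.  `cat-bytes` isn't defined in this post, so the
version below is only an assumption about how it behaves, inferred from how
it's called here.  The tiny `*example-rom*` array is made up for illustration:

```lisp
;; Assumed definition: glue two bytes into one 16-bit value by depositing
;; the high-order byte into bits 8-15 above the low-order byte.
(defun cat-bytes (high-order low-order)
  (dpb high-order (byte 8 8) low-order))

;; Same as the definition above, repeated so this sketch runs on its own.
(defun retrieve-instruction (array index)
  (cat-bytes (aref array index)
             (aref array (1+ index))))

;; Hypothetical four-byte "ROM" holding two instructions.
(defparameter *example-rom*
  (make-array 4 :element-type '(unsigned-byte 8)
                :initial-contents '(#x80 #x55 #x40 #x77)))

(retrieve-instruction *example-rom* 0) ; => 32853, i.e. #x8055
(retrieve-instruction *example-rom* 2) ; => 16503, i.e. #x4077
```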

These functions *could* be combined into a single, bigger function, but I'm
a strong believer in having each function do exactly one thing.  And as
we'll see, each of these "simple" tasks is going to get more complicated in the
real world.
