content/blog/2016/12/chip8-graphics.markdown @ 219d8676e914

Publish input
author Steve Losh <steve@stevelosh.com>
date Fri, 23 Dec 2016 10:55:08 -0500
parents ce548c40237a
children 35680677eb4f
+++
title = "CHIP-8 in Common Lisp: Graphics"
snip = "Let's draw some pixels."
date = 2016-12-21T16:55:00Z
draft = false

+++

In the previous post we looked at how to emulate a [CHIP-8][] CPU with Common
Lisp.  But a CPU alone isn't much fun to play, so in this post we'll add
a screen to the emulator with [Qt][].

The full emulator source is on [BitBucket][] and [GitHub][].

[CHIP-8]: https://en.wikipedia.org/wiki/CHIP-8
[Qt]: https://www.qt.io/
[BitBucket]: https://bitbucket.org/sjl/cl-chip8
[GitHub]: https://github.com/sjl/cl-chip8

<div id="toc"></div>

## Qtools

[Qtools][] is a library that wraps up a few other libraries to make it easier to
write Qt interfaces with Common Lisp.  It's a big library so I'm not going to
try to explain everything about it here.  Most of the code here should be pretty
easy to follow even if you haven't used it before (I'll explain the high-level
concepts), but if you're interested in *exactly* how some code works you should
check out its documentation.

[qtools]: https://shinmera.github.io/qtools/

## Architecture

Let's take a moment to look at the overall architecture of the project.  So far
we've got a `chip` struct that holds the state of the emulated system, and
a bunch of `op-...` instruction functions that emulate the instructions.  We
could start plugging in drawing calls right in the appropriate instructions, and
this is the approach a lot of emulators take.  But I'd like to separate things
a bit more strictly.

My goal is to keep the emulated system entirely self-contained, and then layer
the screen and user interface on top.  The emulator should ideally know
*nothing* about the existence of an interface.  This keeps the emulator simple
and (mostly) free of cruft.  It will also let us play around with alternate
interfaces if we want to — I think it might be fun to add an ASCII screen with
[ncurses][] and [cl-charms][] some day!

With that said, we will make a *few* concessions to performance along the way.

[ncurses]: https://en.wikipedia.org/wiki/Ncurses
[cl-charms]: https://github.com/HiTECNOLOGYs/cl-charms

## The Emulation Layer

We'll start with the emulation side of things.

### Video Memory and Performance

The CHIP-8 has a 64x32 pixel display, and each pixel has only two colors: on and
off.  It's about as simple a screen as you can get.

To keep the emulator from having to know about the user interface we'll model
the screen as a big array of video memory.  The emulator can set the video
memory appropriately, and the user interface can read it to determine what to
draw to the screen at any given point.

There are other ways we could have separated things, but let's run with this
strategy for now.

We'll add a video memory array to our `chip` struct:

```lisp
(defconstant +screen-width+ 64)
(defconstant +screen-height+ 32)

(defstruct chip
  ; ...
  (video (make-array (* +screen-height+ +screen-width+) :element-type 'fixnum)
         :type (simple-array fixnum (#.(* +screen-height+ +screen-width+)))
         :read-only t)
  ; ...
  )
```

Here we already see a first concession to performance.  We could have used
a multidimensional array to make the indexing a bit nicer, but by using a simple
flat array we'll be able to pass it directly to OpenGL later.

OpenGL is going to want this array to be in "X-major" order, so the array will
need to look like:

    [(x₀, y₀), (x₁, y₀), (x₂, y₀), ...,
     (x₀, y₁), (x₁, y₁), (x₂, y₁), ...,
     ...]

We'll add a couple of helper functions to make the indexing of this array a bit
less painful:

```lisp
(defun-inline vref (chip x y)
  (aref (chip-video chip) (+ (* +screen-width+ y) x)))

(defun-inline (setf vref) (new-value chip x y)
  (setf (aref (chip-video chip) (+ (* +screen-width+ y) x))
        new-value))
```

Now we can simply say `(vref chip 5 15)` to get the pixel at (5, 15) instead of
manually calculating out the `aref`.
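As a quick sanity check of the indexing arithmetic, here's a standalone sketch that uses a bare vector in place of the `chip` struct.  The names `flat-index`, `*demo-width*`, `*demo-height*` and `*demo-video*` are hypothetical and exist only for this illustration:

```lisp
;; Standalone sketch: a bare vector stands in for the chip's video memory.
(defparameter *demo-width* 64)
(defparameter *demo-height* 32)

(defun flat-index (x y)
  "Index of pixel (X, Y) in a flat array laid out one row after another."
  (+ (* *demo-width* y) x))

(defparameter *demo-video*
  (make-array (* *demo-width* *demo-height*) :initial-element 0))

;; Turn on the pixel at (5, 15), then read it back.
(setf (aref *demo-video* (flat-index 5 15)) 255)

(flat-index 5 15)                       ; => 965, i.e. (+ (* 64 15) 5)
(aref *demo-video* (flat-index 5 15))   ; => 255
(flat-index 63 31)                      ; => 2047, the last pixel
```

Moving one pixel to the right adds 1 to the index, while moving one pixel down adds a full row of 64.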

We'll add one more field to `chip` before moving on, a "dirty" flag:

```lisp
(defstruct chip
  ; ...
  (video-dirty t :type boolean)
  ; ...
  )
```

This will make it easier for any interface to determine whether it needs to
update the display or not.
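Here's roughly how an interface might consume the flag.  This is only a sketch: `demo-chip` is a hypothetical stand-in for the real struct, and `refresh-display` is a made-up function that copies into a plain array instead of talking to a real screen:

```lisp
;; Hypothetical stand-in for the real chip struct, just for this sketch.
(defstruct demo-chip
  (video (make-array 4 :initial-element 0))
  (video-dirty t :type boolean))

(defun refresh-display (chip display)
  "Copy video memory into DISPLAY only if the chip marked it dirty.
Returns T when a copy happened, NIL otherwise."
  (when (demo-chip-video-dirty chip)
    ;; Clear the flag first so later emulator writes re-mark the memory.
    (setf (demo-chip-video-dirty chip) nil)
    (replace display (demo-chip-video chip))
    t))

(let ((chip (make-demo-chip))
      (display (make-array 4 :initial-element 0)))
  (setf (aref (demo-chip-video chip) 2) 255)
  (list (refresh-display chip display)    ; first call copies
        (refresh-display chip display)    ; second call has nothing to do
        display))
;; => (T NIL #(0 0 255 0))
```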



### Fonts

CHIP-8 interpreters set aside a portion of the reserved low memory to contain
sprites for the hex digits 0 through F.  We'll follow the common convention of
starting them at address `#x50`.  Check out [the display chapter of Cowgod's
guide][cg-display] for an overview of how the CHIP-8 defines sprites.

We'll need to load these sprites into our emulator's memory at the correct
location when resetting it.  We'll also clear out the video memory while we're
at it:

```lisp
(defun load-font (chip)
  ;; Thanks http://www.multigesture.net/articles/how-to-write-an-emulator-chip-8-interpreter/
  (replace (chip-memory chip)
           #(#xF0 #x90 #x90 #x90 #xF0  ; 0
             #x20 #x60 #x20 #x20 #x70  ; 1
             #xF0 #x10 #xF0 #x80 #xF0  ; 2
             #xF0 #x10 #xF0 #x10 #xF0  ; 3
             #x90 #x90 #xF0 #x10 #x10  ; 4
             #xF0 #x80 #xF0 #x10 #xF0  ; 5
             #xF0 #x80 #xF0 #x90 #xF0  ; 6
             #xF0 #x10 #x20 #x40 #x40  ; 7
             #xF0 #x90 #xF0 #x90 #xF0  ; 8
             #xF0 #x90 #xF0 #x10 #xF0  ; 9
             #xF0 #x90 #xF0 #x90 #x90  ; A
             #xE0 #x90 #xE0 #x90 #xE0  ; B
             #xF0 #x80 #x80 #x80 #xF0  ; C
             #xE0 #x90 #x90 #x90 #xE0  ; D
             #xF0 #x80 #xF0 #x80 #xF0  ; E
             #xF0 #x80 #xF0 #x80 #x80) ; F
           :start1 #x50))

(defun reset (chip)
  (with-chip (chip)
    (fill memory 0)
    (fill registers 0)
    (fill video 0)                 ; NEW
    (load-font chip)               ; NEW
    (replace memory (read-file-into-byte-vector loaded-rom)
             :start1 #x200)
    (setf running t
          video-dirty t            ; NEW
          program-counter #x200
          (fill-pointer stack) 0))
  (values))
```

Once again the handy `replace` function makes things easy.
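To see why those bytes look like digits, here's a small standalone sketch that renders one sprite as text.  Each byte is one row, and only the high four bits of each font byte are lit.  `sprite-rows` is a hypothetical helper written for this illustration, not part of the emulator:

```lisp
(defun sprite-rows (bytes)
  "Render each byte of a sprite as an 8-character string, '#' for set bits."
  (map 'list
       (lambda (byte)
         (let ((row (make-string 8 :initial-element #\.)))
           ;; Bit 7 is the leftmost pixel, bit 0 the rightmost.
           (loop for col from 7 downto 0
                 for x from 0
                 when (logbitp col byte)
                   do (setf (char row x) #\#))
           row))
       bytes))

(sprite-rows #(#xF0 #x90 #x90 #x90 #xF0)) ; the "0" sprite
;; => ("####...." "#..#...." "#..#...." "#..#...." "####....")
```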

[cg-display]: http://devernay.free.fr/hacks/chip8/C8TECH10.HTM

### Clearing the Screen: CLS

Now we can start implementing the graphics-related instructions.  The first is
the very simple `CLS` to clear the screen:

```lisp
(define-instruction op-cls ()                           ;; CLS
  (fill video 0)
  (setf video-dirty t))
```

### Loading Fonts: LD F, Vx

Next up is the "load font" instruction, which sets the index register to the
address of the sprite for the digit in the argument register.  So `LD F, V2`
where register 2 contains `6` would set the index register to the address of the
`6` sprite.

```lisp
(defun-inline font-location (character)
  (+ #x50 (* character 5))) ; each sprite is 5 bytes long (one byte per row)

(define-instruction op-ld-font<vx (_ r _ _)             ;; LD F, Vx
  (setf index (font-location (register r))))
```
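A quick worked example of the address arithmetic, written as a plain `defun` so it can be tried at a REPL without the emulator loaded.  It's named `demo-font-location` here (a hypothetical name) to avoid clashing with the real one:

```lisp
(defun demo-font-location (character)
  "Address of the sprite for hex digit CHARACTER, assuming 5-byte sprites
stored back to back starting at #x50."
  (+ #x50 (* character 5)))

(demo-font-location 0)    ; => 80  (#x50), the first sprite
(demo-font-location 6)    ; => 110 (#x6E)
(demo-font-location #xF)  ; => 155 (#x9B), the last sprite
```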

### Drawing Sprites: DRW X, Y, Size

The most complicated part of the emulator's code is certainly the portion that
draws sprites.  The instruction itself will delegate to a helper function:

```lisp
(define-instruction op-draw (_ rx ry size)       ;; DRW Vx, Vy, size
  (draw-sprite chip (register rx) (register ry) size))
```

Check out [Cowgod's guide][cg-display] for a good overview of how the CHIP-8
drawing system works.  I'll assume you've read that before moving on.

Let's implement the `draw-sprite` function in chunks, because it can be a bit
intimidating if I just slap it all down at once.  Before we start drawing
anything we reset `flag` to 0, and we'll mark the dirty flag at the end:

```lisp
(defun draw-sprite (chip start-x start-y size)
  (with-chip (chip)
    (setf flag 0)
    ; ... draw the sprite ...
    (setf video-dirty t))
  nil)
```

Simple enough.  Now we need to loop through each row of the sprite and draw it.
The address of the sprite we're drawing is given by the index register, and each
row in the sprite is represented by a byte of memory.  Again, check out Cowgod's
guide for the full details.

```lisp
(defun draw-sprite (chip start-x start-y size)
  (with-chip (chip)
    (setf flag 0)
    (iterate                         ; NEW
      (repeat size)                  ; NEW
      (for i :from index)            ; NEW
      (for y :from start-y)          ; NEW
      (for sprite = (aref memory i)) ; NEW
      ; ... draw the row ...
      )
    (setf video-dirty t))
  nil)
```

To draw a row, we just have to draw each of the eight pixels in it:

```lisp
(defun draw-sprite (chip start-x start-y size)
  (with-chip (chip)
    (setf flag 0)
    (iterate
      (repeat size)
      (for i :from index)
      (for y :from start-y)
      (for sprite = (aref memory i))
      (iterate                      ; NEW
        (for x :from start-x)       ; NEW
        (for col :from 7 :downto 0) ; NEW
        ; ... draw the pixel ...
        ))
    (setf video-dirty t))
  nil)
```

Unfortunately we hit a snag at this point.  All the references I've found say
that if any X or Y values go outside of the range of valid screen coordinates
the sprite should wrap around the screen.  And indeed, some ROMs (e.g.
`ufo.rom`) require this behavior to work properly.  But some *other* ROMs (e.g.
`blitz.rom`) expect the screen to *clip*, not wrap!

This is the first case where our emulator will need to bend the rules of the
spec to accommodate buggy ROMs.  Sadly this is not uncommon in the emulation
world.

We'll deal with this by adding a setting to the emulator:

```lisp
(defstruct chip
  ; ...
  (screen-wrapping-enabled t :type boolean)
  ; ...
  )
```

Then we can hide the "to wrap, or not to wrap" logic in its own helper:

```lisp
(defun-inline wrap (chip x y)
  (cond ((chip-screen-wrapping-enabled chip)
         (values (mod x +screen-width+)
                 (mod y +screen-height+)
                 t))
        ((and (in-range-p 0 x +screen-width+)
              (in-range-p 0 y +screen-height+))
         (values x y t))
        (t (values nil nil nil))))
```

`wrap` will take `x` and `y` coordinates and return three values:

* The screen X coordinate to draw to (if any).
* The screen Y coordinate to draw to (if any).
* A boolean that will be `t` when the pixel should be drawn, or `nil` if not.

[`in-range-p`][in-range-p] is a predicate from my utility library that checks if
`low <= val < high` (which is [often useful][dijkstra]).
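To make the two behaviors concrete, here's a standalone sketch that can be run without the emulator.  `demo-in-range-p` is a hypothetical stand-in for the utility-library predicate, and `demo-wrap` takes the wrapping setting as a plain argument instead of reading it from a `chip`:

```lisp
;; Hypothetical stand-in for the utility-library predicate described above.
(defun demo-in-range-p (low value high)
  "True when LOW <= VALUE < HIGH."
  (and (<= low value) (< value high)))

(defun demo-wrap (wrapping-enabled x y &key (width 64) (height 32))
  "Same three-value contract as WRAP, with the setting passed in directly."
  (cond (wrapping-enabled
         (values (mod x width) (mod y height) t))
        ((and (demo-in-range-p 0 x width)
              (demo-in-range-p 0 y height))
         (values x y t))
        (t (values nil nil nil))))

(multiple-value-list (demo-wrap t 66 33))    ; => (2 1 T), wrapped around
(multiple-value-list (demo-wrap nil 66 33))  ; => (NIL NIL NIL), clipped
(multiple-value-list (demo-wrap nil 10 20))  ; => (10 20 T), on screen either way
```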

Now we can use this in `draw-sprite` to determine whether and where to draw each
pixel:

[in-range-p]: https://github.com/sjl/cl-losh/blob/master/DOCUMENTATION.markdown#in-range-p-function
[dijkstra]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

```lisp
(defun draw-sprite (chip start-x start-y size)
  (with-chip (chip)
    (setf flag 0)
    (iterate
      (repeat size)
      (for i :from index)
      (for y :from start-y)
      (for sprite = (aref memory i))
      (iterate
        (for x :from start-x)
        (for col :from 7 :downto 0)
        (multiple-value-bind (x y should-draw) ; NEW
            (wrap chip x y)                    ; NEW
          (when should-draw                    ; NEW
            ; ... actually draw the damn pixel ...
            ))))
    (setf video-dirty t))
  nil)
```

Now we come to the second concession to performance.  Ideally we'd store the
pixel values as `t` and `nil` or `1` and `0`.  But due to a bug in Qt 4 (which
we'll talk about later) we need to store `255` and `0`.  So we need to do a bit
of an ugly dance to `XOR` the pixels together:

```lisp
(defun draw-sprite (chip start-x start-y size)
  (with-chip (chip)
    (setf flag 0)
    (iterate
      (repeat size)
      (for i :from index)
      (for y :from start-y)
      (for sprite = (aref memory i))
      (iterate
        (for x :from start-x)
        (for col :from 7 :downto 0)
        (multiple-value-bind (x y should-draw)
            (wrap chip x y)
          (when should-draw
            (for old-pixel = (plusp (vref chip x y)))       ; NEW
            (for new-pixel = (plusp (get-bit col sprite)))  ; NEW
            (when (and old-pixel new-pixel)                 ; NEW
              (setf flag 1))                                ; NEW
            (setf (vref chip x y)                           ; NEW
                  (if (xor old-pixel new-pixel) 255 0)))))) ; NEW
    (setf video-dirty t))
  nil)
```

`xor` is the exclusive-or variant of `and`/`or` from Alexandria.
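If the dance is hard to follow inside the loop, here's the per-pixel logic pulled out into a standalone sketch.  `xor-pixel` is a hypothetical function for illustration only, and it spells out the exclusive-or with `eq` so it runs without Alexandria:

```lisp
(defun xor-pixel (old-value sprite-bit)
  "Combine a stored pixel (0 or 255) with a sprite bit (0 or 1).
Returns the new stored value and whether a lit pixel was erased."
  (let ((old-on (plusp old-value))
        (new-on (plusp sprite-bit)))
    (values (if (not (eq old-on new-on)) 255 0) ; boolean exclusive-or
            (and old-on new-on))))              ; collision?

;; All four combinations of (old pixel, sprite bit):
(loop for (old bit) in '((0 0) (0 1) (255 0) (255 1))
      collect (multiple-value-list (xor-pixel old bit)))
;; => ((0 NIL) (255 NIL) (255 NIL) (0 T))
```

Only the last case, drawing a lit sprite bit over a lit pixel, turns the pixel off and reports a collision.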

And we're finally done.  `draw-sprite` is 22 lines, which is getting a bit long
by Lisp standards, so it might be worth breaking into separate functions.  But
it's also the most performance-critical function in the emulator, so keeping it
as a single loop will let us optimize it heavily later if necessary (spoiler: it
won't be necessary).

## The User Interface Layer

That's it for the emulation side of things.  If we run a ROM now the video
memory array in the `chip` struct will be updated properly.  But unless we want
to look at raw memory, we need some kind of a screen.

We'll start off with a fresh package:

```lisp
(in-package :chip8.gui.screen)
(named-readtables:in-readtable :qtools)
```

### Basic Plan

We're going to use OpenGL to draw the actual pixels for our screen.  The basic
plan will be:

* Ship the video memory up to the graphics card as a texture each frame.
* Draw a single quad with this texture to get the actual pixels onto the display.

We can use a tiny 64 by 64 texture so this will be fine for our performance
needs.

### Screen Widget

The main UI for our screen will be a `QGLWidget`:

```lisp
(define-widget screen (QGLWidget)
  ((texture :accessor screen-texture)
   (chip :accessor screen-chip :initarg :chip)))

(defun make-screen (chip)
  (make-instance 'screen :chip chip))
```

We'll define some initializers for this widget:

```lisp
(defparameter *scale* 8)
(defparameter *width* (* *scale* 64))
(defparameter *height* (* *scale* 32))

(define-initializer (screen setup)
  (setf (q+:window-title screen) "cl-chip8"
        (q+:fixed-size screen) (values *width* *height*)))

(define-override (screen "initializeGL") ()
  (setf (screen-texture screen) (initialize-texture 64))
  (stop-overriding))
```

Single pixels are almost impossible to see on today's high-resolution displays,
so we'll scale them up by 8 to make them bigger.

We need to do the OpenGL texture initialization in the `initializeGL` method,
*not* the normal initializer, because we need the OpenGL context to be ready.
The actual texture initialization code is typical verbose ugly OpenGL, so we'll
tuck it away in a helper function:

```lisp
(defun initialize-texture (size)
  (let ((handle (gl:gen-texture)))
    (gl:bind-texture :texture-2d handle)

    (gl:tex-image-2d :texture-2d 0 :luminance size size 0 :luminance
                     :unsigned-byte (cffi:null-pointer))
    (gl:tex-parameter :texture-2d :texture-min-filter :nearest)
    (gl:tex-parameter :texture-2d :texture-mag-filter :nearest)
    (gl:enable :texture-2d)

    (gl:bind-texture :texture-2d 0)

    handle))
```

We'll use [nearest-neighbor interpolation][] to get nice sharp sprites.

The texture array will be `unsigned-byte`s of luminance values, with `0` being
black and `255` being white.  This explains the dance we had to do back in the
`draw-sprite`.

A better way to do this would be to have video memory contain `0` and `1`, and
use an OpenGL fragment shader to map these to the desired colors.  Unfortunately
due to [a nasty bug][qt-bug] in Qt 4 on OS X we can't use shaders, so we're
stuck with this workaround for the time being.  Computers are awful.

[nearest-neighbor interpolation]: https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation
[qt-bug]: https://github.com/Shinmera/qtools/issues/17

### Drawing Frames

Now that we've got a widget we'll need to paint it on each frame.  We'll use
a `QTimer` to handle firing off the paint events:

```lisp
(defparameter *fps* 60)

(define-subwidget (screen timer) (q+:make-qtimer screen)
  (setf (q+:single-shot timer) NIL)
  (q+:start timer (round 1000 *fps*)))

(define-slot (screen update) ()
  (declare (connected timer (timeout)))
  (if (chip8::chip-running (screen-chip screen))
    (q+:repaint screen)
    (die screen)))

(defun die (screen)
  (setf (chip8::chip-running (screen-chip screen)) nil)
  (q+:close screen))
```

The `timer` will fire a Qt `timeout` signal roughly sixty times per second (the
interval gets rounded to a whole number of milliseconds).  The screen's `update`
slot is connected to this signal, and will either initiate a repaint or kill the
screen, depending on whether the `chip` is still running.
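`QTimer` intervals are whole milliseconds, so the frame rate we actually get is slightly off from the target.  A tiny sketch of the arithmetic (`timer-interval-ms` is a hypothetical helper, not something in the emulator):

```lisp
(defun timer-interval-ms (fps)
  "Whole-millisecond timer interval for a target frame rate."
  (values (round 1000 fps)))

(timer-interval-ms 60)             ; => 17
(/ 1000.0 (timer-interval-ms 60))  ; about 58.8 repaints per second
(timer-interval-ms 30)             ; => 33
```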

The `die` function also tells the `chip` to stop running.  This obviously isn't
necessary here (we just checked that it's not running!), but we'll be using
`die` in another place later.

(It's really a shame that Qt and Common Lisp both use the words "signal" and
"slot" to mean wildly different things.  It makes using Qt with Common Lisp more
painful than it should be...)

Now on to the meat of the code, repainting.  We'll define the `paint-event` and
delegate to a helper function:

```lisp
(define-override (screen paint-event) (ev)
  (declare (ignore ev))
  (with-finalizing ((painter (q+:make-qpainter screen)))
    (render-screen screen painter)))
```

And here we go with another pile of verbose graphics code:

```lisp
(defun render-screen (screen painter)
  (q+:begin-native-painting painter)

  ;; Clear the screen
  (gl:clear-color 0.0 0.0 0.0 1.0)
  (gl:clear :color-buffer-bit)

  (gl:bind-texture :texture-2d (screen-texture screen))

  ;; Update the texture
  (let ((chip (screen-chip screen)))
    (when (chip8::chip-video-dirty chip)
      (setf (chip8::chip-video-dirty chip) nil)
      (gl:tex-sub-image-2d :texture-2d 0 0 0 64 32
                           :luminance :unsigned-byte
                           (chip8::chip-video chip))))

  ;; Draw the quad
  (let ((tw 1)
        (th 0.5))
    (gl:with-primitives :quads
      (gl:tex-coord 0 0)
      (gl:vertex 0 0)

      (gl:tex-coord tw 0)
      (gl:vertex *width* 0)

      (gl:tex-coord tw th)
      (gl:vertex *width* *height*)

      (gl:tex-coord 0 th)
      (gl:vertex 0 *height*)))

  (gl:bind-texture :texture-2d 0)

  (q+:end-native-painting painter))
```

Each frame, if `video-dirty` is set we'll update the texture with the contents
of the `chip`'s video memory.  Then we draw a quad using this texture to get the
pixels on the actual display.

<pre class="lineart">
                Screen                       Texture
    ┌───────────────────────────────┐ ┌──────────────────────┐
    │                               │ │   ▉   ▉  ▉  ▉        │
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │   ▉   ▉  ▉  ▉        │
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │   ▉▉▉▉▉  ▉  ▉        │
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │   ▉   ▉  ▉           │
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │   ▉   ▉  ▉  ▉        │
    │  ▉▉▉▉▉▉▉▉▉▉    ▉▉    ▉▉       │ │░░░░░░░░░░░░░░░░░░░░░░│
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │░░░░░░░░░░░░░░░░░░░░░░│
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │░░░░░░░░░░░░░░░░░░░░░░│
    │  ▉▉      ▉▉    ▉▉             │ │░░░░░░░░░░░░░░░░░░░░░░│
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ │░░░░░░░░░░░░░░░░░░░░░░│
    │  ▉▉      ▉▉    ▉▉    ▉▉       │ └──────────────────────┘
    └───────────────────────────────┘
</pre>

Note that we're only ever using the top half of the texture: the screen is
a 2:1 rectangle but OpenGL likes square textures.

Getting the texture coordinates on the quad's vertices correct is important,
otherwise you'll end up drawing whatever garbage happened to be in memory at the
time which, while entertaining, is probably not what you want.
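The `tw` and `th` values in `render-screen` are just the fraction of the texture that the screen covers.  Here's a sketch of that calculation, using a hypothetical `texture-extent` helper:

```lisp
(defun texture-extent (screen-width screen-height texture-size)
  "Fraction of a square texture covered by the screen, as (values tw th)."
  (values (/ screen-width texture-size)
          (/ screen-height texture-size)))

(multiple-value-list (texture-extent 64 32 64))  ; => (1 1/2)
```

A 64 by 32 screen in a 64 by 64 texture covers the full width and half the height, which matches the `1` and `0.5` used above.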

### Wrapping Up

The last thing we need is a function to actually create the GUI:

```lisp
(defun run-gui (chip)
  (with-main-window
    (window (make-screen chip))))
```

And now we can modify our emulator's `run` function to start up a GUI in
addition to the system emulation:

```lisp
(defun run (rom-filename)
  (let ((chip (make-chip)))
    (setf *c* chip)
    (load-rom chip rom-filename)
    (bt:make-thread (curry #'run-cpu chip)) ; NEW
    (chip8.gui.screen::run-gui chip)))      ; NEW
```

Qt will take control of the thread and block when run, so we'll need to run our
CPU emulation in a separate thread.

The only thing they both write to is the `video-dirty` flag, so there's not much
synchronization to deal with (yet).  It's theoretically possible that a badly
timed repaint could draw a half-finished sprite on the screen, but in practice
it's not noticeable.

A different architecture (e.g. passing down pixel-drawing functions into the
emulator from the UI) could solve that problem while keeping the layers
separate, but it didn't seem worth the extra effort for this little toy project.

## Results

And with all that done we've *finally* got a screen to play games on!

[![Screenshot of CHIP-8 screen running UFO.rom](/media/images/blog/2016/12/chip8-screen.png)](/media/images/blog/2016/12/chip8-screen.png)

## Future

That's all for the graphics.  In the next post we'll add user input, and then
later we'll look at sound and debugging.