content/blog/2016/06/symbolic-computation.html @ 8d1cd7431860

Proofread
author Steve Losh <steve@stevelosh.com>
date Tue, 28 Jun 2016 23:39:20 +0000
parents c24ab12c65fc
children 5bc07cde293f
    {% extends "_post.html" %}

    {% hyde
        title: "What the Hell is Symbolic Computation?"
        snip: "Symbols, REPLs, and Quoting -- Oh My!"
        created: 2016-06-29 13:45:00
    %}

    {% block article %}

I've been reading a lot of Lisp books lately, some more advanced than others.
All of the introductory books I've seen cover the idea of symbolic computation,
but most of them breeze past it with just a few pages about "symbols" and
"quoting".  For many programmers coming from non-Lisp languages these are new,
foreign concepts that don't really map back to anything in their previous
experience.  But the books seem to expect you to understand this things after
just a couple of paragraphs.

One book that *does* spend more time on this is the appropriately-named [Common
Lisp: A Gentle Introduction to Symbolic Computation][gentle].  If you're looking
for a good introductory Lisp book, that's the one I'd recommend.  However, it's
quite a long book and starts at the *very* basics (you don't even write code for
the first few chapters -- you program with drawings).  If you already know the
some Lisp then you might find yourself bored if you go through the entire thing
just for the symbolic computation bits.

This post is an attempt to explain what symbols *actually are* and what quoting
does.  You'll get the most out of it if you're already familiar with the basics
of Lisp (i.e. you aren't scared by the parentheses).  But if you already know
what something like this would print, you'll probably be bored:

    :::text
    (let ((foo ''a))
      (print foo)
      (print 'foo)
      (print (car foo))
      (print (cdr foo)))

[gentle]: http://www.amazon.com/dp/0486498204/?tag=stelos-20

[TOC]

## Disclaimer

Before we start I'll get this warning out of the way: I'm going to gloss over
a lot of gory details to get at the important core ideas.

When I say "this thing is implemented like so" it's safe to assume that in
reality things are messier and more complicated and will vary between
implementations and dialects.  My goal is to give you some intuition for the
basic ideas, not to exhaustively cover how a particular implementation works.

I'll be using Common Lisp for the examples, but the concepts apply equally to
Scheme, Clojure, and other Lisps too.

With that out of the way, let's dive in!

## The REPL

The Read-Eval-Print-Loop that so many languages these days have built-in is
a good place to start in our exploration.  You're probably familiar with the
idea of a REPL from languages like Python, Ruby, Javascript, etc.  In Python it
looks like this:

    :::text
    $ python
    >>> if True:
    ...     print "Yes"
    ... else:
    ...     print "No"
    ...
    Yes

A handwavey definition of a REPL could be something like this:

> A REPL is a program that lets you type in code.  It runs that code, prints out
> the result, and loops back to the beginning to let you type in more.

This is a good first start, but to understand symbols we're going to need to get
a much clearer, more precise handle on what exactly each letter in "REPL" means.
We'll start backwards and outside-in with the simplest component.

## Loop

The "L" in "REPL" stands for "Loop", and it's the easiest part to understand.
A REPL doesn't just process one bit of code and then exit, it loops forever (or
until you tell it to quit).

There's not a lot to say here, except that we'll go ahead and lump together the
extra busywork a REPL needs to do into this step so we can ignore it for the
rest of the post.  This includes things like:

* Printing the nice prompt at the beginning (`>>>` or `...` in Python).
* Checking for `Ctrl-D` to quit.
* Handling exceptions (so they don't kill the process).

## Print

The next letter is "P" for "Print".  Like `loop` this should be familiar
territory, but let's nail down *exactly* what it does.  The job of the `print`
function is to:

* Take as input some kind of object/data structure in memory.
* Produce some series of characters (a string) that represent this object.
* Write those characters somewhere (standard output, a text panel in a GUI, etc).

For example: if we give the integer `1` to Python's `print`, it produces
a string of two characters (the digit `1` and a newline) and writes it to the
terminal.  There are a lot of little details that aren't important here (line
endings, string encoding, where the output actually goes, etc).  The important
part that you should fix in your mind is this:

**`print` takes an object, converts it into a string, and outputs it.**

Note: I've called `print` a function here, and I'll do the same for `read` and
`eval` later on.  This is correct for Common Lisp, but not necessarily for
languages like Python or Javascript.  For the purposes of this post you can just
imagine them as functions and realize that in the real world they're more like
"phases" that have a lot more complex machinery under them.

## Read

We'll jump over to the other side of the acronym now.  "R" is for "Read", and
this is where things start to get tricky.

In most non-Lisp languages the "read" and "eval" phases of the REPL tend to get
blurred together, but if you want to understand symbolic computation in Lisp you
absolutely *must* keep these two parts cleanly separated in your mind.

We can define `read` as almost the polar opposite of `print`.  `read` is
a function whose job is to:

* Take as input a string of characters from some source.
* "Decode" that string and produce some kind of object in memory.
* Return that object.

Notice how this mirrors `print`:

* `print`: Object → Characters
* `read`: Characters → Object

This is the second definition you need to fix in your brain:

**`print` takes an object, converts it into a string, and outputs it.**  
**`read` takes a string, converts it into an object, and returns it.**

If we `read` in the string `1.0` (that's three ASCII characters), we'll get some
kind of floating point number object back.  Sometimes there are many ways to
represent the same object in memory.  For example: `read`ing the strings `1.0`
and `1.0000` will both produce equivalent floating point objects representing
the number "one".

Side note (if you're not feeling pedantic feel free to skip this): the word
"equivalent" here is a bit tricky.  When you `read` two strings like `1.0` and
`1.000` you might get back two pointers to the same hunk of memory, or you might
get back two pointers to two separate hunks of memory, or even just a machine
word representing the floating point number directly.  It depends on the
language (and even the implementation).  What I mean by "equivalent" is that
there's no way to tell which value came from which input.  You can't say "this
hunk of memory must have come from the string `1.000`" because `read` has
"sanitized" the input by that point.

## The RPL

So now we've got `read`, `print`, and `loop`.  `read` takes characters from
a stream and produces objects, and `print` takes objects and produces characters
and writes them to a stream.  We're still missing a letter, but what would
happen if we just hook these three up together right now?

    :::text
    CL-USER> (loop (print (read)))
    > 1
    1
    >    1.0
    1.0
    > 1.00000
    1.0

I've added a little `>` prompt to the output here to make it clear where I'm
typing.

Essentially what we've got here is a little pretty-printer!  Notice how things
like extra whitespace and useless zeroes got stripped out.  This isn't terribly
useful, but at least it's doing *something*.

However, it's important to realize where the "prettifying" actually happened:
`read` actually did the hard work!  By the time `print` got its grubby little
paws on those one-point-zeroes the extra whitespace and digits were long gone.
Maybe we should give credit where it's due and call this a "pretty-reader"
instead.

## Reading Other Data

Up to this point I've been using numbers as examples because they're simple, but
of course we can read and print lots of other kinds of things.  Lists and
strings (and lists *of* strings!) are some common examples:

    :::text
    CL-USER> (print (read))
    > ("hello"       "world")
    ("hello" "world")

    $ python
    >>> [    "hello"  , "world"]
    ['hello', 'world']

There are lots of other kinds of things you can read and print, depending on the
particular language.  What's important here is the strict separation of `read`
and `print`.

## Readable Printing

Before we move on, one thing we should talk about is the idea of "readable
printing".  This idea becomes clear almost immediately when you work at the
Python REPL:

    :::text
    $ python
    >>> "foo\nbar"
    'foo\nbar'
    >>> print "foo\nbar"
    foo
    bar

Why was the first string printed with quotes and an escaped newline, and the
second without quotes and an actual newline?

When we defined `print` earlier, one of the steps was:

* Produce some series of characters (a string) that represent this object.

Much like there are many character strings that result in the same object (e.g.
`1.0` and `1.000`) there are often many ways we can represent a particular
object as a series of characters.  For example: we can represent the number two
as `2`, or `10` in binary, or `two` in English.  We can represent a string as
the actual characters in it, or we could represent it "escaped" like Python does
in the first line.

The core idea here is that of **"readable printing": printing the characters you
would need to give to `read` to produce this object**.

This is different from what I'll call **"aesthetic printing": printing
characters for this object which look nice to a human**.

In Python these two kinds of printing are seen in the separate [`__str__` and
`__repr__` methods][str_repr].  In Common Lisp it's controlled by the
[`*print-readably*` variable][print-readably].  Other languages have different
mechanisms.

[str_repr]: https://brennerm.github.io/posts/python-str-vs-repr.html
[print-readably]: http://clhs.lisp.se/Body/v_pr_rda.htm

## Reading Non-Lisp

So we know how reading and printing *data* works, but you can do more than that
at a typical REPL.  Let's revisit the first Python example:

    :::text
    $ python
    >>> if True:
    ...     print "Yes"
    ... else:
    ...     print "No"
    ...
    Yes

We know that `read` takes a series of characters and turns them into an object.
When we `read` this series of characters, what comes out the other end?  For
most languages (Python, Ruby, Javascript, Scala, etc) the answer is some kind of
[abstract syntax tree][] represented as a data structure.  You might imagine
something like this buried deep within the guts of the Python interpreter:

    :::python
    class IfStatement(ASTNode):
        def __init__(self, test_node, then_node, else_node):
            self.test_node = test_node
            self.then_node = then_node
            self.else_node = else_node

The details would surely be a lot hairier in the real world, but the general
idea is that the stream of characters representing your code gets read in and
parsed into some kind of data structure in memory.

This is all well and good, but our example printed "Yes", which certainly isn't
a representation of an AST node.  Where did that come from?  To answer that
we'll need the final letter in the acronym.

[abstract syntax tree]: https://en.wikipedia.org/wiki/Abstract_syntax_tree

## Evaluating Non-Lisp

The "E" in REPL stands for "Eval".

This is the last time we'll consider Python in this post, because we're about to
move past it to more powerful ideas.  In Python, the purpose of the `eval` phase
of the REPL is to:

* Take an AST node (or simple data) as input.
* Run the code (also known as *evaluating* it).
* Return the result (if any).

This is fairly straightforward, at least for our simple `if` statement.  Python
will evaluate the `test_node` part of the `if`, and then evaluate one of the
other parts depending on whether the test was truthy or falsey.

Of course the actual process of evaluating real-life Python code is much more
complicated than I've glossed over here, but this is good enough for us to use
as a springboard to dive into the beautiful world of Lisp.

## Reading Lisp

When we talked about reading the Python `if` statement I said that Python would
parse it into some kind of data structure representing an AST node.  I didn't
point you at the *actual* data structure in the Python source code because
I have no idea where it actually *is*, and it's probably not nearly as simple as
the one I sketched out.

Now that we're talking about Lisp, however, I *really can* show you exactly what
an `if` statement in Lisp reads as!  I'll use `read-from-string` to make it
a bit easier to see what's input and what's output, but it would work the same
way with just plain old `read` and input from standard in.  Here we go:

    :::text
    CL-USER> (print
               (read-from-string "(if t (print 1) (print 2))"))

    (IF T (PRINT 1) (PRINT 2))

    CL-USER> (type-of
               (read-from-string "(if t (print 1) (print 2))"))

    CONS

Instead of some kind of AST node data structure buried deep within the heart of
our implementation we get... a list!  An ordinary, garden-variety, vanilla list
made of good old trusty cons cells!

The list we get contains four elements, two of which are themselves lists.  But
what do these lists actually *contain*?

The numbers are simple enough -- we saw them earlier.  `read` turns a series of
digits into a number somewhere in memory, and `print` spits out a series of
characters that represent the number:

    :::text
    CL-USER> (type-of (read-from-string "100"))

    (INTEGER 0 4611686018427387903)

The string `100` does indeed represent an integer between zero and four
squillion-something.

But what are those *other* things in our list?  The `if` and `t` and `print`?
Are they some magic keywords internal to the guts of our Lisp that `print` is
just outputting nicely for our benefit?

    :::text
    CL-USER> (type-of (read-from-string "if"))

    SYMBOL

Buckle up -- it's time to learn about symbols.

## Symbols

A symbol in Lisp is [a simple data structure][clhs-symbol] that turns out to be
*extremely* useful.  You can imagine it being defined something like this:

    :::text
    (defstruct symbol
      name
      function
      value
      property-list
      package)

The `name` of a symbol is a string of characters representing the symbol's name:

    :::text
    CL-USER> (type-of (read-from-string "if"))

    SYMBOL

    CL-USER> (symbol-name (read-from-string "if"))

    "IF"

Symbols get created (usually) when they are `read` for the first time.  If you
`read` a sequence of characters that matches the name of an existing symbol
(let's ignore the lowercase/uppercase thing for now), `read` just returns that
existing symbol:

    :::text
    CL-USER> (eq (read-from-string "cats")
                 (read-from-string "cats"))

    T

Here `read` saw the string `cats` and created a new symbol somewhere in memory
with `CATS` as its `name`.  The second call to `read-from-string` read in the
string `cats`, saw an existing symbol with the matching name, and just returned
it straight away.  That's why `eq` returned true -- the two `read` calls
returned pointers to the exact same hunk of memory.

(`read` also "interned" the symbol into the current package.  We're going to
ignore the package system for this post -- it's not important to the core
topic.  We'll also ignore the property list.)

So to recap:

* `read` is a function that takes characters and turns them into objects.
* When you feed `read` certain strings of characters (i.e. without special
  characters), it returns a symbol object whose name matches those characters
  (creating it if necessary).
* Symbols are just normal Lisp data structures with a few specific fields.

So what does `print` do with a symbol?  It just prints the `name`!

    :::text
    CL-USER> (print (read-from-string "cats"))

    CATS

Simple enough.  Note that this is *readably printed* -- if we feed it back to
`read` we'll get an equivalent symbol back (the *exact same* symbol, in fact).

But wait!  If we `read` symbols in and they're just normal vanilla structures,
then how do we actually *do* anything in Lisp?  How does anything actually
*run*?

[clhs-symbol]: http://www.lispworks.com/documentation/lw70/CLHS/Body/t_symbol.htm

## Evaluating Lisp

Now we need to go back to `eval`.  In the Python example I handwaved a bit and
said that `eval` "ran the code" but didn't really explain how that happened.
Now that we're in the warm embrace of Lisp we can nail things down a lot more
cleanly.

`eval` (in Lisp) is a function whose job is to:

* Take an object.
* Do something useful with it to get another object.
* Return the resulting object.

I know, I know, that seems as clear as a brick.  But if we can actually define
what "something useful" means it will be a lot more helpful.  Lisp's `eval`
follows a few simple rules to produce its output.

### Strings and Numbers

First of all, `eval` will just pass numbers and strings straight through without
touching them at all:

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ; ...
        (t (error "What the hell is this?"))))

This makes sense because there's not a lot more useful stuff you can do with
a number or a string on its own.  I suppose you could write a language where the
evaluation phase ran all numbers through the absolute value function because you
wanted to remove all negativity from programming or something, but that seems
a bit esoteric.

### Basic Lists

What about lists, like our friend the `if` statement (a list of four things)?
Well the first thing anyone learns about Lisp is the funny syntax.  In most
cases, when passed a list `eval` treats the first element as the name of
a function, and the rest as the arguments.  It will evaluate the arguments and
call the function with them:

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ((typep thing 'list)
         (destructuring-bind (head . arguments) thing
           (apply (symbol-function head)
                  (mapcar #'eval arguments))))
        ; ...
        (t (error "What the hell is this?"))))

At this point we know enough to understand exactly what happens when we say
something like `(+ 1 2)` at a Lisp REPL!

* Lisp `read`s the seven characters we gave it, which becomes a list of three
  objects: the symbol `+`, the integer `1`, and the integer `2`.
* Lisp passes that three-object list to `eval`.
* `eval` looks up what is in the `function` slot of the symbol at the front of
  the list (that symbol is `+`).
* The symbol `+` was helpfully given a function by the Lisp implementation (run
  `(symbol-function (read-from-string "+"))` to confirm that I'm not lying to
  you).
* Lisp evaluates each item in the rest of the list.  In this case they're each
  a number, so each one evaluates to itself.
* Lisp calls the function with the evaluated arguments to get a result (in this
  case: the integer `3`).
* Lisp passes the result to `print`, which turns it into a string of characters
  (one character, in this case) and prints it to standard out.

Whew!  That seems like a lot, but each piece is pretty small.  Go slowly through
it and make sure you *really* understand each step in the preceding list before
you move on.

If you don't believe me that this is really how it works, try this:

    :::text
    CL-USER> (setf (symbol-function (read-from-string "square"))
                   (lambda (x) (* x x)))

    #<FUNCTION (LAMBDA (X)) {1005DCF2FB}>

    CL-USER> (square 4)

    16

We set the `function` slot of the symbol `square` to be a function that squares
its argument, and then we can run it like any other function.

Trace through the evaluation of that last `(square 4)` call before moving on.

### Variables

What about symbols that *aren't* at the front of a list?  For example:

    :::text
    CL-USER> (defvar ten 10)

    10

    CL-USER> (square ten)

    100

In this case, everything proceeds as it did before until the recursive call to
`eval` the argument to `square`.  The argument is just a bare symbol (not
a number or a list or a string), so we need another rule in our `eval` function:

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ((symbolp thing) (symbol-value thing)) ; NEW
        ((listp thing)
         (destructuring-bind (head . arguments)
           (apply (symbol-function head)
                  (mapcar #'eval arguments))))
        ; ...
        (t (error "What the hell is this?"))))

If `eval` sees a symbol on its own, it will just return whatever is in the
`value` slot of the symbol data structure.  Simple enough!

### Special Forms
If course there are a few things we haven't talked about yet.  First of all:
there are some "magic" symbols that aren't functions, but are hardcoded into the
evaluation function to do certain things.  In Common Lisp they're called
["special operators"][special-operators] and there are twenty-five of them.

Let's add a special rule to our `eval` function to handle one of them (our old
friend `if`):

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ((symbolp thing) (symbol-value thing))
        ((listp thing)
         (destructuring-bind (head . arguments) thing
           (cond
             ((eq head (read-from-string "if")) ; NEW
              (destructuring-bind (test then else) arguments
                (if (eval test)
                  (eval then)
                  (eval else))))
             (t
              (apply (symbol-function head)
                     (mapcar #'eval arguments))))))
        ; ...
        (t (error "What the hell is this?"))))

At this point your head might be starting to hurt.  How have we just used `if`
to define itself?  Is anything here actually *real*, or is Lisp just a serpent
eating its own tail forever?

### An Ouroborromean Paradox?

If the "metacircularity" (god, I hate that word) of what we've just done bothers
you, remember that this `eval` function here is just a sample of how you *can*
write it in Lisp.  You could also write it in some other language, like C or
Java (see: [ABCL][]) or really anything.  That would let you "bottom out" to
something that's not Lisp.

But really, is that any *less* circular?  You still need *some* language to run
the thing!  Why does an `eval` written in C seem less "cheaty" than one written
in Lisp?  Because it compiles to machine code?  Try
`(disassemble #'sb-impl::simple-eval-in-lexenv)` in SBCL and you'll get a few
thousand bytes of x86 assembly that was [originally some Lisp code much like
ours][sbcl-eval].

My advice is to just let it flow over you, even if you feel a bit uneasy at the
sight of something like this.  Eventually as you become more at home in your
chosen Lisp, writing high-level code one moment and `disassemble`ing a function
the next, you'll make peace with the serpent.

### Exercises

There are quite a few things in `eval` that we haven't covered here, like the
other twenty-four special operators, lexical variables, lambda forms, and more.
But this should be enough for you to wrap your head around symbols and how
they're used in Lisp.

Before we move on to the final part of this story, make sure you can figure out
the answers to these questions:

1. What does the single character `1` read as?  What does it evaluate to?
2. What do the characters `"hello\"world"` read as?  What do they evaluate to?
   How would the result be printed readably?  How would the result be printed
   aesthetically?
3. What do the characters `(list (+ 1 2) ten)` read as?  What does it evaluate
   to?  How does each step of the evaluation work?  Trace out each of the calls
   to `eval` on paper, with their arguments and their results.  What does the
   final result print as?
4. Use `symbol-function` to see what's in the function slot of the symbol `car`.
   How does your Lisp print that thing?
5. Try to look up the function slot of a symbol without a function.  What
   happens?
4. Use `symbol-function` to see what's in the function slot of the symbol `if`.
   What happens?  Does that seem right?  If so, why?
5. Go to the [HyperSpec page for `symbol`][clhs-symbol] and search for
   "the consequences are undefined if".  Try all of those things and find out
   what interesting things break in your Lisp.

If you're not using Common Lisp, port these exercises to your language.

[special-operators]: http://www.lispworks.com/documentation/HyperSpec/Body/03_ababa.htm
[ABCL]: https://common-lisp.net/project/armedbear/
[sbcl-eval]: https://github.com/sbcl/sbcl/blob/fdc4e9fa86b5eaaf8939f004a66e4be075069aa8/src/code/eval.lisp#L131-L272

## Quote

Up to now we've been using the clumsy `(read-from-string "foo")` form to get
ahold of symbols, but there's an easier way called "quoting".

Quoting often confuses new Lisp programmers, and they'll end up randomly
adding/removing quotes to something until it appears to do what they want.  Now
that you've got `read`, `eval`, and `print` firmly separated in your mind,
`quote` will be much easier to understand.

### The Special Operator

Quoting actually has two distinct components: one that affects `read` and one
that affects `eval`.  In most books the distinction between these two only gets
barely a sentence or two, which is probably why beginners are so confused.
Let's look at the `eval` side of things first.

`quote` is one of the twenty-five special operators.  All it does it pass its
(single) argument back from `eval` *untouched*.  Normally the argument itself
would have been evaluated, but `quote` prevents that:

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ((symbolp thing) (symbol-value thing))
        ((listp thing)
         (destructuring-bind (head . arguments) thing
           (cond
             ((eq head (read-from-string "if"))
              (destructuring-bind (test then else) arguments
                (if (eval test)
                  (eval then)
                  (eval else))))
             ((eq head (read-from-string "quote")) ; NEW
              (first arguments))
             (t
              (apply (symbol-function head)
                     (mapcar #'eval arguments))))))
        ; ...
        (t (error "What the hell is this?"))))

I know that seems strange, but let's look at it in action:

    :::text
    CL-USER> (+ 1 2)

    3

    CL-USER> (quote (+ 1 2))

    (+ 1 2)

Our first example is the same one we used in the previous section:

* `read` takes in seven characters and returns a list of three items: the symbol
  `+` and two numbers.
* `eval` gets the three-element list, looks up the function for the symbol `+`,
  recursively evaluates the arguments (which evaluate to themselves), and
  applies the function.
* `print` gets the integer `3`, turns it into ASCII, and writes it to the
  terminal.

The second example uses `quote`:

* `read` takes in fifteen characters and returns a list of two items: the symbol
  `quote` and a list of three items (the symbol `+` and two numbers).
* `eval` gets the three element list and notices that the first element is the
  special operator `quote`.  It simply returns the second element in the list
  untouched, which in this case is a list of three items.
* `print` gets that three element list, turns it into seven characters, and
  writes it to the terminal.

Just for fun, let's ride the serpent and use `quote` in the definition of
itself inside `eval` to clean things up a bit:

    :::lisp
    (defun eval (thing)
      (cond
        ((numberp thing) thing)
        ((stringp thing) thing)
        ((symbolp thing) (symbol-value thing))
        ((listp thing)
         (destructuring-bind (head . arguments) thing
           (cond
             ((eq head (quote if))
              (destructuring-bind (test then else) arguments
                (if (eval test)
                  (eval then)
                  (eval else))))
             ((eq head (quote quote)) ; WHAT IN THE
              (first arguments))
             (t
              (apply (symbol-function head)
                     (mapcar #'eval arguments))))))
        ; ...
        (t (error "What the hell is this?"))))

That `(quote quote)` looks just completely bonkers, right?  Step through it
piece by piece to figure it out:

* What do the five characters `quote` read as?  What do they evaluate to?
* What do the thirteen (spooky!) characters `(quote quote)` read as?  When
  `eval` gets its hands on that, what clause in the `cond` does it hit?  What
  does that return?

Try these in a Lisp REPL if you're not 100% sure of the answers.  The functions
`read-from-string`, `eval`, and `type-of` will come in handy.

### The Read Macro

Now let's look at the final piece of the puzzle.  `'` (that's a single ASCII
quote character) is a [read macro][].  Don't worry if you don't know exactly
what that means right now.  The important part is that when `read` sees a quote
character, it returns a two-element list containing the symbol `quote` and the
next form, which is read normally.

The four characters `'foo` go into `read` and come out as a two-element list
containing: the symbol `quote` and the symbol `foo`.  Again, because it's
important: this happens *at read time*.  By the time `eval` sees anything it's
already been turned into that two-element list.

It can be a bit slippery to get a grip on this, because `print` will "helpfully"
print a list of `(quote whatever)` as `'whatever`!  But we can use trusty old
`car` and `cdr` to see what's really going on:

    :::lisp
    CL-USER> (read-from-string "'foo")

    'FOO

    CL-USER> (type-of (read-from-string "'foo"))

    CONS

    CL-USER> (car (read-from-string "'foo"))

    QUOTE

    CL-USER> (cdr (read-from-string "'foo"))

    (FOO)

That's really all there is to it: `'form` is returned by `read` as `(quote
form)`.  Then that gets passed along to `eval`, which sees the `quote` special
operator and passes the argument back untouched.

Note that `form` doesn't have to be a symbol:

* `'(1 2 (4 5))` reads as `(quote (1 2 (4 5)))`
* `'150` reads as `(quote 150)`
* `'"hello"` reads as `(quote "hello")`
* `'(if if if if)` reads as `(quote (if if if if))`

[read macro]: https://gist.github.com/chaitanyagupta/9324402

### Exercises

That pretty much wraps up `quote`.  Once you've cleanly separated `read`,
`eval`, and `print` in your mind it becomes a lot less mystical (or perhaps
*more*, depending on your point of view).

If you want to twist your brain just a bit more, try these exercises:

1. What do the characters `foo` read in as?  What do they evaluate to?  How does it print?
2. What do the characters `'foo` read in as?  What do they evaluate to?  How does it print?
3. What do the characters `''foo` read in as?  What do they evaluate to?  How does it print?
4. What do the characters `'1.0` read in as?  What do they evaluate to?  How does it print?
5. What do the characters `'(1 2 3)` read in as?  How many elements are in the list?  What does it evaluate to?  How many elements are in *that* list?
6. Update our implementation of `eval` to use the `'` read macro instead of `quote`.
7. Certain symbols like `t` and `nil` are special -- they evaluate to themselves like numbers or strings.  Add this to `eval`.
8. What do the characters `nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
9. What do the characters `'nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
10. What do the characters `''nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?

## A Quick Recap

If you've gotten this far and understood everything, you should have a good
grasp of how symbolic computation works.  Here are the main things I hope you've
taken away from this post:

* `read`, `eval`, and `print` are three separate, distinct functions/phases in
  the interpretation of Lisp (and really *all* (okay, *most*)) code.
* Keeping these three phases *clearly separated* in your mind will make it
  easier to understand symbols and quoting in Lisp.
* Symbols are nothing magic, they're just a particular data structure with
  a couple of fields.
* The important thing is how `read` and `eval` *treat* symbols.  That's where
  the magic happens.
* `quote` is just a "short-circuit" in `eval` that turns out to be really
  useful.
* `'` is just a lazier way of writing `(quote ...)`.

## Where to Go From Here

Now that you've got a firmer grasp on what the hell symbols actually *are*,
there are a lot of cool things you might want to check out that will show you
what you can do with them:

* Reread some of your introductory Lisp books again, seeing if things seem to
  make a bit more sense now.
* Read about backquote/quasiquote.  How does it compare to the `quote` we
  explored here?
* Read [Paradigms of Artificial Intelligence Programming][PAIP] and pay
  attention to how Norvig uses symbols for various tasks.  Also pay attention to
  what he says about property lists, because we skipped them here.
* Read the [guide to the Common Lisp package system][packaging] to learn all the
  stuff I left out about the Common Lisp package system.
* Gaze upon [this macro][symbolicate] in wonder (or horror).
* Read the HyperSpec pages for things like `symbol` and `quote`.
* Find and read your Lisp implementation's `eval` function.
* Find and read *another* Lisp implementation's `eval` function.
* Watch the [SICP Lectures][SICP] on YouTube.

[PAIP]: http://www.amazon.com/dp/1558601910/?tag=stelos-20
[symbolicate]: https://github.com/zkat/squirl/blob/f0dc57dfee728df94b2a35956e18e143e8b8d275/src/vec.lisp#L28-L38
[packaging]: https://www-fourier.ujf-grenoble.fr/~sergerar/Papers/Packaging.pdf
[SICP]: http://www.amazon.com/dp/0262510871/?tag=stelos-20

    {% endblock article %}