hg.stevelosh.com > stevelosh.com

--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2016/06/symbolic-computation.html	Sun Jun 26 01:01:44 2016 +0000
@@ -0,0 +1,852 @@
+    {% extends "_post.html" %}
+
+    {% hyde
+        title: "What the Hell is Symbolic Computation?"
+        snip: "Symbols, REPLs, and Quoting -- Oh My!"
+        created: 2016-06-29 13:45:00
+    %}
+
+    {% block article %}
+
+I've been reading a lot of Lisp books lately, some more advanced than others.
+All of the introductory books I've seen cover the idea of "Symbolic Computation"
+to some extent, but most of them breeze past it pretty quickly with just a few
+pages about "symbols" and "quote".
+
+For most programmers coming from non-Lisp languages, these are new, foreign
+concepts that don't really map back to anything in their previous experience.
+But the books seem to expect you to understand this things after just a couple
+of paragraphs.
+
+One book that *does* spend more time on this is the appropriately-named [Common
+Lisp: A Gentle Introduction to Symbolic Computation][gentle].  If you're looking
+for a good introductory Lisp book, that's the one I'd recommend.  However, it's
+quite a long book and starts at the *very* basics.  If you already know the some
+Lisp then you might find yourself bored if you go through the entire thing just
+for the symbolic computation bits.
+
+So this post will be an attempt to explain what symbols are, what quoting does,
+and why it's all interesting and useful.  If you already know what something
+like this would print, you'll be bored by this post:
+
+    :::text
+    (let ((foo ''a))
+      (print foo)
+      (print 'foo)
+      (print (car foo))
+      (print (cdr foo)))
+
+[gentle]:
+
+[TOC]
+
+## Disclaimer
+
+Before we start, I'll get this warning out of the way: this post is going to
+gloss over a lot of gory details to get at the important core ideas.
+
+When I say "this thing is implemented like so" it's safe to assume that in
+reality things are messier and more complicated and will vary between dialects
+like Common Lisp, Scheme, and Clojure.  My goal is to give you some intuition
+for the basic ideas, not to exhaustively cover how a particular implementation
+works.
+
+With that out of the way, let's dive in!
+
+## The REPL
+
+The Read-Eval-Print-Loop that so many languages these days have built-in is
+a good place to start in our exploration.  You're probably familiar with the
+idea of a REPL from languages like Python, Ruby, Javascript, etc.  In Python it
+looks like this:
+
+    :::text
+    $ python
+    >>> if True:
+    ...     print "Yes"
+    ... else:
+    ...     print "No"
+    ...
+    Yes
+
+A handwavey definition of a REPL could be something like this:
+
+> A REPL is a program that lets you type in code.  It runs that code, prints out
+> the result, and loops back to the beginning to let you type in more.
+
+This is a good first start, but to understand symbols we're going to need to get
+a much clearer, more precise handle on what exactly each letter in "REPL" means.
+We'll start backwards and outside-in with the simplest component.
+
+## Loop
+
+The "L" in "REPL" stands for "Loop", and it's the easiest part to understand.
+A REPL doesn't just process one bit of code and then exit, it loops forever (or
+until you tell it to quit).
+
+There's not a lot to say here, except that we'll go ahead and lump together the
+extra busywork that a REPL needs to do into this step so we can ignore it for
+the rest of the post.  Things like:
+
+* Printing the nice prompt at the beginning (`>>>` or `...` in Python).
+* Checking for `Ctrl-D` to quit.
+* Handling exceptions (so they don't kill the process).
+
+## Print
+
+The next letter is "P" for "print".  Like loop, this should be familiar
+territory, but let's nail down *exactly* what it does.  The job of the `print`
+function is to:
+
+* Take as input some kind of object/data structure in memory.
+* Produce some series of characters (a string) that represent this object.
+* Write those characters somewhere (standard output, a text panel in a GUI, etc).
+
+For example: if we give the integer `1` to Python's `print`, it produces
+a string of two characters (the digit `1` and a newline) and writes it to the
+terminal.  There are a lot of little details that aren't important here (line
+endings, string encoding, where the output actually goes, etc).  The important
+part that you should fix in your mind is this:
+
+**`print` takes an object, converts it into a string, and outputs it.**
+
+Note: I've called `print` a function here, and I'll do the same for `read` and
+`eval` later on.  This is correct for Common Lisp, but not necessarily for
+languages like Python or Javascript.  For the purposes of this post you can just
+imagine them as functions and realize that in the real world they're more like
+"phases" that have a lot more complex machinery under them.
+
+## Read
+
+We'll jump over to the other side of the acronym now.  "R" is for "read", and
+this is where things start to get tricky.
+
+In most non-Lisp languages the "read" and "eval" phases of the REPL tend to get
+blurred together, but if you want to understand symbolic computation in Lisp you
+absolutely *must* keep these two parts cleanly separated in your mind.
+
+We can define `read` as almost the polar opposite of `print`.  `read` is
+a function whose job is to:
+
+* Take as input a string of characters from some source.
+* "Decode" that string and produce some kind of object in memory.
+* Return that object.
+
+Notice how this mirrors `print`:
+
+* `print`: Object → Characters
+* `read`: Characters → Object
+
+If we `read` in the string `1.0` (that's three ASCII characters), we'll get some
+kind of floating point number object back.
+
+Sometimes there are many ways to represent the same object in memory.  For
+example: `read`ing the strings `1.0`, and `1.0000` will (in just about any sane
+language) both produce equivalent floating point objects representing the number
+"one".
+
+Side note (if you're not feeling pedantic feel free to skip this): the word
+"equivalent" here is a bit tricky.  When you `read` two strings like `1.0` and
+`1.000` you might get back two pointers to the same hunk of memory, or you might
+get back two pointers to two separate hunks of memory, or even just a machine
+word representing the floating point number directly.  It depends on the
+language (or even the implementation).  What I mean by "equivalent" is that
+there's no way to tell which value came from which input.  You can't say "this
+hunk of memory must have come from the string `1.000`" because `read` has
+"sanitized" the input by that point.
+
+## The RPL
+
+So now we've got `read`, `print`, and `loop`.  `read` takes characters from
+a stream and produces objects, and `print` takes objects and produces characters
+and writes them to a stream.  What would happen if we just hook them up together
+right now?
+
+    :::text
+    CL-USER> (loop (print (read)))
+    > 1
+    1
+    >    1.0
+    1.0
+    > 1.00000
+    1.0
+
+I've added a little `>` prompt to the output here to make it clear where I'm
+typing.
+
+Essentially what we've got here is a little pretty-printer!  Notice how things
+like extra whitespace and useless zeroes got stripped out.  This isn't terribly
+useful, but at least it's doing something.
+
+However, it's important to realize where the "prettifying" actually happened.
+`read` actually did the hard work!  By the time `print` got its grubby little
+paws on those floating point one-point-zeroes the extra whitespace and digits
+were long gone.  Maybe it would be more accurate to give credit where it's due
+and call this a "pretty-reader".
+
+## Reading Other Data
+
+Up to this point I've been using numbers as examples because they're simple, but
+of course we can read and print lots of other kinds of things.  Lists and
+strings (and lists *of* strings!) are some common examples:
+
+    :::text
+    CL-USER> (print (read))
+    > ("hello" "world")
+    ("hello" "world")
+
+    $ python
+    >>> ["hello", "world"]
+    ['hello', 'world']
+
+There are lots of other kinds of data you can read and print, which of course is
+highly dependent on the particular language.  What's important is the strict
+separation of `read` and `print`.
+
+## Readable Printing
+
+Before we move on, one thing we should talk about is the idea of "readable
+printing".  This idea becomes clear almost immediately when you work at the
+Python REPL:
+
+    :::text
+    $ python
+    >>> "foo\nbar"
+    'foo\nbar'
+    >>> print "foo\nbar"
+    foo
+    bar
+
+Why was the first string printed with quotes and an escaped newline, and the
+second without quotes and an actual newline?
+
+When we defined `print` earlier, one of the steps was:
+
+* Produce some series of characters (a string) that represent this object.
+
+But there are often many ways we can represent a particular object as a series
+of characters.  For example: we can represent the number two as `2`, or `10` in
+binary.  We can represent a string as the actual characters in it, or we could
+represent it "escaped" like Python does in the first line.
+
+The core idea here is that of "readable printing", or *printing the characters
+you would need to give to `read` to produce this object*.
+
+This is different from what I'll call "aesthetic printing", which is *printing
+characters for this object that look nice to a human*.
+
+In Python these two kinds of printing are seen in the separate [`__str__` and
+`__repr__` methods][str_repr].  In Common Lisp it's controlled by the
+[`*print-readably*` variable][print-readably].  Other language have different
+mechanisms.
+
+[str_repr]:
+[print-readably]:
+
+## Reading Non-Lisp
+
+So we know how reading and printing data works, but you can do more than that at
+a typical REPL.  Let's revisit the first Python example:
+
+    :::text
+    $ python
+    >>> if True:
+    ...     print "Yes"
+    ... else:
+    ...     print "No"
+    ...
+    Yes
+
+We know that `read` takes a series of characters and turns them into an object.
+When we `read` this series of characters, what comes out the other end?  For
+most languages (Python, Ruby, Javascript, Scala, etc) the answer is some kind of
+[abstract syntax tree][] represented as a data structure.  You might imagine
+something like this buried deep within the guts of the Python interpreter:
+
+    :::python
+    class IfStatement(ASTNode):
+        def __init__(self, test_node, then_node, else_node):
+            self.test_node = test_node
+            self.then_node = then_node
+            self.else_node = else_node
+
+The details would surely be a lot hairier in the real world, but the general
+idea is that the stream of characters representing your code gets read in and
+parsed into some kind of data structure in memory.
+
+This is all well and good, but our example printed "Yes", which certainly isn't
+a representation of an AST node.  Where did that come from?  To answer that
+we'll need the final letter in the acronym.
+
+[abstract syntax tree]:
+
+## Evaluating Non-Lisp
+
+The "E" in REPL stands for "eval".
+
+This is the last time we'll consider Python in this post, because we're about to
+move past it to more powerful ideas.  In Python, the purpose of the `eval` phase
+of the REPL is to:
+
+* Take an AST node (or simple data) as input.
+* Run the code (also known as *evaluating* it).
+* Return the result (if any).
+
+This is fairly straightforward, at least for our simple `if` statement.  Python
+will evaluate the `test_node` part of the `if`, and then evaluate one of the
+other parts depending on whether the test was truthy or falsey.
+
+Of course the actual process of evaluating real-life Python code is much more
+complicated than I've glossed over here, but this is good enough for us to use
+as a springboard to dive into the beautiful world of Lisp.
+
+## Reading Lisp
+
+When we talked about reading in the Python `if` statement I said that Python
+would read/parse it into some kind of data structure representing an AST node.
+I didn't point you at the *actual* data structure in the Python source code
+because I have no idea where it actually *is*, and it's probably not nearly as
+simple as the one I sketched out.
+
+Now that we're talking about Lisp, however, I really *can* show you exactly what
+an `if` statement in Lisp reads as!  I'll use `read-from-string` to make it
+a bit easier to see what's input and what's output, but it would work the same
+way with just plain old `read` and input from standard in.  Here we go:
+
+    :::text
+    CL-USER> (print (read-from-string "(if t (print 1) (print 2))"))
+
+    (IF T (PRINT 1) (PRINT 2))
+
+    CL-USER> (type-of (read-from-string "(if t (print 1) (print 2))"))
+
+    CONS
+
+Instead of some kind of AST node data structure buried deep within the heart of
+our implementation we get... a list!  An ordinary, garden-variety, vanilla list
+made of good old trusty cons cells!
+
+The list we get contains four elements, two of which are themselves lists.  But
+what do these lists actually *contain*?
+
+The numbers are simple enough -- we saw them earlier.  `read` turns a series of
+digits into a number somewhere in memory, and `print` spits out a series of
+characters that represent the number:
+
+    :::text
+    CL-USER> (type-of (read-from-string "100"))
+
+    (INTEGER 0 4611686018427387903)
+
+The string `100` does indeed represent an integer between zero and four
+squillion-something.
+
+But what are the other things in our `if` statement's list?  The `if` and `t`
+and `print`?  Are they some magic keywords internal to the guts of our Lisp that
+`print` is just outputting nicely for our benefit?
+
+    :::text
+    CL-USER> (type-of (read-from-string "if"))
+
+    SYMBOL
+
+Buckle up -- it's time to learn about symbols.
+
+## Symbols
+
+A symbol in Lisp is [a pretty basic data structure][clhs-symbol] that turns out
+to be *extremely* useful.  You can imagine it to be defined something like this:
+
+    :::text
+    (defstruct symbol
+      name
+      function
+      value
+      property-list
+      package)
+
+The `name` of a symbol is a string of characters representing the symbol's name:
+
+    :::text
+    CL-USER> (type-of (read-from-string "if"))
+
+    SYMBOL
+
+    CL-USER> (symbol-name (read-from-string "if"))
+
+    "IF"
+
+Symbols get created (usually) when they are `read` for the first time.  If you
+`read` a sequence of characters that matches the name of an existing symbol
+(let's ignore the lowercase/uppercase thing for now), `read` just returns that
+existing symbol:
+
+    :::text
+    CL-USER> (eq (read-from-string "cats") (read-from-string "cats"))
+
+    T
+
+Here `read` saw the string `cats` and created a new symbol somewhere in memory
+with `CATS` as its `name`.  The second call to `read-from-string` read in the
+string `cats`, saw an existing symbol with the matching name, and just returned
+it straight away.  That's why `eq` returned true -- the two `read` calls
+returned pointers to the exact same hunk of memory.
+
+(`read` also "interned" the symbol into the current package.  We're going to
+ignore the package system for this post -- it's not important to the core
+topic.  We'll also ignore the property list.)
+
+So to recap:
+
+* `read` is a function that takes characters and turns them into objects.
+* When you feed `read` certain strings of characters (i.e. without special
+  characters), it returns a symbol object whose name matches those characters
+  (creating it if necessary).
+* Symbols are just normal Lisp data structures with a few specific fields.
+
+So what does `print` do with a symbol?  It just prints the `name`!
+
+    :::text
+    CL-USER> (print (read-from-string "cats"))
+
+    CATS
+
+Simple enough.  Note that this is *readably printed* -- if we feed it back to
+`read` we'll get an equivalent symbol back (the *exact same* symbol, in fact).
+
+But wait!  If we `read` symbols in and they're just normal vanilla structures,
+then how do we actually *do* anything in Lisp?  How does anything actually
+*run*?
+
+[clhs-symbol]:
+
+## Evaluating Lisp
+
+Now we need to go back to `eval`.  In the Python example I handwaved a bit and
+said that `eval` "ran the code" but didn't really explain how that happened.
+Now that we're in the warm embrace of Lisp we can nail things down a lot more
+cleanly.
+
+`eval` (in Lisp) is a function whose job is to:
+
+* Take an object.
+* Do something useful with it to get another object.
+* Return the resulting object.
+
+I know, I know, that seems as clear as a brick.  But if we can actually define
+what "something useful" means, it will be a lot more helpful.  Lisp's `eval`
+follows a few simple rules to produce its output.
+
+### Strings and Numbers
+
+First of all, `eval` will just pass numbers and strings straight through without
+touching them at all:
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ; ...
+        (t (error "What the hell is this?"))))
+
+This makes sense because there's not a lot more useful stuff you can do with
+a number or a string on its own.  I suppose you could write a language where the
+evaluation phase ran all numbers through the absolute value function because you
+wanted to remove all negativity from programming or something, but that seems
+a bit esoteric.
+
+### Basic Lists
+
+What about lists, like our friend the `if` statement (a list of four things)?
+Well the first thing anyone learns about Lisp is the funny syntax.  In most
+cases, when passed a list `eval` treats the first element as the name of
+a function, and the rest as the arguments.  It will evaluate the arguments and
+call the function with them:
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ((typep thing 'list)
+         (destructuring-bind (head . arguments)
+           (apply (symbol-function head)
+                  (mapcar #'eval arguments))))
+        ; ...
+        (t (error "What the hell is this?"))))
+
+At this point we know enough to understand exactly what happens when we say
+something like `(+ 1 2)` at a Lisp REPL!
+
+* Lisp `read`s the seven characters we gave it, which becomes a list of three
+  objects: the symbol `+`, the integer `1`, and the integer `2`.
+* Lisp passes that three-object list to `eval`.
+* `eval` looks up what is in the `function` slot of the symbol at the front of
+  the list (that symbol is `+`).
+* The symbol `+` was helpfully given a function by the Lisp implementation (run
+  `(symbol-function (read-from-string "+"))` to confirm that I'm not lying to
+  you).
+* Lisp evaluates each item in the rest of the list.  In this case they're each
+  a number, so each one evaluates to itself.
+* Lisp calls the function with the evaluated arguments to get a result (in this
+  case: the integer `3`).
+* Lisp passes the result to `print`, which turns it into a string of characters
+  (one character, in this case) and prints it to standard out.
+
+Whew!  That seems like a lot, but each piece is pretty small.  Go slowly through
+it and make sure you *really* understand each step in the preceding list before
+you move on.
+
+If you don't believe me that this is really how it works, try this:
+
+    :::text
+    CL-USER> (setf (symbol-function (read-from-string "square"))
+                   (lambda (x) (* x x)))
+
+    #<FUNCTION (LAMBDA (X)) {1005DCF2FB}>
+
+    CL-USER> (square 4)
+
+    16
+
+We just set the `function` slot of the symbol `square` to be a function that
+squares its argument.  We can then run it just like any other function.
+
+### Variables
+
+What about symbols that *aren't* at the front of a list?  For example:
+
+    :::text
+    CL-USER> (defvar ten 10)
+
+    10
+
+    CL-USER> (square ten)
+
+    100
+
+In this case, everything proceeds as it did before until the recursive call to
+`eval` the argument to `square`.  The argument is just a bare symbol (not
+a number or a list or a string), so we need another rule in our `eval` function:
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ((symbolp thing) (symbol-value thing))
+        ((listp thing)
+         (destructuring-bind (head . arguments)
+           (apply (symbol-function head)
+                  (mapcar #'eval arguments))))
+        ; ...
+        (t (error "What the hell is this?"))))
+
+If `eval` sees a symbol on its own, it will just return whatever is in the
+`value` slot of the symbol data structure.  Simple enough!
+
+### Special Forms
+If course there are a few things we haven't talked about yet.  First of all,
+there are some "magic" symbols that aren't functions, but are hardcoded into the
+evaluation function to do certain things.  In Common Lisp they're called
+["special operators"][special-operators] and there are twenty-five of them.
+
+Let's add a special rule to our `eval` function to handle one of them (our old
+friend `if`):
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ((symbolp thing) (symbol-value thing))
+        ((listp thing)
+         (destructuring-bind (head . arguments) thing
+           (cond
+             ((eq head (read-from-string "if")) ; NEW
+              (destructuring-bind (test then else) arguments
+                (if (eval test)
+                  (eval then)
+                  (eval else))))
+             (t
+              (apply (symbol-function head)
+                     (mapcar #'eval arguments))))))
+        ; ...
+        (t (error "What the hell is this?"))))
+
+At this point your head might be starting to hurt.  How have we just used `if`
+to define itself?  Is anything here actually *real*, or is Lisp just a serpent
+eating its own tail forever?
+
+### An Ouroborromean Paradox
+
+If the "metacircularity" (god, I hate that word) of what we've just done bothers
+you, remember that this `eval` function here is just a sample of how you *can*
+write it in Lisp.  You could also write it in some other language, like C or
+Java (see: [ABCL][]) or really anything.  That would let you "bottom out" to
+something that's not Lisp.
+
+But really, is that any *less* circular?  You still need *some* language to run
+the thing!  Why does an `eval` written in C seem less "cheaty" than one written
+in Lisp?  Because it compiles to machine code?  Try
+`(disassemble #'sb-impl::simple-eval-in-lexenv)` in SBCL and you'll get a few
+thousand bytes of x86 assembly that was [originally some Lisp code much like
+ours][sbcl-eval].
+
+My advice is to just let it flow over you, even if you feel a bit uneasy at the
+sight of something like this.  Eventually as you become more at home in your
+chosen Lisp, writing high-level code one moment and `disassemble`ing a function
+the next, you'll make peace with the serpent.
+
+### Exercises
+
+There are quite a few things in `eval` that we haven't covered here, like the
+other twenty-four special operators, lexical variables, lambda forms, and more.
+But this should be enough for you to wrap your head around symbols and how
+they're used in Lisp.
+
+Before we move on to the final part of this story, make sure you can figure out
+the answers to these questions:
+
+1. What does the single character `1` read as?  What does it evaluate to?
+2. What do the characters `"hello\"world"` read as?  What do they evaluate to?
+   How would the result be printed readably?  How would the result be printed
+   aesthetically?
+3. What do the characters `(list (+ 1 2) ten)` read as?  What does it evaluate
+   to?  How does each step of the evaluation work?  Trace out each of the calls
+   to `eval` on paper, with their arguments and their results.  What does the
+   final result print as?
+4. Use `symbol-function` to see what's in the function slot of the symbol `car`.
+   How does your Lisp print that thing?
+5. Try to look up the function slot of a symbol without a function.  What
+   happens?
+4. Use `symbol-function` to see what's in the function slot of the symbol `if`.
+   What happens?  Does that seem right?  If so, why?
+5. Go to the [HyperSpec page for `symbol`][clhs-symbol] and search for
+   "the consequences are undefined if".  Try all of those things and find out
+   what interesting things break.
+
+If you're not using Common Lisp, port these exercises to your language.
+
+[special-operators]:
+[ABCL]:
+[sbcl-eval]:
+[clhs-symbol]:
+
+## Quote
+
+Up to now we've been using the clumsy `(read-from-string "foo")` form to get
+ahold of symbols, but there's an easier way called "quoting".
+
+Quoting often confuses new Lisp programmers, and they'll end up randomly
+adding/removing quotes to something until it appears to do what they want.  Now
+that you've got `read`, `eval`, and `print` firmly separated in your mind,
+`quote` will be much easier to understand.
+
+### The Special Operator
+
+Quoting actually has two distinct components: one that affects `read` and one
+that affects `eval`.  In most books the distinction between these two only gets
+barely a sentence or two, which is probably why beginners are so confused.
+Let's look at the `eval` side of things first.
+
+`quote` is one of the twenty-five special operators.  All it does it pass its
+(single) argument back from `eval` *untouched*.  Normally the argument itself
+would have been evaluated, but `quote` prevents that:
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ((symbolp thing) (symbol-value thing))
+        ((listp thing)
+         (destructuring-bind (head . arguments) thing
+           (cond
+             ((eq head (read-from-string "if"))
+              (destructuring-bind (test then else) arguments
+                (if (eval test)
+                  (eval then)
+                  (eval else))))
+             ((eq head (read-from-string "quote")) ; NEW
+              (first arguments))
+             (t
+              (apply (symbol-function head)
+                     (mapcar #'eval arguments))))))
+        ; ...
+        (t (error "What the hell is this?"))))
+
+I know that seems strange, but let's look at it in action:
+
+    :::text
+    CL-USER> (+ 1 2)
+
+    3
+
+    CL-USER> (quote (+ 1 2))
+
+    (+ 1 2)
+
+Our first example is the same one we used in the previous section:
+
+* `read` takes in seven characters and returns a list of three items: the symbol
+  `+` and two numbers.
+* `eval` gets the three-element list, looks up the function for the symbol `+`,
+  recursively evaluates the arguments (which evaluate to themselves), and
+  applies the function.
+* `print` gets the integer `3`, turns it into ASCII, and writes it to the
+  terminal.
+
+The second example uses `quote`:
+
+* `read` takes in fifteen characters and returns a list of two items: the symbol
+  `quote` and a list of three items (the symbol `+` and two numbers).
+* `eval` gets the three element list and notices that the first element is the
+  special operator `quote`.  It simply returns the second element in the list
+  untouched, which in this case is a list of three items.
+* `print` gets that three element list, turns it into seven characters, and
+  writes it to the terminal.
+
+Just for fun, let's ride the serpent and use `quote` in the definition of
+itself inside `eval` to clean things up a bit:
+
+    :::lisp
+    (defun eval (thing)
+      (cond
+        ((numberp thing) thing)
+        ((stringp thing) thing)
+        ((symbolp thing) (symbol-value thing))
+        ((listp thing)
+         (destructuring-bind (head . arguments) thing
+           (cond
+             ((eq head (quote if))
+              (destructuring-bind (test then else) arguments
+                (if (eval test)
+                  (eval then)
+                  (eval else))))
+             ((eq head (quote quote)) ; WHAT IN THE
+              (first arguments))
+             (t
+              (apply (symbol-function head)
+                     (mapcar #'eval arguments))))))
+        ; ...
+        (t (error "What the hell is this?"))))
+
+That `(quote quote)` looks just completely bonkers, right?  Step through it
+piece by piece to figure it out:
+
+* What do the five characters `quote` read as?  What do they evaluate to?
+* What do the thirteen (spooky!) characters `(quote quote)` read as?  When
+  `eval` gets its hands on that, what clause in the `cond` does it hit?  What
+  does that return?
+
+Try these in a Lisp REPL if you're not 100% sure of the answers.  The functions
+`read-from-string`, `eval`, and `type-of` will come in handy.
+
+### The Read Macro
+
+Now let's look at the final piece of the puzzle.  `'` (that's a single ASCII
+quote character) is a [read macro][].  Don't worry if you don't know exactly
+what that means right now.  The important part is that when `read` sees a quote
+character, it returns a two-element list containing the symbol `quote` and the
+next form, which is read normally.
+
+The four characters `'foo` go into `read` and come out as a two-element list
+containing: the symbol `quote` and the symbol `foo`.  Again, because it's
+important, this happens *at read time*.  By the time `eval` sees anything, it's
+already been turned into that two-element list.
+
+It can be a bit slippery to get a grip on this, because `print` will "helpfully"
+print a list of `(quote whatever)` as `'whatever`!  But we can use trusty old
+`car` and `cdr` to see what's really going on:
+
+    :::lisp
+    CL-USER> (read-from-string "'foo")
+
+    'FOO
+
+    CL-USER> (type-of (read-from-string "'foo"))
+
+    CONS
+
+    CL-USER> (car (read-from-string "'foo"))
+
+    QUOTE
+
+    CL-USER> (cdr (read-from-string "'foo"))
+
+    (FOO)
+
+That's really all there is to it: `'form` is returned by `read` as `(quote
+form)`.  Note that `form` doesn't have to be a symbol:
+
+* `'(1 2 (4 5))` reads as `(quote (1 2 (4 5)))`
+* `'150` reads as `(quote 150)`
+* `'"hello"` reads as `(quote "hello")`
+* `'(if if if if)` reads as `(quote (if if if if))`
+
+[read macro]:
+
+### Exercises
+
+That's just about all there is to `quote`.  Once you've cleanly separate `read`,
+`eval`, and `print` in your mind it becomes a lot less mystical (or perhaps
+*more*, depending on your point of view).
+
+If you want to twist your brain just a bit more, try these exercises:
+
+1. What do the characters `foo` read in as?  What do they evaluate to?  How does it print?
+2. What do the characters `'foo` read in as?  What do they evaluate to?  How does it print?
+3. What do the characters `''foo` read in as?  What do they evaluate to?  How does it print?
+4. What do the characters `'1.0` read in as?  What do they evaluate to?  How does it print?
+5. What do the characters `'(1 2 3)` read in as?  How many elements are in the list?  What does it evaluate to?  How many elements are in *that* list?
+6. Update the `eval` function to use the `'` read macro instead of `quote`.
+7. Certain symbols like `t` and `nil` are special -- they evaluate to themselves much like numbers or strings.  Add this to `eval`.
+8. What do the characters `nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
+9. What do the characters `'nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
+10. What do the characters `''nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
+
+## A Quick Recap
+
+This post has gotten quite a bit longer than I intended.  Here are the main
+things I hope you've gotten from it:
+
+* `read`, `eval`, and `print` are three separate, distinct functions/phases in
+  the interpretation of Lisp (and really *all* (okay, *most*)) code.
+* Keeping these three phases clearly separated in your mind will make it easier
+  to understand symbols and quoting in Lisp.
+* Symbols are nothing magic, they're just a particular data structure with
+  a couple of fields.
+* The important thing is how `read` and `eval` *treat* symbols.  That's where
+  the magic happens.
+* `quote` is just a "short-circuit" in `eval` that turns out to be really
+  useful.
+* `'` is just a lazier way of writing `(quote ...)`.
+
+## Where to Go From Here
+
+Now that you've got a firmer grasp on what the hell symbols actually *are*,
+there are a lot of cool things you might want to check out.
+
+* Reread some of your introductory Lisp books again, seeing if things seem to
+  make a bit more sense now.
+* Read about backquote/quasiquote.  How does it compare to the `quote` we
+  explored here?
+* Read [Paradigms of Artificial Intelligence Programming][PAIP] and pay
+  attention to how Norvig uses symbols for various tasks.  Pay attention to what
+  he says about property lists, because we skipped them here.
+* Read the [guide to the Common Lisp package system][packaging] to learn all the
+  stuff I left out about the Common Lisp package system.
+* Gaze upon [this macro][symbolicate] in wonder (or horror).
+* Read the HyperSpec pages for things like `symbol` and `quote`.
+* Find and read your Lisp implementation's `eval` function.
+* Find and read *another* Lisp implementation's `eval` function.
+* Watch the [SICP Lectures][SICP] on YouTube.
+
+[PAIP]:
+[symbolicate]:
+[packaging]:
+[SICP]:
+
+    {% endblock article %}
author	Steve Losh <steve@stevelosh.com>
date	Sun, 26 Jun 2016 01:01:44 +0000
parents	2d9281a1f7e7
children	c24ab12c65fc
branches/tags	(none)
files	content/blog/2016/06/symbolic-computation.html