8d1cd7431860

Proofread
[view raw] [browse files]
author Steve Losh <steve@stevelosh.com>
date Tue, 28 Jun 2016 23:39:20 +0000
parents c24ab12c65fc
children af70a83b2de9
branches/tags (none)
files content/blog/2016/06/symbolic-computation.html

Changes

--- a/content/blog/2016/06/symbolic-computation.html	Mon Jun 27 15:39:58 2016 +0000
+++ b/content/blog/2016/06/symbolic-computation.html	Tue Jun 28 23:39:20 2016 +0000
@@ -9,25 +9,25 @@
     {% block article %}
 
 I've been reading a lot of Lisp books lately, some more advanced than others.
-All of the introductory books I've seen cover the idea of "Symbolic Computation"
-to some extent, but most of them breeze past it pretty quickly with just a few
-pages about "symbols" and "quote".
-
-For most programmers coming from non-Lisp languages, these are new, foreign
-concepts that don't really map back to anything in their previous experience.
-But the books seem to expect you to understand this things after just a couple
-of paragraphs.
+All of the introductory books I've seen cover the idea of symbolic computation,
+but most of them breeze past it with just a few pages about "symbols" and
+"quoting".  For many programmers coming from non-Lisp languages these are new,
+foreign concepts that don't really map back to anything in their previous
+experience.  But the books seem to expect you to understand this things after
+just a couple of paragraphs.
 
 One book that *does* spend more time on this is the appropriately-named [Common
 Lisp: A Gentle Introduction to Symbolic Computation][gentle].  If you're looking
 for a good introductory Lisp book, that's the one I'd recommend.  However, it's
-quite a long book and starts at the *very* basics.  If you already know the some
-Lisp then you might find yourself bored if you go through the entire thing just
-for the symbolic computation bits.
+quite a long book and starts at the *very* basics (you don't even write code for
+the first few chapters -- you program with drawings).  If you already know the
+some Lisp then you might find yourself bored if you go through the entire thing
+just for the symbolic computation bits.
 
-So this post will be an attempt to explain what symbols are, what quoting does,
-and why it's all interesting and useful.  If you already know what something
-like this would print, you'll be bored by this post:
+This post is an attempt to explain what symbols *actually are* and what quoting
+does.  You'll get the most out of it if you're already familiar with the basics
+of Lisp (i.e. you aren't scared by the parentheses).  But if you already know
+what something like this would print, you'll probably be bored:
 
     :::text
     (let ((foo ''a))
@@ -42,14 +42,16 @@
 
 ## Disclaimer
 
-Before we start, I'll get this warning out of the way: this post is going to
-gloss over a lot of gory details to get at the important core ideas.
+Before we start I'll get this warning out of the way: I'm going to gloss over
+a lot of gory details to get at the important core ideas.
 
 When I say "this thing is implemented like so" it's safe to assume that in
-reality things are messier and more complicated and will vary between dialects
-like Common Lisp, Scheme, and Clojure.  My goal is to give you some intuition
-for the basic ideas, not to exhaustively cover how a particular implementation
-works.
+reality things are messier and more complicated and will vary between
+implementations and dialects.  My goal is to give you some intuition for the
+basic ideas, not to exhaustively cover how a particular implementation works.
+
+I'll be using Common Lisp for the examples, but the concepts apply equally to
+Scheme, Clojure, and other Lisps too.
 
 With that out of the way, let's dive in!
 
@@ -85,8 +87,8 @@
 until you tell it to quit).
 
 There's not a lot to say here, except that we'll go ahead and lump together the
-extra busywork that a REPL needs to do into this step so we can ignore it for
-the rest of the post.  Things like:
+extra busywork a REPL needs to do into this step so we can ignore it for the
+rest of the post.  This includes things like:
 
 * Printing the nice prompt at the beginning (`>>>` or `...` in Python).
 * Checking for `Ctrl-D` to quit.
@@ -94,7 +96,7 @@
 
 ## Print
 
-The next letter is "P" for "print".  Like loop, this should be familiar
+The next letter is "P" for "Print".  Like `loop` this should be familiar
 territory, but let's nail down *exactly* what it does.  The job of the `print`
 function is to:
 
@@ -118,7 +120,7 @@
 
 ## Read
 
-We'll jump over to the other side of the acronym now.  "R" is for "read", and
+We'll jump over to the other side of the acronym now.  "R" is for "Read", and
 this is where things start to get tricky.
 
 In most non-Lisp languages the "read" and "eval" phases of the REPL tend to get
@@ -137,20 +139,23 @@
 * `print`: Object → Characters
 * `read`: Characters → Object
 
-If we `read` in the string `1.0` (that's three ASCII characters), we'll get some
-kind of floating point number object back.
+This is the second definition you need to fix in your brain:
+
+**`print` takes an object, converts it into a string, and outputs it.**  
+**`read` takes a string, converts it into an object, and returns it.**
 
-Sometimes there are many ways to represent the same object in memory.  For
-example: `read`ing the strings `1.0`, and `1.0000` will (in just about any sane
-language) both produce equivalent floating point objects representing the number
-"one".
+If we `read` in the string `1.0` (that's three ASCII characters), we'll get some
+kind of floating point number object back.  Sometimes there are many ways to
+represent the same object in memory.  For example: `read`ing the strings `1.0`
+and `1.0000` will both produce equivalent floating point objects representing
+the number "one".
 
 Side note (if you're not feeling pedantic feel free to skip this): the word
 "equivalent" here is a bit tricky.  When you `read` two strings like `1.0` and
 `1.000` you might get back two pointers to the same hunk of memory, or you might
 get back two pointers to two separate hunks of memory, or even just a machine
 word representing the floating point number directly.  It depends on the
-language (or even the implementation).  What I mean by "equivalent" is that
+language (and even the implementation).  What I mean by "equivalent" is that
 there's no way to tell which value came from which input.  You can't say "this
 hunk of memory must have come from the string `1.000`" because `read` has
 "sanitized" the input by that point.
@@ -159,8 +164,8 @@
 
 So now we've got `read`, `print`, and `loop`.  `read` takes characters from
 a stream and produces objects, and `print` takes objects and produces characters
-and writes them to a stream.  What would happen if we just hook them up together
-right now?
+and writes them to a stream.  We're still missing a letter, but what would
+happen if we just hook these three up together right now?
 
     :::text
     CL-USER> (loop (print (read)))
@@ -176,13 +181,13 @@
 
 Essentially what we've got here is a little pretty-printer!  Notice how things
 like extra whitespace and useless zeroes got stripped out.  This isn't terribly
-useful, but at least it's doing something.
+useful, but at least it's doing *something*.
 
-However, it's important to realize where the "prettifying" actually happened.
+However, it's important to realize where the "prettifying" actually happened:
 `read` actually did the hard work!  By the time `print` got its grubby little
-paws on those floating point one-point-zeroes the extra whitespace and digits
-were long gone.  Maybe it would be more accurate to give credit where it's due
-and call this a "pretty-reader".
+paws on those one-point-zeroes the extra whitespace and digits were long gone.
+Maybe we should give credit where it's due and call this a "pretty-reader"
+instead.
 
 ## Reading Other Data
 
@@ -192,16 +197,16 @@
 
     :::text
     CL-USER> (print (read))
-    > ("hello" "world")
+    > ("hello"       "world")
     ("hello" "world")
 
     $ python
-    >>> ["hello", "world"]
+    >>> [    "hello"  , "world"]
     ['hello', 'world']
 
-There are lots of other kinds of data you can read and print, which of course is
-highly dependent on the particular language.  What's important is the strict
-separation of `read` and `print`.
+There are lots of other kinds of things you can read and print, depending on the
+particular language.  What's important here is the strict separation of `read`
+and `print`.
 
 ## Readable Printing
 
@@ -224,20 +229,22 @@
 
 * Produce some series of characters (a string) that represent this object.
 
-But there are often many ways we can represent a particular object as a series
-of characters.  For example: we can represent the number two as `2`, or `10` in
-binary.  We can represent a string as the actual characters in it, or we could
-represent it "escaped" like Python does in the first line.
+Much like there are many character strings that result in the same object (e.g.
+`1.0` and `1.000`) there are often many ways we can represent a particular
+object as a series of characters.  For example: we can represent the number two
+as `2`, or `10` in binary, or `two` in English.  We can represent a string as
+the actual characters in it, or we could represent it "escaped" like Python does
+in the first line.
 
-The core idea here is that of "readable printing", or *printing the characters
-you would need to give to `read` to produce this object*.
+The core idea here is that of **"readable printing": printing the characters you
+would need to give to `read` to produce this object**.
 
-This is different from what I'll call "aesthetic printing", which is *printing
-characters for this object that look nice to a human*.
+This is different from what I'll call **"aesthetic printing": printing
+characters for this object which look nice to a human**.
 
 In Python these two kinds of printing are seen in the separate [`__str__` and
 `__repr__` methods][str_repr].  In Common Lisp it's controlled by the
-[`*print-readably*` variable][print-readably].  Other language have different
+[`*print-readably*` variable][print-readably].  Other languages have different
 mechanisms.
 
 [str_repr]: https://brennerm.github.io/posts/python-str-vs-repr.html
@@ -245,8 +252,8 @@
 
 ## Reading Non-Lisp
 
-So we know how reading and printing data works, but you can do more than that at
-a typical REPL.  Let's revisit the first Python example:
+So we know how reading and printing *data* works, but you can do more than that
+at a typical REPL.  Let's revisit the first Python example:
 
     :::text
     $ python
@@ -282,7 +289,7 @@
 
 ## Evaluating Non-Lisp
 
-The "E" in REPL stands for "eval".
+The "E" in REPL stands for "Eval".
 
 This is the last time we'll consider Python in this post, because we're about to
 move past it to more powerful ideas.  In Python, the purpose of the `eval` phase
@@ -302,23 +309,25 @@
 
 ## Reading Lisp
 
-When we talked about reading in the Python `if` statement I said that Python
-would read/parse it into some kind of data structure representing an AST node.
-I didn't point you at the *actual* data structure in the Python source code
-because I have no idea where it actually *is*, and it's probably not nearly as
-simple as the one I sketched out.
+When we talked about reading the Python `if` statement I said that Python would
+parse it into some kind of data structure representing an AST node.  I didn't
+point you at the *actual* data structure in the Python source code because
+I have no idea where it actually *is*, and it's probably not nearly as simple as
+the one I sketched out.
 
-Now that we're talking about Lisp, however, I really *can* show you exactly what
+Now that we're talking about Lisp, however, I *really can* show you exactly what
 an `if` statement in Lisp reads as!  I'll use `read-from-string` to make it
 a bit easier to see what's input and what's output, but it would work the same
 way with just plain old `read` and input from standard in.  Here we go:
 
     :::text
-    CL-USER> (print (read-from-string "(if t (print 1) (print 2))"))
+    CL-USER> (print
+               (read-from-string "(if t (print 1) (print 2))"))
 
     (IF T (PRINT 1) (PRINT 2))
 
-    CL-USER> (type-of (read-from-string "(if t (print 1) (print 2))"))
+    CL-USER> (type-of
+               (read-from-string "(if t (print 1) (print 2))"))
 
     CONS
 
@@ -341,9 +350,9 @@
 The string `100` does indeed represent an integer between zero and four
 squillion-something.
 
-But what are the other things in our `if` statement's list?  The `if` and `t`
-and `print`?  Are they some magic keywords internal to the guts of our Lisp that
-`print` is just outputting nicely for our benefit?
+But what are those *other* things in our list?  The `if` and `t` and `print`?
+Are they some magic keywords internal to the guts of our Lisp that `print` is
+just outputting nicely for our benefit?
 
     :::text
     CL-USER> (type-of (read-from-string "if"))
@@ -354,8 +363,8 @@
 
 ## Symbols
 
-A symbol in Lisp is [a pretty basic data structure][clhs-symbol] that turns out
-to be *extremely* useful.  You can imagine it to be defined something like this:
+A symbol in Lisp is [a simple data structure][clhs-symbol] that turns out to be
+*extremely* useful.  You can imagine it being defined something like this:
 
     :::text
     (defstruct symbol
@@ -382,7 +391,8 @@
 existing symbol:
 
     :::text
-    CL-USER> (eq (read-from-string "cats") (read-from-string "cats"))
+    CL-USER> (eq (read-from-string "cats")
+                 (read-from-string "cats"))
 
     T
 
@@ -434,7 +444,7 @@
 * Return the resulting object.
 
 I know, I know, that seems as clear as a brick.  But if we can actually define
-what "something useful" means, it will be a lot more helpful.  Lisp's `eval`
+what "something useful" means it will be a lot more helpful.  Lisp's `eval`
 follows a few simple rules to produce its output.
 
 ### Strings and Numbers
@@ -470,7 +480,7 @@
         ((numberp thing) thing)
         ((stringp thing) thing)
         ((typep thing 'list)
-         (destructuring-bind (head . arguments)
+         (destructuring-bind (head . arguments) thing
            (apply (symbol-function head)
                   (mapcar #'eval arguments))))
         ; ...
@@ -510,8 +520,10 @@
 
     16
 
-We just set the `function` slot of the symbol `square` to be a function that
-squares its argument.  We can then run it just like any other function.
+We set the `function` slot of the symbol `square` to be a function that squares
+its argument, and then we can run it like any other function.
+
+Trace through the evaluation of that last `(square 4)` call before moving on.
 
 ### Variables
 
@@ -535,7 +547,7 @@
       (cond
         ((numberp thing) thing)
         ((stringp thing) thing)
-        ((symbolp thing) (symbol-value thing))
+        ((symbolp thing) (symbol-value thing)) ; NEW
         ((listp thing)
          (destructuring-bind (head . arguments)
            (apply (symbol-function head)
@@ -547,7 +559,7 @@
 `value` slot of the symbol data structure.  Simple enough!
 
 ### Special Forms
-If course there are a few things we haven't talked about yet.  First of all,
+If course there are a few things we haven't talked about yet.  First of all:
 there are some "magic" symbols that aren't functions, but are hardcoded into the
 evaluation function to do certain things.  In Common Lisp they're called
 ["special operators"][special-operators] and there are twenty-five of them.
@@ -579,7 +591,7 @@
 to define itself?  Is anything here actually *real*, or is Lisp just a serpent
 eating its own tail forever?
 
-### An Ouroborromean Paradox
+### An Ouroborromean Paradox?
 
 If the "metacircularity" (god, I hate that word) of what we've just done bothers
 you, remember that this `eval` function here is just a sample of how you *can*
@@ -625,7 +637,7 @@
    What happens?  Does that seem right?  If so, why?
 5. Go to the [HyperSpec page for `symbol`][clhs-symbol] and search for
    "the consequences are undefined if".  Try all of those things and find out
-   what interesting things break.
+   what interesting things break in your Lisp.
 
 If you're not using Common Lisp, port these exercises to your language.
 
@@ -753,7 +765,7 @@
 
 The four characters `'foo` go into `read` and come out as a two-element list
 containing: the symbol `quote` and the symbol `foo`.  Again, because it's
-important, this happens *at read time*.  By the time `eval` sees anything, it's
+important: this happens *at read time*.  By the time `eval` sees anything it's
 already been turned into that two-element list.
 
 It can be a bit slippery to get a grip on this, because `print` will "helpfully"
@@ -778,7 +790,10 @@
     (FOO)
 
 That's really all there is to it: `'form` is returned by `read` as `(quote
-form)`.  Note that `form` doesn't have to be a symbol:
+form)`.  Then that gets passed along to `eval`, which sees the `quote` special
+operator and passes the argument back untouched.
+
+Note that `form` doesn't have to be a symbol:
 
 * `'(1 2 (4 5))` reads as `(quote (1 2 (4 5)))`
 * `'150` reads as `(quote 150)`
@@ -789,7 +804,7 @@
 
 ### Exercises
 
-That's just about all there is to `quote`.  Once you've cleanly separate `read`,
+That pretty much wraps up `quote`.  Once you've cleanly separated `read`,
 `eval`, and `print` in your mind it becomes a lot less mystical (or perhaps
 *more*, depending on your point of view).
 
@@ -800,21 +815,22 @@
 3. What do the characters `''foo` read in as?  What do they evaluate to?  How does it print?
 4. What do the characters `'1.0` read in as?  What do they evaluate to?  How does it print?
 5. What do the characters `'(1 2 3)` read in as?  How many elements are in the list?  What does it evaluate to?  How many elements are in *that* list?
-6. Update the `eval` function to use the `'` read macro instead of `quote`.
-7. Certain symbols like `t` and `nil` are special -- they evaluate to themselves much like numbers or strings.  Add this to `eval`.
+6. Update our implementation of `eval` to use the `'` read macro instead of `quote`.
+7. Certain symbols like `t` and `nil` are special -- they evaluate to themselves like numbers or strings.  Add this to `eval`.
 8. What do the characters `nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
 9. What do the characters `'nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
 10. What do the characters `''nil` read in as?  What is the type of that object?  What do they evaluate to?  What type is that result?
 
 ## A Quick Recap
 
-This post has gotten quite a bit longer than I intended.  Here are the main
-things I hope you've gotten from it:
+If you've gotten this far and understood everything, you should have a good
+grasp of how symbolic computation works.  Here are the main things I hope you've
+taken away from this post:
 
 * `read`, `eval`, and `print` are three separate, distinct functions/phases in
   the interpretation of Lisp (and really *all* (okay, *most*)) code.
-* Keeping these three phases clearly separated in your mind will make it easier
-  to understand symbols and quoting in Lisp.
+* Keeping these three phases *clearly separated* in your mind will make it
+  easier to understand symbols and quoting in Lisp.
 * Symbols are nothing magic, they're just a particular data structure with
   a couple of fields.
 * The important thing is how `read` and `eval` *treat* symbols.  That's where
@@ -826,15 +842,16 @@
 ## Where to Go From Here
 
 Now that you've got a firmer grasp on what the hell symbols actually *are*,
-there are a lot of cool things you might want to check out.
+there are a lot of cool things you might want to check out that will show you
+what you can do with them:
 
 * Reread some of your introductory Lisp books again, seeing if things seem to
   make a bit more sense now.
 * Read about backquote/quasiquote.  How does it compare to the `quote` we
   explored here?
 * Read [Paradigms of Artificial Intelligence Programming][PAIP] and pay
-  attention to how Norvig uses symbols for various tasks.  Pay attention to what
-  he says about property lists, because we skipped them here.
+  attention to how Norvig uses symbols for various tasks.  Also pay attention to
+  what he says about property lists, because we skipped them here.
 * Read the [guide to the Common Lisp package system][packaging] to learn all the
   stuff I left out about the Common Lisp package system.
 * Gaze upon [this macro][symbolicate] in wonder (or horror).