c467a9d32cdd

Add playing with syntax entry
[view raw] [browse files]
author Steve Losh <steve@stevelosh.com>
date Mon, 15 Aug 2016 20:55:05 +0000
parents 22d820814a12
children ac43706c95d0
branches/tags (none)
files content/blog/2016/08/playing-with-syntax.html content/links.html layout/_post.html media/css/pygments-clean.css

Changes

--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2016/08/playing-with-syntax.html	Mon Aug 15 20:55:05 2016 +0000
@@ -0,0 +1,493 @@
+    {% extends "_post.html" %}
+
+    {% hyde
+        title: "Playing With Syntax"
+        snip: "Lisp lets you evolve your language."
+        created: 2016-08-16 13:45:00
+    %}
+
+    {% block article %}
+
+One of the things I love about Lisp is that it gives you the ability to change
+and mold the syntax of the language to what you need.  In this post I want to
+look at the evolution of a little macro I've been playing around with for
+a while now.
+
+Mutation is a fundamental concept in most programming languages.  Functional
+programming may be beautiful, but mutation is still useful (and fast).  In most
+languages assignment is done with an `=` or `:=` operator:
+
+    :::lisp
+    x = 10
+
+In Common Lisp the operator is named `setf` instead of `=` for historical
+reasons, and of course it's used in prefix notation:
+
+    :::lisp
+    (setf x 10)
+
+Aside from the prefix ordering, Common Lisp's syntax is already a bit more
+elegant because you can set arbitrarily many variables without repeating the
+assignment operator over and over again:
+
+    :::lisp
+    ; Have to type out `=` three times
+    x = 10
+    y = 20
+    z = 30
+
+    ; But only one `setf` required
+    (setf x 10
+          y 20
+          z 30)
+
+`setf` assigns values one-by-one, in-order.  Common Lisp also has `psetf` which
+is "parallel `setf`".  This evaluates all the values first, *then* sets all the
+variables at once:
+
+    :::lisp
+    (let ((a 10) (b 20))
+      (setf a 50
+            b (+ 1 a))   ; => (50 51)
+
+      (psetf a 90
+             b (+ 1 a))) ; => (90 52)
+
+Assignment is nice, but often you want to change the value of a variable by
+taking its *current* value and computing something with it.  If you're making
+a ball that travels across the screen you might have something like this:
+
+    :::text
+    def move_ball(ball, distance):
+        ball.x = ball.x + distance
+
+    (defun move-ball (ball distance)
+      (setf (slot-value ball 'x)
+            (+ (slot-value ball 'x)
+                distance)))
+
+If you find `slot-value` too wordy to type, you could make a macro to get rid of
+it (and save yourself a quote too).  `.` is already used for something else so
+you'd need to pick a different character:
+
+    :::lisp
+    (defmacro $ (instance slot-symbol)
+      `(slot-value ,instance ',slot-symbol))
+
+    (defun move-ball (ball distance)
+      (setf ($ ball x)
+            (+ ($ ball x)
+                distance)))
+
+In practice, though, you rarely use `slot-value`.  Instead you usually use
+accessor functions:
+
+    :::lisp
+    (defun move-ball (ball distance)
+      (setf (ball-x ball)
+            (+ (ball-x ball)
+                distance)))
+
+Anyway, back to assignment.  What we wrote above *works*, but it's a bit
+annoying to have to type the "place" that we're working with twice.  To cut down
+on this repetition many languages offer operators like `+=` and `++`:
+
+    :::text
+    x = x + 10
+    ⬇
+    x += 10
+
+    y = y + 1
+    ⬇
+    y++
+
+Common Lisp's version of these is called `incf`.
+
+    :::lisp
+    (incf x 10)
+    (incf y)
+
+Notice how the s-expression syntax means we can use the same operator for both
+of the forms, instead of needing separate `+=` and `++` operators.
+
+Other languages often have similar "in-place" operators for other numeric
+operations like `-=`, `--`, `*=`, and so on.  Common Lisp has `decf` for
+subtraction, but doesn't have versions for multiplication or division built in.
+But we can add them with a few macros:
+
+    :::lisp
+    (defmacro mulf (place factor)
+      `(setf ,place (* ,place ,factor)))
+
+    (defmacro divf (place divisor)
+      `(setf ,place (/ ,place ,divisor)))
+
+Unfortunately these are not quite right -- we've committed the cardinal
+macro-writing sin of splicing in the `place` expression multiple times.  If
+`place` has side effects they'll happen more times than expected.
+
+Luckily, Common Lisp has a macro that wraps up the boring process of handling
+this for us.  `define-modify-macro` lets us fix the problem:
+
+    :::lisp
+    (define-modify-macro mulf (factor) *)
+    (define-modify-macro divf (divisor) /)
+
+Now we've got versions of `*=` and `/=` in Common Lisp.  While we're here we
+should do things *right* and allow `divf` to be used with just a place, which
+will mean "set `place` to `1/place`":
+
+    :::lisp
+    (define-modify-macro divf (&optional divisor)
+      (lambda (value divisor)
+        (if (null divisor)
+          (/ 1 value)
+          (/ value divisor))))
+
+    (let ((x 10))
+      (divf x 2) ; => 5
+      (divf x)   ; => 1/5
+      (divf x))  ; => 5
+
+Note that we don't really need the `1` in `(/ 1 value)`, because Common Lisp's
+`/` works the same way when you call it with a single argument.  I'm not sure
+what behavior would make sense for a unary `(mulf place)`, so let's ignore that
+one and move on.
+
+So far we've been talking about the idea of setting a variable to the result of
+computing some function on its current value.  We used `define-modify-macro` to
+define some support for extra functions, but defining a mutator macro for every
+function we might possibly want to use could get tedious.  What if we generalize
+this a bit more?
+
+    :::lisp
+    (define-modify-macro callf (function)
+      (lambda (value function)
+        (funcall function value)))
+
+    (let ((x -1))
+      (callf x #'abs)   ; => 1
+      (callf x #'1+)    ; => 2
+      (callf x #'sin)   ; => 0.9092974
+      (callf x #'round) ; => 1
+      (callf x #'-))    ; => -1
+
+Now we've got something a bit higher-level.  With `callf` we can mutate
+a variable with a unary function.  But what about functions that need more
+arguments?  Again, Common Lisp does the right thing and handles `&rest`
+parameters correctly in `define-modify-macro`:
+
+    :::lisp
+    (define-modify-macro callf (function &rest args)
+      (lambda (value function &rest args)
+        (apply function value args)))
+
+    (let ((x -1))
+      (callf x #'abs)    ; => 1
+      (callf x #'+ 10)   ; => 11
+      (callf x #'* 2 4)) ; => 88
+
+    (let ((foo (list 0 1 2 3)))
+      (callf foo #'reverse)
+      ; => (3 2 1 0)
+
+      (callf foo #'append (list -1 -2 -3)))
+      ; => (3 2 1 0 -1 -2 -3)
+
+That's better.  Now we can mutate a variable by applying a function to its
+current value and some other arguments.  This might come in handy when we don't
+want to bother defining a full-blown modification macro just to use it once or
+twice.
+
+At this point we've got something similar to Michael Malis' [zap][] macro.
+A minor difference is for `zap` the order of the function and the place are
+reversed.  I believe his reason was that it makes the resulting form look more
+like the function call that is conceptually being made:
+
+    :::lisp
+    (zap     #'+  x  5)
+    ;          ↓  ↓  ↓
+    (setf x (  +  x  5  ))
+
+Unfortunately, because the place is no longer the first argument we can't use
+`define-modify-macro` to handle the single-evaluation boilerplate for us any
+more.  Instead we need to use `get-setf-expander` and work with its results.
+Malis' excellent [Getting Places][] article explains how that works, and if you
+want to fully understand the rest of the code in this post you should read that
+one first.
+
+The difference between `callf` and Malis' `zap` are just cosmetic.  But they're
+still not completely flexible.  What if we want to call a function, but we need
+the value to be an argument other than the first?  Of course we could use
+a `lambda`:
+
+    :::lisp
+    (let ((x (list :coffee :soda :tea)))
+      (callf x (lambda (l) (remove :soda l))))
+
+
+But that is *really* ugly.  It would be nice if we had a way to specify which
+argument to the function should be the current value.  This is Lisp, so we can
+pick whatever syntax we want.  What should we choose?  Often a good first step
+is to look at how other languages do things.  Clojure's anonymous function
+syntax is one possibility:
+
+    :::lisp
+    (map #(+ 10 %) [1 2 3])
+    ; => (11 12 13)
+
+Clojure uses the magic symbol `%` as a placeholder for the argument.  We can
+modify Malis' `zap` macro to do the same.  We'll call it `zapf`, and we'll swap
+the place back into the first argument like all the other modify macros, since
+the aesthetic reason no longer holds.  The changes are in bold:
+
+<div class="codehilite"><pre>
+; Original version
+(defmacro zap (fn place &rest args)
+  (multiple-value-bind
+        (temps exprs stores store-expr access-expr)
+      (get-setf-expansion place)
+    `(let* (,@(mapcar #'list temps exprs)
+            (,(car stores)
+              (funcall ,fn ,access-expr ,@args)))
+       ,store-expr)))
+
+; New version
+(defmacro <b>zapf</b> (<b>place fn</b> &rest args)
+  (multiple-value-bind
+        (temps exprs stores store-expr access-expr)
+      (get-setf-expansion place)
+    `(let* (,@(mapcar #'list temps exprs)
+            (,(car stores)
+             (funcall ,fn <b>,@(substitute access-expr '% args)</b>)))
+      ,store-expr)))
+</pre></div>
+
+And now we can use `%` to say where the place's value should go:
+
+    :::lisp
+    (let ((x (list :coffee :soda :tea)))
+      (zapf x #'remove :soda %)
+      ; => (:coffee :tea)
+
+      (zapf x #'cons :water %)
+      ; => (:water :coffee :tea)
+
+      (zapf x #'append % (list :cocoa :chai)))
+      ; => (:water :coffee :tea :cocoa :chai)
+
+Note that `substitute` will replace *all* occurrences of `%` with the access
+expression, so we can do something like `(zapf x #'append % %)` if we want.
+
+Some people will find this new version unpleasant, because it effectively
+captures the `%` variable.  It's not hygienic, by design.  I personally don't
+mind it, and I like the brevity it allows.
+
+We're almost done, but I want to push things just a *bit* further.  Let's
+revisit the original example of the ball moving across the screen:
+
+    :::text
+    def move_ball(ball, distance):
+        ball.x = ball.x + distance
+
+    (defun move-ball (ball distance)
+      (setf (ball-x ball)
+            (+ (ball-x ball) distance)))
+
+Of course we can simply use `+=` or `incf` here:
+
+    :::text
+    def move_ball(ball, distance):
+        ball.x += distance
+
+    (defun move-ball (ball distance)
+      (incf (ball-x ball) distance))
+
+Suppose we also wanted to make the ball wrap around the screen when it flies off
+one side.  We can do this using modulus:
+
+    :::text
+    def move_ball(ball, distance, screen_width):
+        ball.x += distance
+        ball.x %= screen_width
+
+    (defun move-ball (ball distance screen-width)
+      (incf (ball-x ball) distance)
+      (callf (ball-x ball) #'mod % screen-width))
+
+We could define `modf` if we wanted to make that second call a bit nicer.  But
+let's take a moment to reflect.  Notice how we're back to having to type out
+`(ball-x ball)` twice again.  It's better than typing it four times, but can we
+do better?
+
+`zapf` currently takes a place, a function, and a list of arguments.  What if we
+made it more like `setf`, and just have it take a place and an expression?  We
+could still use `%` as a placeholder, but could let it appear anywhere in the
+(possibly nested) expression form.
+
+One way to do this would be to simply replace `substitute` with `subst` in the
+`zapf` code.  (The difference between these extremely-confusingly-named
+functions is that `substitute` replaces element of a sequence, and `subst`
+replaces elements of a tree.)
+
+This, however, is the wrong strategy.  Blindly replacing `%` everywhere in the
+expression will work for many cases, but will break in edge cases like (god help
+you) nested `zapf` forms.  We could try to walk the code of the expression we
+get, but down this path lies madness and implementation-specificity.
+
+The solution is much, much simpler.  We can just use a normal `let` to capture
+`%` in the expression:
+
+<div class="codehilite"><pre>
+; Final version
+(defmacro <b>zapf</b> (<b>place expr</b>)
+  (multiple-value-bind
+        (temps exprs stores store-expr access-expr)
+      (get-setf-expansion place)
+    `(let* (,@(mapcar #'list temps exprs)
+            (,(car stores)
+             <b>(let ((% ,access-expr))
+               ,expr)</b>))
+      ,store-expr)))
+</pre></div>
+
+And now `move-ball` only requires the bare minimum of typing:
+
+    :::lisp
+    (defun move-ball (ball distance screen-width)
+      (zapf (ball-x ball)
+            (mod (+ distance %) screen-width)))
+
+For completeness it would be nice to allow `zapf` to take any number of
+place/value pairs, just like `setf`.  I'll leave this as an exercise for you to
+try if you're interested.
+
+It took me a while to arrive at this form of the macro.  I personally find it
+lovely and readable.  If you disagree, that's fine too!  Lisp is a wonderful
+substrate for building a language that fits how you think and talk about
+problems.  Play around with your own syntax and find out what feels most natural
+for you.
+
+Before we leave, let's just ponder one more thing.  Performance is often ignored
+these days, but what if we care about making the computer go fast?  The final
+`zapf` macro is handy and expressive, but what cost have we paid for this
+abstraction?
+
+First let's tell SBCL that we want to go fast and return to our original `setf`
+strategy:
+
+    :::lisp
+    (deftype nonnegative-fixnum ()
+      `(integer 0 ,most-positive-fixnum))
+
+    (deftype positive-fixnum ()
+      `(integer 1 ,most-positive-fixnum))
+
+    (defstruct ball
+      (x 0 :type nonnegative-fixnum)
+      (y 0 :type nonnegative-fixnum))
+
+    (declaim (ftype (function (ball fixnum positive-fixnum))
+                    move-ball))
+
+    (defun move-ball (ball distance screen-width)
+      (declare (optimize (speed 3) (safety 0) (debug 0)))
+      (setf (ball-x ball)
+            (mod (+ distance (ball-x ball))
+                 screen-width)))
+
+How does that turn out?
+
+    :::text
+    ; disassembly for MOVE-BALL
+    ; Size: 82 bytes. Origin: #x1005E5E12E
+    ; 2E:       488B410D         MOV RAX, [RCX+13]       ; no-arg-parsing entry point
+    ; 32:       48D1F8           SAR RAX, 1
+    ; 35:       498D3C00         LEA RDI, [R8+RAX]
+    ; 39:       488BDE           MOV RBX, RSI
+    ; 3C:       48D1FB           SAR RBX, 1
+    ; 3F:       4885DB           TEST RBX, RBX
+    ; 42:       7431             JEQ L3
+    ; 44:       488BC7           MOV RAX, RDI
+    ; 47:       4899             CQO
+    ; 49:       48F7FB           IDIV RAX, RBX
+    ; 4C:       4C8BC0           MOV R8, RAX
+    ; 4F:       488BC2           MOV RAX, RDX
+    ; 52:       48D1E0           SHL RAX, 1
+    ; 55:       4885C0           TEST RAX, RAX
+    ; 58:       7510             JNE L2
+    ; 5A: L0:   488BD8           MOV RBX, RAX
+    ; 5D: L1:   4889590D         MOV [RCX+13], RBX
+    ; 61:       488BD3           MOV RDX, RBX
+    ; 64:       488BE5           MOV RSP, RBP
+    ; 67:       F8               CLC
+    ; 68:       5D               POP RBP
+    ; 69:       C3               RET
+    ; 6A: L2:   4885FF           TEST RDI, RDI
+    ; 6D:       7DEB             JNL L0
+    ; 6F:       488D1C30         LEA RBX, [RAX+RSI]
+    ; 73:       EBE8             JMP L1
+    ; 75: L3:   0F0B0A           BREAK 10                ; error trap
+    ; 78:       07               BYTE #X07
+    ; 79:       08               BYTE #X08               ; DIVISION-BY-ZERO-ERROR
+    ; 7A:       FE9E03           BYTE #XFE, #X9E, #X03   ; RDI
+    ; 7D:       FE9E01           BYTE #XFE, #X9E, #X01   ; RBX
+
+Well that's not *too* bad.  I'm not sure why SBCL still performs a check for
+division by zero for the `mod` even though we said that `screen-width` can't
+possibly be zero.  But hey, we're at a manageable amount of assembly, which is
+pretty nice for a high-level language!
+
+Now for the `zapf` version:
+
+    :::lisp
+    (defun move-ball (ball distance screen-width)
+      (declare (optimize (speed 3) (safety 0) (debug 0)))
+      (zapf (ball-x ball)
+            (mod (+ distance %) screen-width)))
+
+Okay, let's bite the bullet.  How bad are we talking?
+
+    :::text
+    ; disassembly for MOVE-BALL
+    ; Size: 70 bytes. Origin: #x1005F47C7B
+    ; 7B:       488B410D         MOV RAX, [RCX+13]       ; no-arg-parsing entry point
+    ; 7F:       48D1F8           SAR RAX, 1
+    ; 82:       488D3C03         LEA RDI, [RBX+RAX]
+    ; 86:       4885F6           TEST RSI, RSI
+    ; 89:       742B             JEQ L2
+    ; 8B:       488BC7           MOV RAX, RDI
+    ; 8E:       4899             CQO
+    ; 90:       48F7FE           IDIV RAX, RSI
+    ; 93:       488BD8           MOV RBX, RAX
+    ; 96:       488D0412         LEA RAX, [RDX+RDX]
+    ; 9A:       4885C0           TEST RAX, RAX
+    ; 9D:       750D             JNE L1
+    ; 9F: L0:   48D1E2           SHL RDX, 1
+    ; A2:       4889510D         MOV [RCX+13], RDX
+    ; A6:       488BE5           MOV RSP, RBP
+    ; A9:       F8               CLC
+    ; AA:       5D               POP RBP
+    ; AB:       C3               RET
+    ; AC: L1:   4885FF           TEST RDI, RDI
+    ; AF:       7DEE             JNL L0
+    ; B1:       4801F2           ADD RDX, RSI
+    ; B4:       EBE9             JMP L0
+    ; B6: L2:   0F0B0A           BREAK 10                ; error trap
+    ; B9:       07               BYTE #X07
+    ; BA:       08               BYTE #X08               ; DIVISION-BY-ZERO-ERROR
+    ; BB:       FE9E03           BYTE #XFE, #X9E, #X03   ; RDI
+    ; BE:       FE1E03           BYTE #XFE, #X1E, #X03   ; RSI
+
+Wait, it's actually *shorter*?  What about all those extra variables the `zapf`
+macro expands into?  It turns out that SBCL is really quite good at optimizing
+`let` statements and local variables.
+
+This is why I love Common Lisp.  You can be rewriting the syntax of the language
+and working on a tower of abstraction one moment, and looking at X86 assembly to
+see what the hell the computer is actually *doing* the next.  It's wonderful.
+
+[Getting Places]: http://malisper.me/2015/09/22/getting-places/
+[Zap]: http://malisper.me/2015/09/29/zap/
+
+    {% endblock article %}
--- a/content/links.html	Mon Aug 15 13:46:09 2016 +0000
+++ b/content/links.html	Mon Aug 15 20:55:05 2016 +0000
@@ -16,6 +16,7 @@
 -----
 
 * [Martin O'Leary](http://mewo2.com/): Blog with some really good articles on procedural generation.
+* [Michael Malis](http://malisper.me/): Lots of Lisp stuff (usually macro-related).
 * [Red Blob Games](http://www.redblobgames.com/): Wonderful blog with *really* good posts about lots of different video game-related topics.
 * [The Digital Antiquarian](http://www.filfre.net/): Immense amount of information on the history of "computer entertainment".
 
--- a/layout/_post.html	Mon Aug 15 13:46:09 2016 +0000
+++ b/layout/_post.html	Mon Aug 15 20:55:05 2016 +0000
@@ -2,7 +2,7 @@
 
 {% block extra_css %}
     <link rel="stylesheet"
-          href="{{ site.url }}/media/css/pygments-monokai-light.css"
+          href="{{ site.url }}/media/css/pygments-clean.css"
           type="text/css" media="screen" charset="utf-8" />
 {% endblock %}
 
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/media/css/pygments-clean.css	Mon Aug 15 20:55:05 2016 +0000
@@ -0,0 +1,60 @@
+/* @override http://localhost:8080/media/css/pygments-monokai-light.css */
+.codehilite .hll { background-color: #49483e }
+.codehilite .err { color: #fff; background-color: #f00 } /* Error */
+.codehilite .k { color: #111} /* Keyword */
+.codehilite .l { color: #111 } /* Literal */
+.codehilite .n { color: #111 } /* Name */
+.codehilite .o { color: #111 } /* Operator */
+.codehilite .p { color: #111 } /* Punctuation */
+.codehilite .c  { color: #714678; font-style: italic; font-weight: bold; } /* Comment */
+.codehilite .cm { color: #714678; font-style: italic; font-weight: bold; } /* Comment.Multiline */
+.codehilite .cp { color: #714678; font-style: italic; font-weight: bold; } /* Comment.Preproc */
+.codehilite .c1 { color: #714678; font-style: italic; font-weight: bold; } /* Comment.Single */
+.codehilite .cs { color: #714678; font-style: italic; font-weight: bold; } /* Comment.Special */
+.codehilite .ge { font-style: italic } /* Generic.Emph */
+.codehilite .gs { font-weight: bold } /* Generic.Strong */
+.codehilite .kc { color: #111 } /* Keyword.Constant */
+.codehilite .kd { color: #111 } /* Keyword.Declaration */
+.codehilite .kn { color: #111 } /* Keyword.Namespace */
+.codehilite .kp { color: #111 } /* Keyword.Pseudo */
+.codehilite .kr { color: #111 } /* Keyword.Reserved */
+.codehilite .kt { color: #111 } /* Keyword.Type */
+.codehilite .ld { color: #111 } /* Literal.Date */
+.codehilite .m { color: #111 } /* Literal.Number */
+.codehilite .s { color: #111} /* Literal.String */
+.codehilite .na { color: #111 } /* Name.Attribute */
+.codehilite .nb { color: #111 } /* Name.Builtin */
+.codehilite .nc { color: #111 } /* Name.Class */
+.codehilite .no { color: #111 } /* Name.Constant */
+.codehilite .nd { color: #111 } /* Name.Decorator */
+.codehilite .ni { color: #111 } /* Name.Entity */
+.codehilite .ne { color: #111 } /* Name.Exception */
+.codehilite .nf { color: #111} /* Name.Function */
+.codehilite .nl { color: #111 } /* Name.Label */
+.codehilite .nn { color: #111} /* Name.Namespace */
+.codehilite .nx { color: #111 } /* Name.Other */
+.codehilite .py { color: #111 } /* Name.Property */
+.codehilite .nt { color: #111 } /* Name.Tag */
+.codehilite .nv { color: #111 } /* Name.Variable */
+.codehilite .ow { color: #111 } /* Operator.Word */
+.codehilite .w { color: #111 } /* Text.Whitespace */
+.codehilite .mf { color: #111 } /* Literal.Number.Float */
+.codehilite .mh { color: #111 } /* Literal.Number.Hex */
+.codehilite .mi { color: #111 } /* Literal.Number.Integer */
+.codehilite .mo { color: #111 } /* Literal.Number.Oct */
+.codehilite .sb { color: #111 } /* Literal.String.Backtick */
+.codehilite .sc { color: #111 } /* Literal.String.Char */
+.codehilite .sd { color: #111 } /* Literal.String.Doc */
+.codehilite .s2 { color: #111 } /* Literal.String.Double */
+.codehilite .se { color: #111 } /* Literal.String.Escape */
+.codehilite .sh { color: #111 } /* Literal.String.Heredoc */
+.codehilite .si { color: #111 } /* Literal.String.Interpol */
+.codehilite .sx { color: #111 } /* Literal.String.Other */
+.codehilite .sr { color: #111 } /* Literal.String.Regex */
+.codehilite .s1 { color: #111 } /* Literal.String.Single */
+.codehilite .ss { color: #111 } /* Literal.String.Symbol */
+.codehilite .bp { color: #111 } /* Name.Builtin.Pseudo */
+.codehilite .vc { color: #111 } /* Name.Variable.Class */
+.codehilite .vg { color: #111 } /* Name.Variable.Global */
+.codehilite .vi { color: #111 } /* Name.Variable.Instance */
+.codehilite .il { color: #111 } /* Literal.Number.Integer.Long */