--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/blog/2016/08/playing-with-syntax.html Mon Aug 15 20:55:05 2016 +0000
@@ -0,0 +1,493 @@
+ {% extends "_post.html" %}
+
+ {% hyde
+ title: "Playing With Syntax"
+ snip: "Lisp lets you evolve your language."
+ created: 2016-08-16 13:45:00
+ %}
+
+ {% block article %}
+
+One of the things I love about Lisp is that it gives you the ability to change
+and mold the syntax of the language to what you need. In this post I want to
+look at the evolution of a little macro I've been playing around with for
+a while now.
+
+Mutation is a fundamental concept in most programming languages. Functional
+programming may be beautiful, but mutation is still useful (and fast). In most
+languages assignment is done with an `=` or `:=` operator:
+
+ :::lisp
+ x = 10
+
+In Common Lisp the operator is named `setf` instead of `=` for historical
+reasons, and of course it's used in prefix notation:
+
+ :::lisp
+ (setf x 10)
+
+Aside from the prefix ordering, Common Lisp's syntax is already a bit more
+elegant because you can set arbitrarily many variables without repeating the
+assignment operator over and over again:
+
+ :::lisp
+ ; Have to type out `=` three times
+ x = 10
+ y = 20
+ z = 30
+
+ ; But only one `setf` required
+ (setf x 10
+ y 20
+ z 30)
+
+`setf` assigns values one-by-one, in-order. Common Lisp also has `psetf` which
+is "parallel `setf`". This evaluates all the values first, *then* sets all the
+variables at once:
+
+ :::lisp
+ (let ((a 10) (b 20))
+ (setf a 50
+ b (+ 1 a)) ; => (50 51)
+
+ (psetf a 90
+ b (+ 1 a))) ; => (90 52)
+
+Assignment is nice, but often you want to change the value of a variable by
+taking its *current* value and computing something with it. If you're making
+a ball that travels across the screen you might have something like this:
+
+ :::text
+ def move_ball(ball, distance):
+ ball.x = ball.x + distance
+
+ (defun move-ball (ball distance)
+ (setf (slot-value ball 'x)
+ (+ (slot-value ball 'x)
+ distance)))
+
+If you find `slot-value` too wordy to type, you could make a macro to get rid of
+it (and save yourself a quote too). `.` is already used for something else so
+you'd need to pick a different character:
+
+ :::lisp
+ (defmacro $ (instance slot-symbol)
+ `(slot-value ,instance ',slot-symbol))
+
+ (defun move-ball (ball distance)
+ (setf ($ ball x)
+ (+ ($ ball x)
+ distance)))
+
+In practice, though, you rarely use `slot-value`. Instead you usually use
+accessor functions:
+
+ :::lisp
+ (defun move-ball (ball distance)
+ (setf (ball-x ball)
+ (+ (ball-x ball)
+ distance)))
+
+Anyway, back to assignment. What we wrote above *works*, but it's a bit
+annoying to have to type the "place" that we're working with twice. To cut down
+on this repetition many languages offer operators like `+=` and `++`:
+
+ :::text
+ x = x + 10
+ ⬇
+ x += 10
+
+ y = y + 1
+ ⬇
+ y++
+
+Common Lisp's version of these is called `incf`.
+
+ :::lisp
+ (incf x 10)
+ (incf y)
+
+Notice how the s-expression syntax means we can use the same operator for both
+of the forms, instead of needing separate `+=` and `++` operators.
+
+Other languages often have similar "in-place" operators for other numeric
+operations like `-=`, `--`, `*=`, and so on. Common Lisp has `decf` for
+subtraction, but doesn't have versions for multiplication or division built in.
+But we can add them with a few macros:
+
+ :::lisp
+ (defmacro mulf (place factor)
+ `(setf ,place (* ,place ,factor)))
+
+ (defmacro divf (place divisor)
+ `(setf ,place (/ ,place ,divisor)))
+
+Unfortunately these are not quite right -- we've committed the cardinal
+macro-writing sin of splicing in the `place` expression multiple times. If
+`place` has side effects they'll happen more times than expected.
+
+Luckily, Common Lisp has a macro that wraps up the boring process of handling
+this for us. `define-modify-macro` lets us fix the problem:
+
+ :::lisp
+ (define-modify-macro mulf (factor) *)
+ (define-modify-macro divf (divisor) /)
+
+Now we've got versions of `*=` and `/=` in Common Lisp. While we're here we
+should do things *right* and allow `divf` to be used with just a place, which
+will mean "set `place` to `1/place`":
+
+ :::lisp
+ (define-modify-macro divf (&optional divisor)
+ (lambda (value divisor)
+ (if (null divisor)
+ (/ 1 value)
+ (/ value divisor))))
+
+ (let ((x 10))
+ (divf x 2) ; => 5
+ (divf x) ; => 1/5
+ (divf x)) ; => 5
+
+Note that we don't really need the `1` in `(/ 1 value)`, because Common Lisp's
+`/` works the same way when you call it with a single argument. I'm not sure
+what behavior would make sense for a unary `(mulf place)`, so let's ignore that
+one and move on.
+
+So far we've been talking about the idea of setting a variable to the result of
+computing some function on its current value. We used `define-modify-macro` to
+define some support for extra functions, but defining a mutator macro for every
+function we might possibly want to use could get tedious. What if we generalize
+this a bit more?
+
+ :::lisp
+ (define-modify-macro callf (function)
+ (lambda (value function)
+ (funcall function value)))
+
+ (let ((x -1))
+ (callf x #'abs) ; => 1
+ (callf x #'1+) ; => 2
+ (callf x #'sin) ; => 0.9092974
+ (callf x #'round) ; => 1
+ (callf x #'-)) ; => -1
+
+Now we've got something a bit higher-level. With `callf` we can mutate
+a variable with a unary function. But what about functions that need more
+arguments? Again, Common Lisp does the right thing and handles `&rest`
+parameters correctly in `define-modify-macro`:
+
+ :::lisp
+ (define-modify-macro callf (function &rest args)
+ (lambda (value function &rest args)
+ (apply function value args)))
+
+ (let ((x -1))
+ (callf x #'abs) ; => 1
+ (callf x #'+ 10) ; => 11
+ (callf x #'* 2 4)) ; => 88
+
+ (let ((foo (list 0 1 2 3)))
+ (callf foo #'reverse)
+ ; => (3 2 1 0)
+
+ (callf foo #'append (list -1 -2 -3)))
+ ; => (3 2 1 0 -1 -2 -3)
+
+That's better. Now we can mutate a variable by applying a function to its
+current value and some other arguments. This might come in handy when we don't
+want to bother defining a full-blown modification macro just to use it once or
+twice.
+
+At this point we've got something similar to Michael Malis' [zap][] macro.
+A minor difference is for `zap` the order of the function and the place are
+reversed. I believe his reason was that it makes the resulting form look more
+like the function call that is conceptually being made:
+
+ :::lisp
+ (zap #'+ x 5)
+ ; ↓ ↓ ↓
+ (setf x ( + x 5 ))
+
+Unfortunately, because the place is no longer the first argument we can't use
+`define-modify-macro` to handle the single-evaluation boilerplate for us any
+more. Instead we need to use `get-setf-expander` and work with its results.
+Malis' excellent [Getting Places][] article explains how that works, and if you
+want to fully understand the rest of the code in this post you should read that
+one first.
+
+The difference between `callf` and Malis' `zap` are just cosmetic. But they're
+still not completely flexible. What if we want to call a function, but we need
+the value to be an argument other than the first? Of course we could use
+a `lambda`:
+
+ :::lisp
+ (let ((x (list :coffee :soda :tea)))
+ (callf x (lambda (l) (remove :soda l))))
+
+
+But that is *really* ugly. It would be nice if we had a way to specify which
+argument to the function should be the current value. This is Lisp, so we can
+pick whatever syntax we want. What should we choose? Often a good first step
+is to look at how other languages do things. Clojure's anonymous function
+syntax is one possibility:
+
+ :::lisp
+ (map #(+ 10 %) [1 2 3])
+ ; => (11 12 13)
+
+Clojure uses the magic symbol `%` as a placeholder for the argument. We can
+modify Malis' `zap` macro to do the same. We'll call it `zapf`, and we'll swap
+the place back into the first argument like all the other modify macros, since
+the aesthetic reason no longer holds. The changes are in bold:
+
+<div class="codehilite"><pre>
+; Original version
+(defmacro zap (fn place &rest args)
+ (multiple-value-bind
+ (temps exprs stores store-expr access-expr)
+ (get-setf-expansion place)
+ `(let* (,@(mapcar #'list temps exprs)
+ (,(car stores)
+ (funcall ,fn ,access-expr ,@args)))
+ ,store-expr)))
+
+; New version
+(defmacro <b>zapf</b> (<b>place fn</b> &rest args)
+ (multiple-value-bind
+ (temps exprs stores store-expr access-expr)
+ (get-setf-expansion place)
+ `(let* (,@(mapcar #'list temps exprs)
+ (,(car stores)
+ (funcall ,fn <b>,@(substitute access-expr '% args)</b>)))
+ ,store-expr)))
+</pre></div>
+
+And now we can use `%` to say where the place's value should go:
+
+ :::lisp
+ (let ((x (list :coffee :soda :tea)))
+ (zapf x #'remove :soda %)
+ ; => (:coffee :tea)
+
+ (zapf x #'cons :water %)
+ ; => (:water :coffee :tea)
+
+ (zapf x #'append % (list :cocoa :chai)))
+ ; => (:water :coffee :tea :cocoa :chai)
+
+Note that `substitute` will replace *all* occurrences of `%` with the access
+expression, so we can do something like `(zapf x #'append % %)` if we want.
+
+Some people will find this new version unpleasant, because it effectively
+captures the `%` variable. It's not hygienic, by design. I personally don't
+mind it, and I like the brevity it allows.
+
+We're almost done, but I want to push things just a *bit* further. Let's
+revisit the original example of the ball moving across the screen:
+
+ :::text
+ def move_ball(ball, distance):
+ ball.x = ball.x + distance
+
+ (defun move-ball (ball distance)
+ (setf (ball-x ball)
+ (+ (ball-x ball) distance)))
+
+Of course we can simply use `+=` or `incf` here:
+
+ :::text
+ def move_ball(ball, distance):
+ ball.x += distance
+
+ (defun move-ball (ball distance)
+ (incf (ball-x ball) distance))
+
+Suppose we also wanted to make the ball wrap around the screen when it flies off
+one side. We can do this using modulus:
+
+ :::text
+ def move_ball(ball, distance, screen_width):
+ ball.x += distance
+ ball.x %= screen_width
+
+ (defun move-ball (ball distance screen-width)
+ (incf (ball-x ball) distance)
+ (callf (ball-x ball) #'mod % screen-width))
+
+We could define `modf` if we wanted to make that second call a bit nicer. But
+let's take a moment to reflect. Notice how we're back to having to type out
+`(ball-x ball)` twice again. It's better than typing it four times, but can we
+do better?
+
+`zapf` currently takes a place, a function, and a list of arguments. What if we
+made it more like `setf`, and just have it take a place and an expression? We
+could still use `%` as a placeholder, but could let it appear anywhere in the
+(possibly nested) expression form.
+
+One way to do this would be to simply replace `substitute` with `subst` in the
+`zapf` code. (The difference between these extremely-confusingly-named
+functions is that `substitute` replaces element of a sequence, and `subst`
+replaces elements of a tree.)
+
+This, however, is the wrong strategy. Blindly replacing `%` everywhere in the
+expression will work for many cases, but will break in edge cases like (god help
+you) nested `zapf` forms. We could try to walk the code of the expression we
+get, but down this path lies madness and implementation-specificity.
+
+The solution is much, much simpler. We can just use a normal `let` to capture
+`%` in the expression:
+
+<div class="codehilite"><pre>
+; Final version
+(defmacro <b>zapf</b> (<b>place expr</b>)
+ (multiple-value-bind
+ (temps exprs stores store-expr access-expr)
+ (get-setf-expansion place)
+ `(let* (,@(mapcar #'list temps exprs)
+ (,(car stores)
+ <b>(let ((% ,access-expr))
+ ,expr)</b>))
+ ,store-expr)))
+</pre></div>
+
+And now `move-ball` only requires the bare minimum of typing:
+
+ :::lisp
+ (defun move-ball (ball distance screen-width)
+ (zapf (ball-x ball)
+ (mod (+ distance %) screen-width)))
+
+For completeness it would be nice to allow `zapf` to take any number of
+place/value pairs, just like `setf`. I'll leave this as an exercise for you to
+try if you're interested.
+
+It took me a while to arrive at this form of the macro. I personally find it
+lovely and readable. If you disagree, that's fine too! Lisp is a wonderful
+substrate for building a language that fits how you think and talk about
+problems. Play around with your own syntax and find out what feels most natural
+for you.
+
+Before we leave, let's just ponder one more thing. Performance is often ignored
+these days, but what if we care about making the computer go fast? The final
+`zapf` macro is handy and expressive, but what cost have we paid for this
+abstraction?
+
+First let's tell SBCL that we want to go fast and return to our original `setf`
+strategy:
+
+ :::lisp
+ (deftype nonnegative-fixnum ()
+ `(integer 0 ,most-positive-fixnum))
+
+ (deftype positive-fixnum ()
+ `(integer 1 ,most-positive-fixnum))
+
+ (defstruct ball
+ (x 0 :type nonnegative-fixnum)
+ (y 0 :type nonnegative-fixnum))
+
+ (declaim (ftype (function (ball fixnum positive-fixnum))
+ move-ball))
+
+ (defun move-ball (ball distance screen-width)
+ (declare (optimize (speed 3) (safety 0) (debug 0)))
+ (setf (ball-x ball)
+ (mod (+ distance (ball-x ball))
+ screen-width)))
+
+How does that turn out?
+
+ :::text
+ ; disassembly for MOVE-BALL
+ ; Size: 82 bytes. Origin: #x1005E5E12E
+ ; 2E: 488B410D MOV RAX, [RCX+13] ; no-arg-parsing entry point
+ ; 32: 48D1F8 SAR RAX, 1
+ ; 35: 498D3C00 LEA RDI, [R8+RAX]
+ ; 39: 488BDE MOV RBX, RSI
+ ; 3C: 48D1FB SAR RBX, 1
+ ; 3F: 4885DB TEST RBX, RBX
+ ; 42: 7431 JEQ L3
+ ; 44: 488BC7 MOV RAX, RDI
+ ; 47: 4899 CQO
+ ; 49: 48F7FB IDIV RAX, RBX
+ ; 4C: 4C8BC0 MOV R8, RAX
+ ; 4F: 488BC2 MOV RAX, RDX
+ ; 52: 48D1E0 SHL RAX, 1
+ ; 55: 4885C0 TEST RAX, RAX
+ ; 58: 7510 JNE L2
+ ; 5A: L0: 488BD8 MOV RBX, RAX
+ ; 5D: L1: 4889590D MOV [RCX+13], RBX
+ ; 61: 488BD3 MOV RDX, RBX
+ ; 64: 488BE5 MOV RSP, RBP
+ ; 67: F8 CLC
+ ; 68: 5D POP RBP
+ ; 69: C3 RET
+ ; 6A: L2: 4885FF TEST RDI, RDI
+ ; 6D: 7DEB JNL L0
+ ; 6F: 488D1C30 LEA RBX, [RAX+RSI]
+ ; 73: EBE8 JMP L1
+ ; 75: L3: 0F0B0A BREAK 10 ; error trap
+ ; 78: 07 BYTE #X07
+ ; 79: 08 BYTE #X08 ; DIVISION-BY-ZERO-ERROR
+ ; 7A: FE9E03 BYTE #XFE, #X9E, #X03 ; RDI
+ ; 7D: FE9E01 BYTE #XFE, #X9E, #X01 ; RBX
+
+Well that's not *too* bad. I'm not sure why SBCL still performs a check for
+division by zero for the `mod` even though we said that `screen-width` can't
+possibly be zero. But hey, we're at a manageable amount of assembly, which is
+pretty nice for a high-level language!
+
+Now for the `zapf` version:
+
+ :::lisp
+ (defun move-ball (ball distance screen-width)
+ (declare (optimize (speed 3) (safety 0) (debug 0)))
+ (zapf (ball-x ball)
+ (mod (+ distance %) screen-width)))
+
+Okay, let's bite the bullet. How bad are we talking?
+
+ :::text
+ ; disassembly for MOVE-BALL
+ ; Size: 70 bytes. Origin: #x1005F47C7B
+ ; 7B: 488B410D MOV RAX, [RCX+13] ; no-arg-parsing entry point
+ ; 7F: 48D1F8 SAR RAX, 1
+ ; 82: 488D3C03 LEA RDI, [RBX+RAX]
+ ; 86: 4885F6 TEST RSI, RSI
+ ; 89: 742B JEQ L2
+ ; 8B: 488BC7 MOV RAX, RDI
+ ; 8E: 4899 CQO
+ ; 90: 48F7FE IDIV RAX, RSI
+ ; 93: 488BD8 MOV RBX, RAX
+ ; 96: 488D0412 LEA RAX, [RDX+RDX]
+ ; 9A: 4885C0 TEST RAX, RAX
+ ; 9D: 750D JNE L1
+ ; 9F: L0: 48D1E2 SHL RDX, 1
+ ; A2: 4889510D MOV [RCX+13], RDX
+ ; A6: 488BE5 MOV RSP, RBP
+ ; A9: F8 CLC
+ ; AA: 5D POP RBP
+ ; AB: C3 RET
+ ; AC: L1: 4885FF TEST RDI, RDI
+ ; AF: 7DEE JNL L0
+ ; B1: 4801F2 ADD RDX, RSI
+ ; B4: EBE9 JMP L0
+ ; B6: L2: 0F0B0A BREAK 10 ; error trap
+ ; B9: 07 BYTE #X07
+ ; BA: 08 BYTE #X08 ; DIVISION-BY-ZERO-ERROR
+ ; BB: FE9E03 BYTE #XFE, #X9E, #X03 ; RDI
+ ; BE: FE1E03 BYTE #XFE, #X1E, #X03 ; RSI
+
+Wait, it's actually *shorter*? What about all those extra variables the `zapf`
+macro expands into? It turns out that SBCL is really quite good at optimizing
+`let` statements and local variables.
+
+This is why I love Common Lisp. You can be rewriting the syntax of the language
+and working on a tower of abstraction one moment, and looking at X86 assembly to
+see what the hell the computer is actually *doing* the next. It's wonderful.
+
+[Getting Places]: http://malisper.me/2015/09/22/getting-places/
+[Zap]: http://malisper.me/2015/09/29/zap/
+
+ {% endblock article %}