# HG changeset patch # User Steve Losh # Date 1526944035 14400 # Node ID 21e596ce4966f8c388b150995d111e7e860b4a88 # Parent e32eb216565ec00284947ac3a21201ff6578a2a8 Commit updates from proofreading diff -r e32eb216565e -r 21e596ce4966 content/blog/2018/05/fun-with-macros-gathering.markdown --- a/content/blog/2018/05/fun-with-macros-gathering.markdown Sun May 20 18:52:32 2018 -0400 +++ b/content/blog/2018/05/fun-with-macros-gathering.markdown Mon May 21 19:07:15 2018 -0400 @@ -1,8 +1,8 @@ +++ title = "Fun with Macros: Gathering" snip = "Part 1 in a series of short posts about fun Common Lisp Macros." -date = 2018-05-21T16:40:00Z -draft = true +date = 2018-05-21T16:05:00Z +draft = false +++ @@ -52,8 +52,8 @@ (alexandria:with-gensyms (result) `(let ((,result nil)) (flet ((gather (item) - (push item ,result))) - (declare (dynamic-extent #'gather)) + (push item ,result) + item)) ,@body) (nreverse ,result)))) ``` @@ -61,11 +61,9 @@ ## Notes As the docstring mentions, sometimes you'll encounter procedural code that -iterates over things, but doesn't gather up and return any results (CL's -`maphash` and Alexandria's `map-permutations` are two examples). The -`gathering` macro provides an easy way to plug into the guts of the iteration -and get a list back. - +iterates over things but doesn't return any results (CL's `maphash` and +Alexandria's `map-permutations` are two examples). The `gathering` macro +provides an easy way to plug into the guts of the iteration and get a list back. The docstring describes how to use the macro, but there's a couple of extra things to note before we move on. @@ -80,18 +78,17 @@ The `flet`ed function that actually does the gathering is named as the symbol `gather`, not a gensym. Some folks will dislike that because it captures the -function name. - -I don't mind it. I consider that behavior part of the intended/documented API -of the macro, and because the symbol `gather` is coming from the macro's package -you'll need to import it (or package qualify it) to access it. +function name. I don't mind it. I consider that behavior part of the +intended/documented API of the macro, and because the symbol `gather` is coming +from the macro's package you'll need to import it (or package qualify it) to +access it. If you want to provide a bit more safety here you could potentially define a vanilla function for `gather` like this: ```lisp (defun gather (value) - "Gather `value` into the resulting list. Must be called from inside `gathering`." + "Gather `value` into the result. Must be called from inside `gathering`." (error "GATHER be called from within a GATHERING macro.")) ``` @@ -101,8 +98,9 @@ redefining the function, which would hopefully make them realize the potential conflict earlier. -Of course you could write your own version of `gathering` that takes in a symbol -to use for the function, if you prefer. +Of course you could also write your own version of `gathering` that takes in +a symbol to use for the function, if you prefer. That would avoid all of these +issues, at the cost of making every `(gathering ...)` slightly more verbose. ### Data Structures @@ -114,9 +112,21 @@ Feel free to skip this section. It gets a little bit into the weeds. -You might have noticed that the `gather` function is declared to have [dynamic -extent][]. This means that some implementations (e.g. SBCL) [can stack allocate -the closure][sbcl dynamic extent] and save a heap allocation. +One potential optimization we could make is to declare the `gather` function to +have [dynamic extent][]. This means that some implementations (e.g. SBCL) [can +stack allocate the closure][sbcl dynamic extent] and save a heap allocation. + +```lisp +(defmacro gathering-dynamic-extent (&body body) + (alexandria:with-gensyms (result) + `(let ((,result nil)) + (flet ((gather (item) + (push item ,result) + item)) + (declare (dynamic-extent #'gather)) ; NEW + ,@body) + (nreverse ,result)))) +``` Whether this matters much depends on the usage patterns. If you use `gathering` with some code that typically gathers no elements (and only occasionally gathers @@ -125,9 +135,6 @@ things then you're heap allocating all the cons cells anyway, so allocating the closure wouldn't be a big deal. -I decided to go ahead and enable the efficiency optimization. It doesn't make -any sense to be calling the closure once `gathering` has returned, so why not? - If you want to actually see the stack allocation in action, it might be a little trickier than you think. SBCL is pretty smart about inlining things, so it might not allocate the closure at runtime *at all*, even *without* the @@ -268,7 +275,7 @@ ```lisp (disassemble (lambda () - (gathering + (gathering-dynamic-extent (my-mapc #'gather '(1 2 3))))) ; disassembly for (LAMBDA ()) @@ -308,7 +315,78 @@ NIL ``` -Much nicer. +Much nicer. However, this optimization comes with a price: safety. + +The `gather` closure should never be called outside of the `gathering` block +that defines it. As the docstring says: it would be useless to do so anyway, +because the result has already been returned. But what would happen if the user +accidentally calls the closure? Let's try it out, first on the original +version without the `dynamic-extent` declaration: + +```lisp +(defparameter *f* nil) + +(defun leak (function) + (setf *f* function)) + +(gathering (leak #'gather)) + +(funcall *f* 1) + +; => 1 +``` + +Here the closure is heap-allocated, so calling it later is fine (if useless). +But what happens if we try our optimized version? + +```lisp +(gathering-dynamic-extent (leak #'gather)) + +(funcall *f*) + +; CORRUPTION WARNING in SBCL pid 19575(tid 0x7fffb6814380): +; Memory fault at 0xffffffff849ff8e7 (pc=0x19ff8df, sp=0x19ff890) +; The integrity of this image is possibly compromised. +; Continuing with fingers crossed. +; +; debugger invoked on a SB-SYS:MEMORY-FAULT-ERROR in thread +; #: +; Unhandled memory fault at #xFFFFFFFF849FF8E7. +; +; restarts (invokable by number or by possibly-abbreviated name): +; 0: [ABORT] Exit debugger, returning to top level. +; +; (SB-SYS:MEMORY-FAULT-ERROR) +; 0] +``` + +Things get even worse if you're brave (foolish) enough to be running with +`(declaim (safety 0))`: + +```lisp +(gathering-dynamic-extent (leak #'gather)) + +(funcall *f*) + +; debugger invoked on a SB-KERNEL:FLOATING-POINT-EXCEPTION in thread +; #: +; An arithmetic error SB-KERNEL:FLOATING-POINT-EXCEPTION was signalled. +; No traps are enabled? How can this be? +; +; +; restarts (invokable by number or by possibly-abbreviated name): +; 0: [ABORT] Exit debugger, returning to top level. +; +; ("bogus stack frame") +; 0] +``` + +The moral of this story is that although we can optimize for a little bit of +speed, it comes at a price that might not be worth paying. + +Here's an exercise for you: make the original heap-allocated version signal an +error (with a nice error message) when called outside of its `gathering` block, +instead of silently doing something useless. [dynamic extent]: http://clhs.lisp.se/Body/d_dynami.htm [sbcl dynamic extent]: http://www.sbcl.org/manual/#Dynamic_002dextent-allocation