# HG changeset patch
# User Steve Losh <steve@stevelosh.com>
# Date 1460992582 0
# Node ID b157bf8168a279f38ece6b937773107aa7e9cabb
# Parent  902d171a1a856cab8b6352946ecdf4571cb822ca
Document the nightmare

diff -r 902d171a1a85 -r b157bf8168a2 src/wam/compiler.lisp
--- a/src/wam/compiler.lisp	Mon Apr 18 13:08:47 2016 +0000
+++ b/src/wam/compiler.lisp	Mon Apr 18 15:16:22 2016 +0000
@@ -147,19 +147,128 @@
 
 
 ;;;; Parsing
-;;; Turns p(A, q(A, B)) into something like:
+;;; You might want to grab a coffee for this one.
+;;;
+;;; Consider this simple Prolog example: `p(A, q(A, r(B)))`.  We're going to get
+;;; this as a Lisp list: `(p :a (q :a (r b)))`.
+;;;
+;;; The goal is to turn this list into a set of register assignments.  The book
+;;; handwaves around how to do this, and it turns out to be pretty complicated.
+;;; This example will (maybe, read on) be turned into:
+;;;
+;;;     A0 <- X2
+;;;     A1 <- (q X2 X3)
+;;;     X2 <- :a
+;;;     X3 <- (r X4)
+;;;     X4 <- :b
+;;;
+;;; There are a few things to note here.  First: like the book says, the
+;;; outermost predicate is stripped off and returned separately (later it'll be
+;;; used to label the code for a program, or to figure out the procedure to call
+;;; for a query).
+;;;
+;;; The first N registers are designated as argument registers.  Structure
+;;; assignments can live directly in the argument registers, but variables
+;;; cannot.  In the example above we can see that A1 contains a structure
+;;; assignment.  However, the variable `:a` doesn't live in A0 -- it lives in
+;;; X2, which A0 points at.  The books neglects to explain this little fact.
 ;;;
-;;;   X0 -> p(X1, X2)
-;;;   X1 -> A
-;;;   X2 -> q(X1, X3)
-;;;   X3 -> B
+;;; The next edge case is permanent variables, which the book does talk about.
+;;; Permanent variables are allocated to stack registers, so if `:b` was
+;;; permanent in our example we'd get:
+;;;
+;;;     A0 <- X2
+;;;     A1 <- (q X2 X3)
+;;;     X2 <- :a
+;;;     X3 <- (r Y0)
+;;;     Y0 <- :b
+;;;
+;;; Note that the mapping of permanent variables to stack register numbers has
+;;; to be consistent as we compile all the terms in a clause, so we cheat a bit
+;;; here and just always add them all, in order, to the register assignment
+;;; produced when parsing.  They'll get flattened away later anyway -- it's the
+;;; USES that we actually care about.  In our example, the `Y0 <- :b` will get
+;;; flattened away, but the USE of Y0 in X3 will remain).
+;;;
+;;; We're almost done, I promise, but there's one more edge case to deal with.
+;;;
+;;; When we've got a clause with a head and at least one body term, we need the
+;;; head term and the first body term to share argument/local registers.  For
+;;; example, if we have the clause `p(Cats) :- q(A, B, C, Cats)` then when
+;;; compiling the head `(p :cats)` we want to get:
+;;;
+;;;     A0 <- X4
+;;;     A1 <- ???
+;;;     A2 <- ???
+;;;     A3 <- ???
+;;;     X4 <- :cats
+;;;
+;;; And when compiling `(q :a :b :c :cats)` we need:
 ;;;
-;;; And then processes the argument register assignments into:
+;;;     A0 <- X5
+;;;     A1 <- X6
+;;;     A2 <- X7
+;;;     A3 <- X4
+;;;     X4 <- :cats
+;;;     X5 <- :a
+;;;     X6 <- :b
+;;;     X7 <- :c
+;;;
+;;; What the hell are those empty argument registers in p?  And why did we order
+;;; the X registers of q like that?
+;;;
+;;; The book does not bother to mention this important fact at all, so to find
+;;; out that you have to handle this you need to do the following:
+;;;
+;;; 1. Implement it without this behavior.
+;;; 2. Notice your results are wrong.
+;;; 3. Figure out the right bytecode on a whiteboard.
+;;; 4. Try to puzzle out why that bytecode isn't generated when you do exactly
+;;;    what the book says.
+;;; 5. Scour IRC and the web for scraps of information on what the hell you need
+;;;    to do here.
+;;; 6. Find the answer in a comment squirreled away in a source file somewhere
+;;;    in a language you don't know.
+;;; 7. Drink.
+;;;
+;;; Perhaps you're reading this comment as part of step 6 right now.  If so:
+;;; welcome aboard.  Email me and we can swap horror stories about this process
+;;; over drinks some time.
 ;;;
-;;;   p/2:
-;;;   A0 -> A
-;;;   A1 -> q(A1, X3)
-;;;   X2 -> B
+;;; Okay, so the clause head and first body term need to share argument/local
+;;; registers.  Why?  To understand this, we need to go back to what Prolog
+;;; clauses are supposed to do.
+;;;
+;;; Imagine we have:
+;;;
+;;;     p(f(X)) :- q(X), ...other goals.
+;;;
+;;; When we want to check if `p(SOMETHING)` is true, we need to first unify
+;;; SOMETHING with `f(X)`.  Then we search all of the goals in the body, AFTER
+;;; substituting in any X's in those goals with the X from the result of the
+;;; unification.
+;;;
+;;; This substitution is why we need the head and the first term in the body to
+;;; share the same argument/local registers.  By sharing the registers, when the
+;;; body term builds a representation of itself on the stack before calling its
+;;; predicate any references to X will be point at the (unified) results instead
+;;; of fresh ones (because they'll be compiled as `put_value` instead of
+;;; `put_variable`).
+
+;;; But wait: don't we need to substitute into ALL the body terms, not just the
+;;; first one?  Yes we do, but the trick is that any variables in the REST of
+;;; the body that would need to be substituted must, by definition, be permanent
+;;; variables!  So the substitution process for the rest of the body is handled
+;;; automatically with the stack machinery.
+;;;
+;;; In theory, you could eliminate this edge case by NOT treating the head and
+;;; first goal as a single term when searching for permanent variables.  Then
+;;; all substitution would happen elegantly through the stack.  But this
+;;; allocates more variables on the stack than you really need (especially for
+;;; rules with just a single term in the body (which is many of them)), so we
+;;; have this extra corner case to optimize it away.
+;;;
+;;; We now return you to your regularly scheduled Lisp code.
 
 (defun parse-term (term permanent-variables
                    ;; JESUS TAKE THE WHEEL
@@ -177,7 +286,8 @@
          (arguments (rest term))
          (arity (length arguments))
          ;; Preallocate enough registers for all of the arguments.  We'll fill
-         ;; them in later.
+         ;; them in later.  Note that things are more complicated in the head
+         ;; and first body term of a clause (see above).
          (local-registers (make-array 64
                             :fill-pointer (or reserved-arity arity)
                             :adjustable t
diff -r 902d171a1a85 -r b157bf8168a2 src/wam/interpreter.lisp
--- a/src/wam/interpreter.lisp	Mon Apr 18 13:08:47 2016 +0000
+++ b/src/wam/interpreter.lisp	Mon Apr 18 15:16:22 2016 +0000
@@ -143,13 +143,56 @@
 
 
 ;;;; Instruction Definition
+;;; These macros are a pair of real greasy bastards.
+;;;
+;;; Basically the issue is that there exist two separate types of registers:
+;;; local registers and stack registers.  The process of retrieving the contents
+;;; of a register is different for each type.
+;;;
+;;; Certain machine instructions take a register as an argument and do something
+;;; with it.  Because the two register types require different access methods,
+;;; the instruction needs to know what kind of register it's dealing with.
+;;;
+;;; One possible way to solve this would be to encode whether this is
+;;; a local/stack register in the register argument itself (e.g. with a tag
+;;; bit).  This would work, and a previous version of the code did that, but
+;;; it's not ideal.  It turns out we know the type of the register at compile
+;;; time, so requiring a mask/test at run time for every register access is
+;;; wasteful.
+;;;
+;;; Instead we use an ugly, but fast, solution.  For every instruction that
+;;; takes a register argument we make TWO opcodes instead of just one.  The
+;;; first is the "-local" variant of the instruction, which treats its register
+;;; argument as a local register.  The second is the "-stack" variant.  When we
+;;; compile we can just pick the appropriate opcode, and now we no longer need
+;;; a runtime test for every single register assignment.
+;;;
+;;; To make the process of defining these two "variants" we have these two
+;;; macros.  `define-instruction` (singular) is just a little sugar around
+;;; `defun*`, for those instructions that don't deal with arguments.
+;;;
+;;; `define-instructions` (plural) is the awful one.  You pass it a pair of
+;;; symbols for the two variant names.  Two functions will be defined, both with
+;;; the same body, with the symbol `%wam-register%` macroletted to the
+;;; appropriate access code.  So in the body, instead of using
+;;; `(wam-{local/argument}-register wam register)` you just use
+;;; `(%wam-register% wam register)` and it'll do the right thing.
+
 (defmacro define-instruction (name lambda-list &body body)
+  "Define an instruction function.
+
+  This is just syntactic sugar over `defun*` that will add the `(returns :void)`
+  declaration for you, and also append a `(values)` to the end of the body to
+  make sure it actually does return void.
+
+  "
   `(defun* ,name ,lambda-list
      (:returns :void)
      ,@body
      (values)))
 
 (defmacro define-instructions ((local-name stack-name) lambda-list &body body)
+  "Define a local/stack pair of instructions."
   `(progn
     (macrolet ((%wam-register% (wam register)
                  `(wam-local-register ,wam ,register)))