Replacing conditional Lisp primitives with macros

Index | Up


Table of Contents

1. Introduction

Some months ago I started writing my own Lisp. I decided to implement the or and and procedures as special form C primitives, but I wanted to explain in the documentation that they don’t need to be C primitives, since we can define them as macros that use if. When writing the example for the documentation, I wrote a few different alternatives that I consider worth sharing here.

Before getting into the macros, I will briefly explain some of these concepts. Keep in mind that I will be referring to these concepts in the context of a Lisp interpreter written in C, more specifically on the Lisp syntax used in my interpreter.

1.1. Primitives

Even though the user can define his own functions, at some point the interpreter will have to translate them to machine instructions that run on the actual hardware. What this means is that, even thought the user can write:

(defun square (x)
  (* x x))

(square 5)

At some point the Lisp has to multiply the numbers, either with another Lisp procedure that uses, for example, addition; or with actual CPU instructions.

These basic building blocks, like arithmetic functions, are called primitives. In my case they are built into the interpreter as C functions.

1.2. Special Forms

Lisp follows a general evaluation process when calling a procedure. Usually, when the user calls a function func with the following arguments:

(defun func (arg0 arg1 arg2)
  (format "%d %f %s" arg0 arg1 arg2)) ; Sample body

(func 1 (+ 3 4.0) "foo")

The following process is performed:

  1. Each argument is evaluated. The integer 1 evaluates to itself; the list (+ 3 4) is treated as a procedure call, and returns 7.0; and the string "foo" evaluates to itself.
  2. The evaluated arguments are bound to the formal arguments. In this case arg0, arg1 and arg2, respectively.
  3. Each expression in the body of the procedure is evaluated, returning the last result. In this case, the string returned by format is also returned by func.

However, some exceptions have to be made. For example, we can’t define if as a normal procedure because its 3 arguments would be evaluated before the call, when in reality only the second or the third argument should be evaluated depending on the first one:

(defun my-if (predicate consequent alternative)
  ...)

(my-if tru
       (define var 1)
       (define var 2))

In the previous example var would be defined to 2 because all three arguments were evaluated before the procedure call was made. In this case, it’s assumed that the arguments are evaluated in order, which is true in my Lisp, but isn’t guaranteed in most interpreters.

These special procedures that don’t follow normal evaluation rules are called special forms.

1.3. Macros

Macros in my Lisp are inspired by Emacs Lisp macros. Macros are similar to procedures, but have a few key differences:

  1. When a macro is called, the arguments are not evaluated before applying it, so the macro can operate on the un-evaluated expressions directly, instead of on the values they compute. The first step of a macro call is binding the un-evaluated arguments to the formals.
  2. Macros don’t directly compute values, they instead build Lisp expressions that will be used to compute the actual values. The second step of a macro call is the macro expansion. In this step, the macro is called just like a lambda, returning a Lisp expression.
  3. The last step of a macro call is evaluating the expanded expression, which will be used to compute the actual value returned by the macro.

In other words the general process when calling a procedure is:

Evaluate arguments -> Bind arguments -> Evaluate body
                      `-----------------------------´
                                 (Apply)

While the process of calling a macro is:

Bind arguments -> Evaluate body -> Evaluate expansion
`-----------------------------´
            (Expand)

For example:

> (defmacro my-macro (name)
    (list 'define name 123))
<macro>

> (my-macro some-name)
123

> (macroexpand '(my-macro some-name))
(define some-name 123)

> some-name
123

2. The or macro

First, the expected behavior of or:

  1. When called with no arguments, nil is returned. This is the expected behavior in Scheme and in my Lisp.
  2. Each argument is evaluated in order. If one of them is non-nil, stop evaluating and return it.
  3. If all arguments are nil, then nil is returned.

2.1. Version 1

(defmacro my-or (&rest exprs)
  (defun or-lst (expr-list)
    (if (null? expr-list)
        nil
        ;; TODO: Don't overwrite "result", generate unique symbol.
        (list (list 'lambda (list 'result)
                    (list 'if 'result 'result
                          (or-lst (cdr expr-list))))
              (car expr-list))))
  (or-lst exprs))

The first version uses an inner or-lst procedure to allow easier recursion. Note that this procedure is not defined globally, the scope is restricted to the body of the macro. Since my-or uses &rest, when the macro is called with (my-or 'A 'B 'C), the list (A B C) is bound to the symbol exprs. This makes recursion trickier, because if we call ourselves with (cdr exprs), we are not doing (my-or B C), but (my-or '(B C)), which gets put into another list because of &rest. An easier solution for this “problem” is shown on the next section.

First, it checks the base case, we didn’t get any arguments. In that case, nil is returned.

If we got an argument, the macro will expand to a call to a lambda that receives the evaluated argument. We need to do this to evaluate the expression only once. An incorrect example:

(or A B C)

;; Expanded (incorrectly) into:
(if A A
    (if B B
        (if C C
            nil)))

In that example, A is evaluated once to get the condition and, if the result is non-nil, A is evaluated a second time as the consequent. Instead, the correct approach is something like:

(let ((result A))
  (if result
      result
      ...))

In that second example, A is evaluated only once. However, since my Lisp didn’t have a let macro at this point, I used an uglier (but equivalent) version which calls an anonymous lambda:

((lambda (result)
   (if result
       result
       ...))
 A)

As you can probably tell, there is a Big Bug™ in the macro, and it will remain throughout all versions. The name “result”, used as the lambda argument, is far from unique, so we might overwrite some user value during this call. In a real implementation, we should use some function like Emacs Lisp’s gensym.

Here are some examples of the macro:

> (my-or nil 'A 'B)
A

> (macroexpand '(my-or 'A 'B 'C))  ; God almighty...
((lambda (result)
   (if result result
     ((lambda (result)
        (if result result
          ((lambda (result)
             (if result result
               nil))
           'C)))
      'B)))
 'A)

> (my-or)
nil

> (macroexpand '(my-or))
nil

Throughout this article I will make some minor changes to the interpreter output for readability, like formatting the indentation or replacing (quote expr) with 'expr.

2.2. Version 2

(defmacro my-or (&rest exprs)
  (if (null? exprs)
      nil
      (list (list 'lambda (list 'result)
                  (list 'if 'result 'result
                        ;; The expansion will call `my-or'.
                        (cons 'my-or (cdr exprs))))
            (car exprs))))

This version still uses the lambda call method, but it doesn’t use an inner or-lst procedure. Instead, it introduces a call to the macro itself in the expansion.

We use cons to append my-or to the cdr of the argument list, constructing a function call. We could also use apply, but we would have to quote the argument list to avoid evaluating the cdr as another function call:

;; Incorrect version, the `cadr' will be interpreted as a function.
(list 'apply 'my-or
      (cdr exprs))

;; Correct version.
(list 'apply 'my-or
      (list 'quote (cdr exprs)))

;; Which, assumming `exprs' is (A B C), expands to:
(apply my-or (quote (B C)))

;; Or alternatively:
(apply my-or '(B C))

Some examples of the second version:

> (my-or nil 'A 'B)
A

> (macroexpand '(my-or 'A 'B 'C))  ; Much more readable
((lambda (result)
   (if result result
     (my-or ('B 'C))))
 'A)

2.3. Version 3

(defmacro my-or (&rest exprs)
  (if (null? exprs)
      nil
      (list 'let (list (list 'result (car exprs)))
            (list 'if 'result 'result
                  (cons 'my-or (cdr exprs))))))

Finally, to make the expansion a bit more readable, we can remove that lambda call by using the let macro. This version is probably a bit less efficient since let also expands to a lambda call, but it’s more readable.

Some examples of the third version:

> (my-or nil 'A 'B)
A

> (macroexpand '(my-or 'A 'B 'C))  ; Much more readable
(let ((result 'A))
  (if result result
    (my-or 'B 'C)))

3. The and macro

This version of my-and is based on the third version of my-or, so it also overwrites the result symbol.

(defmacro my-and (&rest exprs)
  (if (null? exprs)
      tru
      (list 'let (list (list 'result (car exprs)))
            (list 'if
                  'result
                  (if (null? (cdr exprs))
                      'result
                      (cons 'my-and (cdr exprs)))
                  nil))))

The first difference is that, when expr is empty, tru is returned instead of nil. Again, this is the expected behavior in Scheme and in my Lisp.

Then, a conditional is introduced when expanding the macro. Just to be clear, this conditional, the one that checks if (cdr exprs) is empty, will be performed when the macro is expanded, not when the expansion is evaluated. This conditional is needed because, if we reached the last argument, we want to return it if it’s non-nil. If we are not on the last argument, we keep checking by calling ourselves, just like we did in my-or.

(let ((result 'A))
  (if result
    (my-and 'B 'C)
    nil))

4. Final note

I will end up adding a 4th version once I add backquote support to my Lisp.

Feel free to contribute if you have any suggestions or improvements.