Replacing conditional Lisp primitives with macros
Table of Contents
1. Introduction
Some months ago I started writing my own Lisp interpreter. I decided to
implement the or
and and
procedures as special form C primitives, but I wanted
to explain in the documentation that they don’t need to be C primitives, since
we can define them as macros that use if
. When writing the example for the
documentation, I wrote a few different alternatives that I consider worth
sharing here.
Before getting into the macros, I will briefly explain some of these concepts. Keep in mind that I will be referring to these concepts in the context of a Lisp interpreter written in C, more specifically on the Lisp syntax used in my interpreter.
1.1. Primitives
Even though the user can define his own functions, at some point the interpreter will have to translate them to machine instructions that run on the actual hardware. What this means is that, even thought the user can write:
(defun square (x) (* x x)) (square 5)
At some point the Lisp has to multiply the numbers, either with another Lisp procedure that uses, for example, addition; or with actual CPU instructions.
These basic building blocks, like arithmetic functions, are called primitives. In my case they are built into the interpreter as C functions.
1.2. Special Forms
Lisp follows a general evaluation process when calling a procedure. Usually,
when the user calls a function func
with the following arguments:
(defun func (arg0 arg1 arg2) (format "%d %f %s" arg0 arg1 arg2)) ; Sample body (func 1 (+ 3 4.0) "foo")
The following process is performed:
- Each argument is evaluated. The integer
1
evaluates to itself; the list(+ 3 4)
is treated as a procedure call, and returns 7.0; and the string"foo"
evaluates to itself. - The evaluated arguments are bound to the formal arguments. In this case
arg0
,arg1
andarg2
, respectively. - Each expression in the body of the procedure is evaluated, returning the last
result. In this case, the string returned by
format
is also returned byfunc
.
However, some exceptions have to be made. For example, we can’t define if
as a
normal procedure because its 3 arguments would be evaluated before the call,
when in reality only the second or the third argument should be evaluated
depending on the first one:
(defun my-if (predicate consequent alternative) ...) (my-if tru (define var 1) (define var 2))
In the previous example var
would be defined to 2 because all three arguments
were evaluated before the procedure call was made. In this case, it’s assumed
that the arguments are evaluated in order, which is true in my Lisp, but isn’t
guaranteed in most interpreters.
These special procedures that don’t follow normal evaluation rules are called special forms.
1.3. Macros
Macros in my Lisp are inspired by Emacs Lisp macros. Macros are similar to procedures, but have a few key differences:
- When a macro is called, the arguments are not evaluated before applying it, so the macro can operate on the un-evaluated expressions directly, instead of on the values they compute. The first step of a macro call is binding the un-evaluated arguments to the formals.
- Macros don’t directly compute values, they instead build Lisp expressions that will be used to compute the actual values. The second step of a macro call is the macro expansion. In this step, the macro is called just like a lambda, returning a Lisp expression.
- The last step of a macro call is evaluating the expanded expression, which will be used to compute the actual value returned by the macro.
In other words the general process when calling a procedure is:
Evaluate arguments -> Bind arguments -> Evaluate body `-----------------------------´ (Apply)
While the process of calling a macro is:
Bind arguments -> Evaluate body -> Evaluate expansion `-----------------------------´ (Expand)
For example:
(defmacro my-macro (name) (list 'define name 123)) ⇒ <macro> (my-macro some-name) ⇒ 123 (macroexpand '(my-macro some-name)) ⇒ (define some-name 123) some-name ⇒ 123
2. The or
macro
First, the expected behavior of or
:
- When called with no arguments,
nil
is returned. This is the expected behavior in Scheme and in my Lisp. - Each argument is evaluated in order. If one of them is non-nil, stop evaluating and return it.
- If all arguments are
nil
, thennil
is returned.
2.1. Version 1
(defmacro my-or (&rest exprs) (defun or-lst (expr-list) (if (null? expr-list) nil ;; TODO: Don't overwrite "result", generate unique symbol. (list (list 'lambda (list 'result) (list 'if 'result 'result (or-lst (cdr expr-list)))) (car expr-list)))) (or-lst exprs))
The first version uses an inner or-lst
procedure to allow easier recursion. Note
that this procedure is not defined globally, the scope is restricted to the body
of the macro. Since my-or
uses &rest
, when the macro is called with
(my-or 'A 'B 'C)
, the list (A B C)
is bound to the symbol exprs
. This makes
recursion trickier, because if we call ourselves with (cdr exprs)
, we are not
doing (my-or B C)
, but (my-or '(B C))
, which gets put into another list because
of &rest
. An easier solution for this “problem” is shown on the next section.
First, it checks the base case, we didn’t get any arguments. In that case, nil
is returned.
If we got an argument, the macro will expand to a call to a lambda that receives the evaluated argument. We need to do this to evaluate the expression only once. An incorrect example:
(or A B C) ;; Expanded (incorrectly) into: (if A A (if B B (if C C nil)))
In that example, A
is evaluated once to get the condition and, if the result is
non-nil, A
is evaluated a second time as the consequent. Instead, the correct
approach is something like:
(let ((result A)) (if result result ...))
In that second example, A
is evaluated only once. However, since my Lisp didn’t
have a let
macro at this point, I used an uglier (but equivalent) version which
calls an anonymous lambda:
((lambda (result) (if result result ...)) A)
As you can probably tell, there is a Big Bug™ in the macro, and it will remain throughout all versions. The name “result”, used as the lambda argument, is far from unique, so we might overwrite some user value during this call. In a real implementation, we should use some function like Emacs Lisp’s gensym.
Here are some examples of the macro:
(my-or nil 'A 'B) ⇒ A (macroexpand '(my-or 'A 'B 'C)) ; God almighty... ⇒ ((lambda (result) (if result result ((lambda (result) (if result result ((lambda (result) (if result result nil)) 'C))) 'B))) 'A) (my-or) ⇒ nil (macroexpand '(my-or)) ⇒ nil
Throughout this article I will make some minor changes to the interpreter output
for readability, like formatting the indentation or replacing (quote expr)
with
'expr
.
2.2. Version 2
(defmacro my-or (&rest exprs) (if (null? exprs) nil (list (list 'lambda (list 'result) (list 'if 'result 'result ;; The expansion will call `my-or'. (cons 'my-or (cdr exprs)))) (car exprs))))
This version still uses the lambda call method, but it doesn’t use an inner
or-lst
procedure. Instead, it introduces a call to the macro itself in the
expansion.
We use cons
to append my-or
to the cdr
of the argument list, constructing a
function call. We could also use apply
, but we would have to quote the argument
list to avoid evaluating the cdr
as another function call:
;; Incorrect version, the `cadr' will be interpreted as a function. (list 'apply 'my-or (cdr exprs)) ;; Correct version. (list 'apply 'my-or (list 'quote (cdr exprs))) ;; Which, assumming `exprs' is (A B C), expands to: (apply my-or (quote (B C))) ;; Or alternatively: (apply my-or '(B C))
Some examples of the second version:
(my-or nil 'A 'B) ⇒ A (macroexpand '(my-or 'A 'B 'C)) ; Much more readable ⇒ ((lambda (result) (if result result (my-or ('B 'C)))) 'A)
2.3. Version 3
(defmacro my-or (&rest exprs) (if (null? exprs) nil (list 'let (list (list 'result (car exprs))) (list 'if 'result 'result (cons 'my-or (cdr exprs))))))
Finally, to make the expansion a bit more readable, we can remove that lambda
call by using the let
macro. This version is probably a bit less efficient since
let
also expands to a lambda call, but it’s more readable.
Some examples of the third version:
(my-or nil 'A 'B) ⇒ A (macroexpand '(my-or 'A 'B 'C)) ; Much more readable ⇒ (let ((result 'A)) (if result result (my-or 'B 'C)))
2.4. Version 4
This is not how most Lisp programmers declare macros, though. Similarly to how
the quote
special form can be used for delaying evaluation, the backquote (`
)
can be used to selectively delay evaluation.
For example, in the following code, the elements marked for unquoting (,
) are
evaluated, while the rest of the list remains unchanged.
`(a b ,(list 1 2 3) c d) ⇒ (a b (1 2 3) c d) `(a b (c d ,(null? 789)) e f) ⇒ (a b (c d nil) e f)
Furthermore, the splice operator can be used on an expression that evaluates to
a list to place its contents inside another list. We will use it when calling
our macro recursively, instead of using cons
.
`(a b ,(list 1 2 3) c d) ⇒ (a b (1 2 3) c d) `(a b ,@(list 1 2 3) c d) ⇒ (a b 1 2 3 c d)
With this in mind, we can define the fourth version of our macro.
(defmacro my-or (&rest exprs) (if (null? exprs) nil `(let ((result ,(car exprs))) (if result result (my-or ,@(cdr exprs))))))
Some examples of the fourth version:
(my-or nil 'A 'B) ⇒ A (macroexpand '(my-or 'A 'B 'C)) ⇒ (let ((result 'A)) (if result result (my-or 'B 'C)))
As you can see, it works exactly like the third version, but the definition is more readable.
3. The and
macro
This version of my-and
is based on the fourth version of my-or
, but with some
important differences that will be explained below. Note that this version also
overwrites the result
symbol.
(defmacro my-and (&rest exprs) (cond ((null? exprs) tru) ((null? (cdr exprs)) (car exprs)) (tru `(let ((result ,(car exprs))) (if result (my-and ,@(cdr exprs)) nil)))))
The first difference is that we use cond
instead of if
. This is because we want
to check multiple conditions sequentially, and it’s more readable this way.
The first condition checks if we received no arguments. You might notice that,
when called with no arguments, the function returns tru
instead of nil
. Again,
this is the expected behavior in Scheme and in my Lisp.
The second condition checks if we only received one argument. If we did, then we
simply want to return it, and its evaluation will determine whether it was nil
or not. This is important, since we will call ourselves recursively.
Otherwise, if we got two or more arguments, we will evaluate the first one and,
if it evaluates to non-nil, we will continue checking the rest of the arguments
by calling ourselves recursively. If an argument evaluates to nil
, the call to
my-and
returns nil
.
Some example calls to my-and
.
(my-and) ⇒ tru (my-and 'A 'B 'C) ⇒ C (my-and 'A nil 'C) ⇒ nil
And some macro expansions.
(macroexpand '(my-and 'A 'B 'C)) ⇒ (let ((result 'A)) (if result (my-and 'B 'C) nil)) (macroexpand '(my-and 'A)) ⇒ 'A