Optimizing loop conditions in assembly
Table of Contents
1. Introduction
I found a way to improve the logic of my conditional loops in my bf2nasm
project, and I decided to write about it. There is a LaTeX and PDF version of
this article in the same folder as bf2nasm.c
. Since it’s a bit hidden, I decided
to write an article on here as well.
2. Pretested and posttested loops
When I was reading Reversing: Secrets of Reverse Engineering, I found this part specially interesting:
Eldad Eilam. (2005). Reversing: Secrets of Reverse Engineering (pp. 56-57).
The most common high-level loop construct is the pretested loop, where the loop’s condition is tested before the loop’s body is executed. The problem with this construct is that it requires an extra unconditional jump at the end of the loop’s body in order to jump back to the beginning of the loop (for comparison, posttested loops only have a single conditional branch instruction at the end of the loop, which makes them more efficient). Because of this, it is common for optimizers to convert pretested loops to posttested loops. In some cases, this requires the insertion of an if statement before the beginning of the loop, so as to make sure the loop is not entered when its condition isn’t satisfied.
The compiler will sometimes optimize the following pretested loop:
while (a != b) { /* ... */ }
Into this posttested loop inside a conditional.
if (a != b) { do { /* ... */ } while (a != b); }
The assembly for the first loop would be something like:
.loop: cmp A, B je .done ; ... jmp .loop .done:
And the assembly for the second loop could be translated literally to something like:
cmp A, B je .done .loop: ; ... cmp A, B jne .loop .done:
As mentioned in the quote, on the first loop there is a conditional jump and an
unconditional one, while in the second one, the first condition is tested once
before the loop, and the loop only has one conditional jump. If the condition
fails on the second loop, the jump will not be performed and the execution will
continue at the done
label.
3. Brainfuck loops
In brainfuck, loops are defined with square brackets. The code inside the loop will be ran as long as the value at the current cell is not zero. At first, I thought this was a posttested loop, but this is wrong. The loops are pretested, meaning that the loop will be jumped over entirely if the value at the current cell is not zero.
When the program encounters a loop start, the following assembly is generated:
jmp .check_N .loop_N: ; ... .check_N: cmp byte [rcx], 0 jnz .loop_N
Where N is the loop counter.
That way, we even avoid the first comparison from the other example, and we jump directly to the “end” of the loop, where the condition is checked. Now we only make an “extra” unconditional jump once, for checking the condition the first time when entering the loop.
4. Compiler-generated assembly
Some months after writing the first sections of this article, I was messing with
an Emacs package called beardbolt; an Emacs implementation of Godbolt, the
Compiler Explorer. I was looking at some sources, and found an example of the
same jmp
strategy that I used for brainfuck.
Note that there is a bug in the current version of GCC (14.2.1) when generating
assembly with gcc -S -masm=intel
, and the offsets are not shown inside of the
square brackets (-24[rbp]
instead of [rbp-24]
).
As you can see, the first highlighted assembly instruction jumps unconditionally
to .L2
, where the actual comparison is made. In this case it jumps to the start
of the loop body (.L3
) only if [rbp-24]
is not zero.