Optimizing loop conditions in assembly

1. Introduction
2. Pretested and posttested loops
3. Brainfuck loops
4. Compiler-generated assembly

1. Introduction

I found a way to improve the logic of my conditional loops in my bf2nasm project, and I decided to write about it. There is a LaTeX and PDF version of this article in the same folder as bf2nasm.c. Since it’s a bit hidden, I decided to write an article on here as well.

2. Pretested and posttested loops

When I was reading Reversing: Secrets of Reverse Engineering, I found this part specially interesting:

Eldad Eilam. (2005). Reversing: Secrets of Reverse Engineering (pp. 56-57).

The most common high-level loop construct is the pretested loop, where the loop’s condition is tested before the loop’s body is executed. The problem with this construct is that it requires an extra unconditional jump at the end of the loop’s body in order to jump back to the beginning of the loop (for comparison, posttested loops only have a single conditional branch instruction at the end of the loop, which makes them more efficient). Because of this, it is common for optimizers to convert pretested loops to posttested loops. In some cases, this requires the insertion of an if statement before the beginning of the loop, so as to make sure the loop is not entered when its condition isn’t satisfied.

The compiler will sometimes optimize the following pretested loop:

while (a != b) {
    /* ... */
}

Into this posttested loop inside a conditional.

if (a != b) {
    do {
        /* ... */
    } while (a != b);
}

The assembly for the first loop would be something like:

.loop:
    cmp A, B
    je .done
    ; ...
    jmp .loop
.done:

And the assembly for the second loop could be translated literally to something like:

    cmp A, B
    je .done
.loop:
    ; ...
    cmp A, B
    jne .loop
.done:

As mentioned in the quote, on the first loop there is a conditional jump and an unconditional one, while in the second one, the first condition is tested once before the loop, and the loop only has one conditional jump. If the condition fails on the second loop, the jump will not be performed and the execution will continue at the done label.

3. Brainfuck loops

In brainfuck, loops are defined with square brackets. The code inside the loop will be ran as long as the value at the current cell is not zero. At first, I thought this was a posttested loop, but this is wrong. The loops are pretested, meaning that the loop will be jumped over entirely if the value at the current cell is not zero.

When the program encounters a loop start, the following assembly is generated:

    jmp .check_N
.loop_N:
    ; ...
.check_N:
    cmp byte [rcx], 0
    jnz .loop_N

Where N is the loop counter.

That way, we even avoid the first comparison from the other example, and we jump directly to the “end” of the loop, where the condition is checked. Now we only make an “extra” unconditional jump once, for checking the condition the first time when entering the loop.

4. Compiler-generated assembly

Some months after writing the first sections of this article, I was messing with an Emacs package called beardbolt; an Emacs implementation of Godbolt, the Compiler Explorer. I was looking at some sources, and found an example of the same jmp strategy that I used for brainfuck.

Note that there is a bug in the current version of GCC (14.2.1) when generating assembly with gcc -S -masm=intel, and the offsets are not shown inside of the square brackets (-24[rbp] instead of [rbp-24]).

As you can see, the first highlighted assembly instruction jumps unconditionally to .L2, where the actual comparison is made. In this case it jumps to the start of the loop body (.L3) only if [rbp-24] is not zero.

Optimizing loop conditions in assembly

Table of Contents

1. Introduction

2. Pretested and posttested loops

3. Brainfuck loops

4. Compiler-generated assembly