Switch statements from a low-level perspective

1. Introduction
2. If-else blocks
3. Switch statements
- 3.1. Simple switch
- 3.2. Jump tables
4. Conclusion

1. Introduction

Since I started learning C, I have heard that switch statements are faster and essentially better than if-else blocks, and that you should use them whenever possible. I always heard this is because the program jumps to the case label directly, without checking all previous possibilities one by one.

This article was motivated because a friend asked me about switch statements, and although I have kind of known the reason why they are better for a long time, I never actually thought about how I would implement the compiler optimizations myself from assembly. I wrote this code, but I decided to comments were big enough to turn them into an article.

I will to try to explain the following in this article:

How a compiler optimizes switch statements, and how they can be faster than multiple if-else blocks.
Why the different cases of the switch statement need to be known at compile time in most programming languages.

Since I am going to explain this from a low-level perspective, you will need a bit of x86 assembly knowledge.

2. If-else blocks

Let’s first have a look at a normal if-else conditional.

if (my_var == 11) {
    /* Block 1 */
} else if (my_var == 12) {
    /* Block 2 */
} else if (my_var == 15) {
    /* Block 3 */
} else {
    /* Block 4 */
}

Keep in mind that this is the same as:

if (my_var == 11) {
    /* Block 1 */
} else {
    if (my_var == 12) {
        /* Block 2 */
    } else {
        if (my_var == 15) {
            /* Block 3 */
        } else {
            /* Block 4 */
        }
    }
}

Now let’s look at the assembly at a compiler (without optimizations) could have generated:

    mov eax, [my_var]

    cmp eax, 11
    jne else_one
    ; Block 1...
    jmp done

else_one:
    cmp eax, 12
    jne else_two
    ; Block 2...
    jmp done

else_two:
    cmp eax, 15
    jne else_three
    ; Block 3...
    jmp done

else_three:
    ; Block 4...

done:
    ; Continue after conditionals...

As expected, the conditions are checked sequentially.

3. Switch statements

3.1. Simple switch

Let’s look at a switch statement equivalent to the previous if conditionals.

switch (my_var) {
    case 11:
        /* Block 1 */
        break;
    case 12:
        /* Block 2 */
        break;
    case 15:
        /* Block 3 */
        break;
    default:
        /* Block 4 */
        break;
}

For this specific switch statement, since it’s pretty small, the compiler might produce something similar to the previous if-else example:

    mov eax, [my_var]

    cmp eax, 11
    je  case_11
    cmp eax, 12
    je  case_12
    cmp eax, 15
    je  case_15
    jmp case_default

case_11:
    ; Block 1...
    jmp done

case_12:
    ; Block 2...
    jmp done

case_15:
    ; Block 3...
    jmp done

case_default:
    ; Block 4...

done:
    ; Continue after conditionals...

However, if there are a lot of switch statements, and they are not too apart from each other, the compiler will probably use a jump table.

3.2. Jump tables

The idea behind the jump table is using the value of my_var as an index in an array of pointers (the jump table). Each element in the jump table will contain the address of a procedure corresponding to the label of the switch.

The assembly needs to do the following:

Define the jump table, in our case in the .data section.
In the location of the switch, somehow convert the value inside my_var to the index of the jump table, and jump to the stored address at that index (more on this below).
After the jump instruction, add a “done” label that each case will use for returning.
Define the case labels somewhere in the .text section, with the blocks that the user defined in the switch. At the end of the blocks, jump to the “done” label that we declared inside the main function.

For example:

section .data
jump_table:
    ; TODO: Handle cases smaller than 11
    dq case_11
    dq case_12
    dq case_default ; case 13, not specified
    dq case_default ; case 14, not specified
    dq case_15
    ; TODO: Handle cases greater than 15

; ------------------------------------------------------------------------------

section .text
my_func:
    ; Code before the switch...

    mov eax, [my_var]

    ; TODO: Get index in the jump table from the value in `eax'
    jmp ???

.switch_done:
    ; Code after the switch...
    ret

; ------------------------------------------------------------------------------

case_11:
    ; Block 1...
    jmp my_func.switch_done

case_12:
    ; Block 2...
    jmp my_func.switch_done

case_15:
    ; Block 3...
    jmp my_func.switch_done

case_default:
    ; Block 4...
    jmp my_func.switch_done

The addresses of all those case_* labels at the bottom will be stored in the jump table inside the .data section. Since this is x86_64 assembly, each element of the jump table is a quad-word because it needs to be able to hold these 64-bit addresses. For 32-bits, we would use dd for declaring a 32-bit double-word.

Note how the third and fourth elements in the jump table correspond to the 13 and 14 values that the user did not specify in the switch statement. Since they are just two, we can fill them with the address of the case_default label.

As you can see, the only thing left is making sure the value is within the first and last element of the array, and then calculating the index in the array from there.

The first part is simple, just compare eax against the value of the lowest and biggest case values. If it’s not within those bounds, jump to the case_default label.

; if (eax < 11 || eax > 15)
;     goto case_default
cmp eax, 11
jl  case_default
cmp eax, 15
jg  case_default

To calculate the index in the array, we simply subtract the value of the lowest case (in this case 11) to the value in my_var. However, we also need to multiply this index by the size of each element in jump_table (in this case 8-byte quad-words) to get the real offset. For example:

; Subtract the lowest case
sub  eax, 11

; Multiply by 8 (quad-word) to get the byte offset
imul eax, eax, 8

; Store the address of the jump table
mov  rdx, jump_table

; Add the byte offset to the address, and dereference it to get the address of
; this `case'
mov  rdx, [rdx]

; Jump to the label itself
jmp  rdx

If you are actually going to try this code, you might want to use lea when loading the address of jump_table. For more information, see my note in Understanding and traversing the call stack.

In this case, I used the imul instruction, which multiplies the second operand by the third operand and saves the result in the first operand. Since we are multiplying by 8 (a power of 2) we could have used a more optimal bit shift:

imul rax, rax, 8
; The same as:
shl  rax, 3

4. Conclusion

The compiler doesn’t always use a jump table for switch statements, because if the values are too separated from each other, the array would be too big and it wouldn’t be worth it. If the jump table is too small, the compiler might also decide to treat the switch as a series of nested if-else blocks, just like we saw before.

With this in mind, it’s more clear why the values in the case statements have to be known at compile time. The compiler needs to know the specific values of all the cases just to decide if it’s worth it to generate a jump table. Then, it also uses them for filling default cases, making sure the value is between the lowest and greatest values, for indexing, etc.

If you reached this far, I hope you learned something. For the full source code, see my scratch repository.