Challenge 5
Table of Contents
1. Original assembly
Original challenge link: Click me.
This is the original function, adapted to fit the nasm syntax.
1: func: 2: cmp rcx, rsi 3: ja .L10 4: sub rsi, rcx 5: add rsi, 1 6: mov r11, rsi 7: je .L10 8: test rcx, rcx 9: jne .L16 10: mov rax, rdi 11: ret 12: .L10: 13: xor eax, eax 14: ret 15: .L16: 16: push rbx 17: xor r10d, r10d 18: mov r9d, 1 19: .L4: 20: lea rax, [rdi + r10] 21: xor esi, esi 22: xor r8d, r8d 23: .L8: 24: movzx ebx, BYTE [rdx + rsi] 25: cmp BYTE [rax + rsi], bl 26: cmovne r8d, r9d 27: add rsi, 1 28: cmp rsi, rcx 29: jne .L8 30: test r8d, r8d 31: je .L12 32: add r10, 1 33: cmp r10, r11 34: jne .L4 35: xor eax, eax 36: .L12: 37: pop rbx 38: ret
2. Assembly notes
Here are some important notes before converting the code to a higher level language.
2.1. First checks
As you can see, the first thing the function does is check if rcx
is greater
than rsi
. If it is, it returns zero. Otherwise, it subtracts rcx
from rsi
, adds
one to rsi
, and uses another je
instruction to branch. There are some key
concepts you need to know to understand this:
- The
je
andjz
instructions are the same. The opcode for bothje rel32
andjz rel32
is0F 84 cd
. - Both of these instructions just check the
ZF
bit inside the EFLAGS register1. - The
cmp
instruction just subtracts the second operand from the first operand, but setting the flags without saving the result in the first operand. - This flag is not only changed by the
cmp
instruction, but also by other instructions likeadd
,sub
, etc.
So in this case, it’s checking if rsi
minus rcx
plus one is zero or not. In
other words, if rcx
is equal to rsi+1
. If it is, it returns zero.
2.2. Function parameters
We can see that the function has 4 parameters because registers rdi
, rsi
, rdx
and rcx
are being used. See challenge 3 for more information on the AMD64 System
V ABI.
Another important thing to note is how we know the types of the function
parameters. We know the first argument (rdi
) is a pointer because right after
.L4
it’s being used for indexing. We can determine the type of the pointer
because:
- It’s not multiplying the index by the size of the elements (e.g.
i*4
for arrays of DWORDs), so we can asume the size of each element is one byte. - Right after
.L8
, you can see it’s moving and comparing byte pointers, so it’s not hard to see that these arechar*
.
2.3. Function return type
We can tell that the function returns a char*
because at line 31 of the assembly
code, it’s returning without changing the value of rax
, and the only place it’s
modified is at line 20, where it loads the address of a character inside the
string.
The other place where we don’t return zero is at line 10, where we return the
first parameter, which as we know is a char*
.
2.4. Constructing the for
loops
We can see that the labels .L4
and .L8
are being used as loops because of the
conditional jumps at lines 29 and 34 of the assembly code. We can get the loop
conditions right before the jumps.
For the outer loop, the condition is:
while (r10 != r11) { // ... }
And for the inner loop, the condition is:
while (rsi != rcx) { // ... }
We can tell that they are actually for
loops because:
- One of the variables used for the loop’s condition is initialized before the
loop starts (before the label used for jumping). In the first loop, the
r10
register is initialized to zero (line 17), and in the inner loop, thersi
register is also initialized to zero (line 21). - Right before checking the conditions, the loops increment those same variables
(
rsi
on line 27 andr10
on line 32).
In my opinion, the best approach when trying to understand a label’s purpose is to look at the places where those labels are being used, and get the loop/conditional information directly from there.
Also note that the rsi
variable is now being used as the iterator for the inner
loop, instead of holding the old result of subtracting the arguments. This old
value was stored in the r11
register on line 6. It’s important to know where the
meaning/usage of the registers change.
2.5. Conditional move
After .L8
, on line 25, it compares rax[rsi]
with rdx[rsi]
, and moves r9d
to r8d
if they are not equal. To do this, it uses the cmovne
instruction:
CMOVcc — Conditional Move
CMOVNE r32, r/m32
: Move if not equal (ZF
= 0).Each of the
CMOVcc
instructions performs a move operation if the status flags in the EFLAGS register (CF
,OF
,PF
,SF
, andZF
) are in a specified state (or condition). […] If the condition is not satisfied, a move is not performed and execution continues with the instruction following theCMOVcc
instruction.
3. C translation
From the previous challenge, I learned that it’s a good approach to first convert the assembly code into a higher level language like C, and try to figure out what the function does from there.
#include <stdint.h> #include <stddef.h> char* func(char* rdi, int rsi, char* rdx, int rcx) { if (rcx > rsi) /* .L10 */ return NULL; rsi = (rsi - rcx) + 1; int r11 = rsi; if (rsi == 0) /* .L10 */ return NULL; if (rcx == 0) return rdi; /* .L16 */ const int r9 = 1; for (int r10 = 0; r10 != r11; r10++) { /* .L4 */ char* rax = &rdi[r10]; int r8 = 0; for (int rsi = 0; rsi != rcx; rsi++) { /* .L8 */ char ebx = rdx[rsi]; if (rax[rsi] != ebx) r8 = r9; } if (r8 == 0) return rax; } return NULL; }
After changing some variable names and simplifying:
#include <stdbool.h> #include <stdint.h> #include <stddef.h> #include <stdio.h> char* func(char* str1, int n1, char* str2, int n2) { if (n2 > n1) /* .L10 */ return NULL; n1 = (n1 - n2) + 1; /* NOTE: `r11` is just used to store the original `rsi` (n1), since `rsi` * it's going to be used as iterator for the second loop (j). Since I create * a `j` variable, I don't need this aux variable. */ if (n1 == 0) /* .L10 */ return NULL; if (n2 == 0) return str1; /* .L16 */ /* NOTE: r9 is only used once and it's value (true) never changes */ for (int i = 0; i != n1; i++) { /* .L4 */ char* substring = &str1[i]; bool failed = false; for (int j = 0; j != n2; j++) { /* .L8 */ if (substring[j] != str2[j]) failed = true; } if (!failed) return substring; } return NULL; } char* result = func("Hello, world!", 13, "world", 5); printf("%p\n\"%s\"\n", result, result); /* "world!" */
As we can see, the function is used to return the first match of str2
inside
str1
. You also need to provide the lengths of the strings.
Note that the n1
and n2
parameters are the lengths of the strings. In the
example we pass 5 instead of 6 (sizeof("world")
) so "world\0"
matches "world!"
.
Footnotes:
See Intel SDM, Vol. 1, Chapter 3.4.3 EFLAGS Register and Vol. 1, Chapter 5.1.7 Control Transfer Instructions.