Challenge 2
Table of Contents
1. Original assembly
Original challenge link: Click me.
This is the original assembly code from the challenge, I just formatted it a bit to fit the nasm syntax.
func: mov eax, [esp + 0x4] bswap eax mov edx, eax and eax, 0x0F0F0F0F and edx, 0xF0F0F0F0 shr edx, 4 shl eax, 4 or eax, edx mov edx, eax and eax, 0x33333333 and edx, 0xCCCCCCCC shr edx, 2 shl eax, 2 or eax, edx mov edx, eax and eax, 0x55555555 and edx, 0xAAAAAAAA add eax, eax shr edx, 1 or eax, edx ret
2. Analysis
Let’s try to figure this out step by step.
First of all, it loads the function argument into eax
. Since this program was
compiled with -m32
, it follows the i386 System V ABI.
i386 System V ABI - Registers and the Stack Frame (Page 37)
Argument words are pushed onto the stack in reverse order (that is, the rightmost argument in C call syntax has the highest address), preserving the stack’s word alignment. All incoming arguments appear on the stack, residing in the stack frame of the caller.
The caller, after pushing the function parameters, pushes the return address to
the stack with the call
instruction, taking up 4 bytes. We can add these 4 bytes
to the esp
register to skip over the return address and get the parameters.
mov eax, [esp + 0x4]
Note that we are not using the ebp
register to preserve the stack frame. If the
function started like this, we would also have to “skip” over the pushed ebp
to
access the parameters:
func: push ebp ; Preserve caller's stack frame (pushing another 4 bytes) mov ebp, esp ; Store current stack frame mov eax, [esp + 0x8] ; Skip over pushed EBP and the return address (4 + 4) ... mov esp, ebp ; Restore stack frame pop ebp ; Pop caller's stack frame ret ; Next value on the stack is the return address, popped by `ret`
Next, it uses bswap
to reverse the bytes (not bits) in the eax
register. From
felixcloutier.com:
BSWAP — Byte Swap
Reverses the byte order of a 32-bit […] register. This instruction is provided for converting little-endian values to big-endian format and vice versa.
So it will convert 0x11223344
to 0x44332211
.
bswap eax
Next, it copies eax
to edx
, and masks eax
with 0x0f0f0f0f
and edx
with
0xf0f0f0f0
. This will store the higher half of the bytes inside eax
, and the
lower part in edx
.
mov edx, eax and eax, 0x0F0F0F0F and edx, 0xF0F0F0F0
After that, it shifts eax
4 bits to the left, edx
4 bits to the right, and OR’s
them back together.
shr edx, 4 shl eax, 4 or eax, edx
Let’s try to visualize these last steps with an example:
Base number: 0xDEADBEEF 0b11011110101011011011111011101111 After masking: 0x0E0D0E0F 0b00001110000011010000111000001111 (eax) 0xD0A0B0E0 0b11010000101000001011000011100000 (edx) After shifting: 0xE0D0E0F0 0b11100000110100001110000011110000 (eax) 0x0D0A0B0E 0b00001101000010100000101100001110 (edx) After merging: 0xEDDAEBFE 0b11101101110110101110101111111110
As you can see, every nibble (group of 4 bits) has been swapped with the adjacent one.
Right after that, we copy the result back to edx
, and we do a similar
operation. In this case, however, we will be swapping bit pairs, instead of
nibbles:
mov edx, eax and eax, 0x33333333 ; 0b00110011001100110011001100110011 and edx, 0xCCCCCCCC ; 0b11001100110011001100110011001100 shr edx, 2 shl eax, 2 or eax, edx
Bits 0 and 1 have been swapped with bits 2 and 3, bits 4 and 5 have been swapped with bits 6 and 7, etc.
And finally, it does a similar operation with every bit:
mov edx, eax and eax, 0x55555555 ; 0b01010101010101010101010101010101 and edx, 0xAAAAAAAA ; 0b10101010101010101010101010101010 add eax, eax shr edx, 1 or eax, edx ret
Note that adding a number to itself (multiplying by two) is the same as shifting to the left. The previous code could have been written as:
... and eax, 0x55555555 and edx, 0xAAAAAAAA shl eax, 1 shr edx, 1 ...
Which matches the previous operations.
The final OR’d result will be saved to eax
and returned.
3. C translation
This is a simple C translation of the assembly:
#include <stdint.h> #include <byteswap.h> #include <stdio.h> uint32_t swap(uint32_t num) { num = bswap_32(num); num = ((num & 0x0F0F0F0F) << 4) | ((num & 0xF0F0F0F0) >> 4); num = ((num & 0x33333333) << 2) | ((num & 0xCCCCCCCC) >> 2); num = ((num & 0x55555555) << 1) | ((num & 0xAAAAAAAA) >> 1); return num; } #define PRINT_EXPR(E) printf("%s 0x%X\n", #E, E) int main(void) { PRINT_EXPR(swap(0xDEADBEEF)); /* 0xF77DB57B */ PRINT_EXPR(swap(swap(0xDEADBEEF))); /* 0xDEADBEEF */ return 0; }
4. Inverse function
I made my own inverse function before realizing that the original swapping function is actually an involutory function, so using it on its own output produces the original input.
The bit swapping sections are the same, just in a different order.
inverse: mov eax, [esp + 0x4] mov edx, eax and eax, 0x55555555 and edx, 0xAAAAAAAA shl eax, 1 shr edx, 1 or eax, edx mov edx, eax and eax, 0x33333333 and edx, 0xCCCCCCCC shl eax, 2 shr edx, 2 or eax, edx mov edx, eax and eax, 0x0F0F0F0F and edx, 0xF0F0F0F0 shl eax, 4 shr edx, 4 or eax, edx bswap eax ret