Reversing a simple keygen with rizin
1. Introduction
I created a simple key-checking program that takes a username and a key, and prints a message telling whether the key is valid. Let’s try to figure out how to make a keygen for it. I also wrote an article explaining how keygens work, and how to make your own. This program does not use the same code, but feel free to check Creating a simple keygen in case you are interested.
2. Trying out the program
First, let’s try running the program to see what it’s asking.
Username: test
Key: test
Invalid key.
From this, we can at least expect:
- A call to something like printf() with "Username:" in the arguments.
- Some call for scanning the username, perhaps getchar(), gets() or scanf().
- Another call to printf() with "Key:" in the arguments.
- Another call for scanning the key.
- Some kind of key validation.
- Depending on the result, a last call to printf() with the text "Invalid key.".
3. The main function
Let’s start by having a look at the main function with rizin 0.6.3 (you can also use cutter or IDA, if you prefer a GUI program).
3.1. Finding the main function
We start by running the aaa rizin command, which is used to “Analyze all calls, references, emulation and apply signatures”.
$ rizin ./my_keygen
[0x00001090]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls
[x] Analyze len bytes of instructions for references
[x] Check for classes
[x] Analyze local variables and arguments
[x] Type matching analysis for all functions
[x] Applied 0 FLIRT signatures via sigdb
[x] Propagate noreturn information
[x] Resolve pointers to data sections
[x] Use -AA or aaaa to perform additional experimental analysis.
We can list the symbols (flags) with fl.
Note: With command~text, we can filter rizin’s output.
[0x00001090]> fl~main
0x0000140d 201 main
0x00003fc0 8 __libc_start_main
0x000041d0 8 __libc_start_main
Normally, rizin is able to find the main function. If this were not the case, we would have to look at the address being loaded into rdi from entry0. To print the disassembly of entry0, we can use the pdf command.
Note: Using command@location executes the command at the specified position. In this case, rizin has already placed us in entry0, so this is not necessary. We could also use s location (seek) and then command.
[0x00001090]> pdf @ entry0
entry0        ;-- section..text:
entry0        entry0(int64_t arg3);
entry0        ; arg int64_t arg3 @ rdx
entry0        31 ed                 xor ebp, ebp ; [13] -r-x section size 1492 named .text
entry0+0x2    49 89 d1              mov r9, rdx ; arg3
entry0+0x5    5e                    pop rsi
entry0+0x6    48 89 e2              mov rdx, rsp
entry0+0x9    48 83 e4 f0           and rsp, 0xfffffffffffffff0
entry0+0xd    50                    push rax
entry0+0xe    54                    push rsp
entry0+0xf    45 31 c0              xor r8d, r8d
entry0+0x12   31 c9                 xor ecx, ecx
entry0+0x14   48 8d 3d 4e 03 00 00  lea rdi, data.0000140d ; 0x140d
entry0+0x1f   ff 15 0b 2f 00 00     call qword [rip + __libc_start_main] ; [reloc.__libc_start_main:8]=0x41d0 reloc.target.__libc_start_main
The address being loaded into rdi at entry0+0x14 should be our main function (0x140d). If rizin had not recognized it, you could then name it “main” with f+ main 0xc8 @ 0x140d.
3.2. Disassembling the main function
Let’s have a look at the disassembly of the main function.
Note: If for some reason main is not recognized as a function, you would need to use pd 100 @ location instead, where 100 is the number of instructions to disassemble.
[0x00001090]> pdf @ main
     ; DATA XREF from entry0 @ 0x10a8
     main       int main(int argc, char **argv, char **envp);
     main       ; var uint64_t var_9h @ stack - 0x9
     main       55                    push rbp
     main+0x1   48 89 e5              mov rbp, rsp
     main+0x4   48 83 ec 10           sub rsp, 0x10
     main+0x8   48 8d 05 f1 0b 00 00  lea rax, [rip + str.Username:] ; 0x2009
     main+0xf   48 89 c7              mov rdi, rax ; const char *format
     main+0x12  b8 00 00 00 00        mov eax, 0
     main+0x17  e8 3b fc ff ff        call printf ; sym.imp.printf ; int printf(const char *format)
     main+0x1c  48 8d 05 34 2c 00 00  lea rax, [rip + data.00004060] ; 0x4060
     main+0x23  48 89 c6              mov rsi, rax
     main+0x26  48 8d 05 de 0b 00 00  lea rax, [rip + str.255s] ; 0x2014
     main+0x2d  48 89 c7              mov rdi, rax ; const char *format
     main+0x30  b8 00 00 00 00        mov eax, 0
     main+0x35  e8 3d fc ff ff        call __isoc99_scanf ; sym.imp.__isoc99_scanf ; int scanf(const char *format)
     main+0x3a  48 8d 05 d0 0b 00 00  lea rax, [rip + str.Key:] ; 0x201a
     main+0x41  48 89 c7              mov rdi, rax ; const char *format
     main+0x44  b8 00 00 00 00        mov eax, 0
     main+0x49  e8 09 fc ff ff        call printf ; sym.imp.printf ; int printf(const char *format)
     main+0x4e  48 8d 05 02 2d 00 00  lea rax, [rip + data.00004160] ; 0x4160
     main+0x55  48 89 c7              mov rdi, rax ; int64_t arg1
     main+0x58  e8 41 fe ff ff        call fcn.000012a7 ; fcn.000012a7
     main+0x5d  48 8d 05 13 2d 00 00  lea rax, [rip + data.00004180] ; 0x4180
     main+0x64  48 89 c6              mov rsi, rax ; int64_t arg2
     main+0x67  48 8d 05 e9 2b 00 00  lea rax, [rip + data.00004060] ; 0x4060
     main+0x6e  48 89 c7              mov rdi, rax ; const char *arg1
     main+0x71  e8 0a fd ff ff        call fcn.00001189 ; fcn.00001189
     main+0x76  ba 14 00 00 00        mov edx, 0x14 ; size_t n
     main+0x7b  48 8d 05 f5 2c 00 00  lea rax, [rip + data.00004180] ; 0x4180
     main+0x82  48 89 c6              mov rsi, rax ; const void *s2
     main+0x85  48 8d 05 cb 2c 00 00  lea rax, [rip + data.00004160] ; 0x4160
     main+0x8c  48 89 c7              mov rdi, rax ; const void *s1
     main+0x8f  e8 d3 fb ff ff        call memcmp ; sym.imp.memcmp ; int memcmp(const void *s1, const void *s2, size_t n)
     main+0x94  85 c0                 test eax, eax
     main+0x96  0f 94 c0              sete al
     main+0x99  88 45 ff              mov byte [rbp - 1], al
     main+0x9c  80 7d ff 00           cmp byte [rbp - 1], 0
 ┌─< main+0xa0  74 11                 je 0x14bc
 │   main+0xa2  48 8d 05 6e 0b 00 00  lea rax, [rip + str.Correct_key.] ; 0x2020
 │   main+0xa9  48 89 c7              mov rdi, rax ; const char *s
 │   main+0xac  e8 76 fb ff ff        call puts ; sym.imp.puts ; int puts(const char *s)
┌──< main+0xb1  eb 0f                 jmp 0x14cb
│└─> main+0xb3  48 8d 05 6a 0b 00 00  lea rax, [rip + str.Invalid_key.] ; 0x202d
│    main+0xba  48 89 c7              mov rdi, rax ; const char *s
│    main+0xbd  e8 65 fb ff ff        call puts ; sym.imp.puts ; int puts(const char *s)
│    ; CODE XREF from main @ 0x14ba
└──> main+0xc2  b8 00 00 00 00        mov eax, 0
     main+0xc7  c9                    leave
     main+0xc8  c3                    ret
Here we can see that it matches the pattern we expected from running the program.
From main+0x8 to main+0x17, it calls printf("Username: "), and from main+0x1c to main+0x35 it uses scanf("%255s", user) to read the user, where user is at address 0x4060.
Similarly, from main+0x3a to main+0x49 it calls printf("Key: "), but instead of calling scanf(), from main+0x4e to main+0x58 it calls an unknown function at 0x12a7 with 0x4160 as argument. We can safely assume that it reads the user key, so we will rename the function to get_key(), and the parameter to user_key.
From main+0x5d to main+0x71, it calls an unknown function at 0x1189 with the user we got from scanf() and 0x4180 as arguments. This looks very promising, since it will compare this 0x4180 value with user_key right below. For this reason, we will call this function generate_key() and the second parameter at 0x4180, real_key.
From main+0x76 to main+0xbd it calls memcmp(user_key, real_key, 0x14) and prints “Correct key.” or “Invalid key.” (with puts() rather than the printf() we expected) depending on the value returned by memcmp. From this call we also know that the key size should be 0x14 (20).
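The comparison at the end of main can be sketched in C as follows. This is only a reading of the disassembly above, not the original source: the helper name verdict and the KEY_SIZE macro are made up for illustration, and the real program calls puts() directly instead of returning a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define KEY_SIZE 0x14 /* hypothetical name for the size passed to memcmp at main+0x76 */

/* Hypothetical helper: returns the message that main passes to puts(). */
static const char *verdict(const unsigned char *user_key,
                           const unsigned char *real_key)
{
    /* test eax, eax ; sete al ; mov byte [rbp - 1], al */
    bool is_valid = memcmp(user_key, real_key, KEY_SIZE) == 0;

    /* cmp byte [rbp - 1], 0 ; je <invalid branch> */
    if (is_valid)
        return "Correct key.";
    return "Invalid key.";
}
```

Note that sete stores 1 when memcmp returned 0, so the je that follows skips the success branch only when the two buffers differ.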
This is obviously an ideal scenario, since the main function is the one responsible for validating the key. Since this is not normally the case, we could try to look for those success and fail messages in the program’s string list, and check the xrefs to find the key validation function. In this specific program, we could also just patch the bytes to either show us the real key, or change the conditional jump so it always reaches the code that gets executed when the key is correct.
4. Disassembling the key generator
Now that we understand the main logic, let’s have a look at the generate_key() function at 0x1189.
[0x00001090]> pdf @ fcn.00001189
      ; CALL XREF from main @ 0x147a
      fcn.00001189       fcn.00001189(const char *arg1, int64_t arg2);
      fcn.00001189       ; arg const char *arg1 @ rdi
      fcn.00001189       ; arg int64_t arg2 @ rsi
      fcn.00001189       ; var int64_t var_28h @ stack - 0x28
      fcn.00001189       ; var const char *s @ stack - 0x20
      fcn.00001189       ; var int64_t var_15h @ stack - 0x15
      fcn.00001189       ; var int64_t var_14h @ stack - 0x14
      fcn.00001189       ; var int64_t var_10h @ stack - 0x10
      fcn.00001189       ; var int64_t var_ch @ stack - 0xc
      fcn.00001189       55                    push rbp
      fcn.00001189+0x1   48 89 e5              mov rbp, rsp
      fcn.00001189+0x4   48 83 ec 20           sub rsp, 0x20
      fcn.00001189+0x8   48 89 7d e8           mov qword [rbp - 0x18], rdi ; arg1
      fcn.00001189+0xc   48 89 75 e0           mov qword [rbp - 0x20], rsi ; arg2
      fcn.00001189+0x10  48 8b 45 e8           mov rax, qword [rbp - 0x18]
      fcn.00001189+0x14  48 89 c7              mov rdi, rax ; const char *s
      fcn.00001189+0x17  e8 9b fe ff ff        call strlen ; sym.imp.strlen ; size_t strlen(const char *s)
      fcn.00001189+0x1c  89 45 fc              mov dword [rbp - 4], eax
      fcn.00001189+0x1f  c7 45 f4 00 00 00 00  mov dword [rbp - 0xc], 0
      fcn.00001189+0x26  c7 45 f8 00 00 00 00  mov dword [rbp - 8], 0
  ┌─< fcn.00001189+0x2d  e9 8f 00 00 00        jmp 0x124a
 ┌──> fcn.00001189+0x32  8b 45 f4              mov eax, dword [rbp - 0xc]
 ╎│   fcn.00001189+0x35  48 63 d0              movsxd rdx, eax
 ╎│   fcn.00001189+0x38  48 8b 45 e8           mov rax, qword [rbp - 0x18]
 ╎│   fcn.00001189+0x3c  48 01 d0              add rax, rdx
 ╎│   fcn.00001189+0x3f  0f b6 00              movzx eax, byte [rax]
 ╎│   fcn.00001189+0x42  88 45 f3              mov byte [rbp - 0xd], al
 ╎│   fcn.00001189+0x45  0f b6 45 f3           movzx eax, byte [rbp - 0xd]
 ╎│   fcn.00001189+0x49  c1 e0 04              shl eax, 4
 ╎│   fcn.00001189+0x4c  89 c2                 mov edx, eax
 ╎│   fcn.00001189+0x4e  0f b6 45 f3           movzx eax, byte [rbp - 0xd]
 ╎│   fcn.00001189+0x52  c0 e8 04              shr al, 4
 ╎│   fcn.00001189+0x55  09 d0                 or eax, edx
 ╎│   fcn.00001189+0x57  88 45 f3              mov byte [rbp - 0xd], al
 ╎│   fcn.00001189+0x5a  8b 45 f8              mov eax, dword [rbp - 8]
 ╎│   fcn.00001189+0x5d  0f af 45 f4           imul eax, dword [rbp - 0xc]
 ╎│   fcn.00001189+0x61  48 63 d0              movsxd rdx, eax
 ╎│   fcn.00001189+0x64  48 69 d2 81 80 80 80  imul rdx, rdx, 0xffffffff80808081
 ╎│   fcn.00001189+0x6b  48 c1 ea 20           shr rdx, 0x20
 ╎│   fcn.00001189+0x6f  01 c2                 add edx, eax
 ╎│   fcn.00001189+0x71  89 d1                 mov ecx, edx
 ╎│   fcn.00001189+0x73  c1 f9 07              sar ecx, 7
 ╎│   fcn.00001189+0x76  99                    cdq
 ╎│   fcn.00001189+0x77  29 d1                 sub ecx, edx
 ╎│   fcn.00001189+0x79  89 ca                 mov edx, ecx
 ╎│   fcn.00001189+0x7b  c1 e2 08              shl edx, 8
 ╎│   fcn.00001189+0x7e  29 ca                 sub edx, ecx
 ╎│   fcn.00001189+0x80  29 d0                 sub eax, edx
 ╎│   fcn.00001189+0x82  89 c1                 mov ecx, eax
 ╎│   fcn.00001189+0x84  89 c8                 mov eax, ecx
 ╎│   fcn.00001189+0x86  00 45 f3              add byte [rbp - 0xd], al
 ╎│   fcn.00001189+0x89  8b 45 fc              mov eax, dword [rbp - 4]
 ╎│   fcn.00001189+0x8c  89 c2                 mov edx, eax
 ╎│   fcn.00001189+0x8e  0f b6 45 f3           movzx eax, byte [rbp - 0xd]
 ╎│   fcn.00001189+0x92  31 d0                 xor eax, edx
 ╎│   fcn.00001189+0x94  88 45 f3              mov byte [rbp - 0xd], al
 ╎│   fcn.00001189+0x97  8b 45 f8              mov eax, dword [rbp - 8]
 ╎│   fcn.00001189+0x9a  48 63 d0              movsxd rdx, eax
 ╎│   fcn.00001189+0x9d  48 8b 45 e0           mov rax, qword [rbp - 0x20]
 ╎│   fcn.00001189+0xa1  48 01 c2              add rdx, rax
 ╎│   fcn.00001189+0xa4  0f b6 45 f3           movzx eax, byte [rbp - 0xd]
 ╎│   fcn.00001189+0xa8  88 02                 mov byte [rdx], al
 ╎│   fcn.00001189+0xaa  83 45 f4 01           add dword [rbp - 0xc], 1
 ╎│   fcn.00001189+0xae  8b 45 f4              mov eax, dword [rbp - 0xc]
 ╎│   fcn.00001189+0xb1  3b 45 fc              cmp eax, dword [rbp - 4]
┌───< fcn.00001189+0xb4  7c 07                 jl 0x1246
│╎│   fcn.00001189+0xb6  c7 45 f4 00 00 00 00  mov dword [rbp - 0xc], 0
└───> fcn.00001189+0xbd  83 45 f8 01           add dword [rbp - 8], 1
 ╎│   ; CODE XREF from fcn.00001189 @ 0x11b6
 ╎└─> fcn.00001189+0xc1  83 7d f8 13           cmp dword [rbp - 8], 0x13
 └──< fcn.00001189+0xc5  0f 8e 67 ff ff ff     jle 0x11bb
      fcn.00001189+0xcb  90                    nop
      fcn.00001189+0xcc  90                    nop
      fcn.00001189+0xcd  c9                    leave
      fcn.00001189+0xce  c3                    ret
Since we saw how it was called, we can determine the number of parameters and their types. We also know that the first argument is the user, and that it’s calculating the string length once at f+0x17 (where f is short for fcn.00001189).
We also know that the second parameter is a char* because it’s storing rsi in [rbp - 0x20], and from f+0x9d to f+0xa8 it moves that value to rax, adds it to rdx (probably using rdx as an index) and finally writes a byte to the resulting address with byte [rdx].
We can also identify a for loop, since at f+0x26 it sets [rbp - 8] to 0, right before jumping to f+0xc1, where it checks if this value is less than or equal to 0x13 (19) and jumps back to the top. Right before this conditional jump, we can see that the value at [rbp - 8] is increased by one. Note how the value in the loop’s condition matches the one we saw being used as the size parameter when calling memcmp from main (since i<=19 is the same as i<20).
We see some local variable ([rbp - 0xc]) being initialized to 0 in f+0x1f, that will be incremented by one in f+0xaa, and that will be set to zero if it’s greater than or equal to the user length (f+0xaa to f+0xb6). We can determine that this is some kind of index being used for the user string, that will be incremented each iteration unless it’s out of bounds, in which case it will be set back to 0.
From this, we can identify a basic structure:
void generate_key(const char* user, char* real_key) {
    int user_len = strlen(user); // [rbp - 4]
    int user_pos = 0;            // [rbp - 0xc]
    int i;                       // [rbp - 8]

    for (i = 0; i < 20; i++) {
        // TODO

        user_pos++;
        if (user_pos >= user_len)
            user_pos = 0;
    }
}
Let’s take a look at the body of the loop. We can see how it’s loading the user_pos into rdx, and the first argument into rax, before adding them together and storing the dereferenced byte into [rbp - 0xd]. The byte is always loaded with movzx (zero extension), so we will treat it as unsigned.
/*
 * fcn.00001189+0x32  mov eax, dword [rbp - 0xc]
 * fcn.00001189+0x35  movsxd rdx, eax
 * fcn.00001189+0x38  mov rax, qword [rbp - 0x18]
 * fcn.00001189+0x3c  add rax, rdx
 * fcn.00001189+0x3f  movzx eax, byte [rax]
 * fcn.00001189+0x42  mov byte [rbp - 0xd], al
 */
unsigned char c = user[user_pos]; // [rbp - 0xd]
Then, it shifts that value 4 bits to the left, saves the result in edx, shifts the original value 4 bits to the right and ORs them back together. In other words, it swaps the high and low nibbles.
/*
 * fcn.00001189+0x45  movzx eax, byte [rbp - 0xd]
 * fcn.00001189+0x49  shl eax, 4
 * fcn.00001189+0x4c  mov edx, eax
 * fcn.00001189+0x4e  movzx eax, byte [rbp - 0xd]
 * fcn.00001189+0x52  shr al, 4
 * fcn.00001189+0x55  or eax, edx
 * fcn.00001189+0x57  mov byte [rbp - 0xd], al
 */
c = (c << 4) | (c >> 4);
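As a quick sanity check of the nibble swap, here is a minimal standalone sketch. The function name swap_nibbles is made up for illustration, and the choice of uint8_t is an assumption based on the movzx and shr al, 4 instructions, which treat the byte as unsigned.

```c
#include <stdint.h>

/* Hypothetical helper: swaps the high and low nibbles of one byte. */
static uint8_t swap_nibbles(uint8_t c)
{
    /* c is promoted to int, so (c << 4) can exceed 8 bits;
     * the cast back to uint8_t drops the extra high bits,
     * just like the "mov byte [rbp - 0xd], al" store does. */
    return (uint8_t)((c << 4) | (c >> 4));
}
```

For example, the character 'a' (0x61) becomes 0x16. With a signed char and a byte above 0x7f, the right shift would normally drag the sign bit along, which is why the unsigned type matters for non-ASCII usernames.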
Next, it multiplies i by user_pos, and saves it in rdx.
/*
 * fcn.00001189+0x5a  mov eax, dword [rbp - 8]
 * fcn.00001189+0x5d  imul eax, dword [rbp - 0xc]
 * fcn.00001189+0x61  movsxd rdx, eax
 */
int tmp = i * user_pos;
The next part is a bit messy because of compiler optimizations, so you will just have to trust me. Instead of emitting a division, the compiler multiplies by a magic constant and shifts to get the quotient tmp / 0xFF, then subtracts quotient * 0xFF from tmp. The result is tmp % 0xFF, which is added to c.
/*
 * fcn.00001189+0x64  imul rdx, rdx, 0xffffffff80808081
 * fcn.00001189+0x6b  shr rdx, 0x20
 * fcn.00001189+0x6f  add edx, eax
 * fcn.00001189+0x71  mov ecx, edx
 * fcn.00001189+0x73  sar ecx, 7
 * fcn.00001189+0x76  cdq
 * fcn.00001189+0x77  sub ecx, edx
 * fcn.00001189+0x79  mov edx, ecx
 * fcn.00001189+0x7b  shl edx, 8
 * fcn.00001189+0x7e  sub edx, ecx
 * fcn.00001189+0x80  sub eax, edx
 * fcn.00001189+0x82  mov ecx, eax
 * fcn.00001189+0x84  mov eax, ecx
 * fcn.00001189+0x86  add byte [rbp - 0xd], al
 */
c += tmp % 255;
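For readers who would rather check than trust, the instruction sequence can be transcribed almost line by line into C and compared against the % operator. This is a sketch under a few assumptions: the function name mod255_like_asm is made up, and the code relies on the usual GCC/Clang behavior for converting out-of-range values to int32_t and for right-shifting negative numbers (both are implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical transcription of fcn.00001189+0x64 .. +0x80. */
static int32_t mod255_like_asm(int32_t tmp)
{
    /* imul rdx, rdx, 0xffffffff80808081
     * The immediate is sign-extended, so as a signed value it is -0x7f7f7f7f. */
    int64_t wide = (int64_t)tmp * -INT64_C(0x7f7f7f7f);

    /* shr rdx, 0x20 ; add edx, eax */
    int32_t hi  = (int32_t)(uint32_t)((uint64_t)wide >> 32);
    int32_t sum = hi + tmp;

    /* mov ecx, edx ; sar ecx, 7 */
    int32_t quotient = sum >> 7;

    /* cdq ; sub ecx, edx : subtract the sign mask of tmp (0 or -1) */
    quotient -= (tmp < 0) ? -1 : 0;

    /* mov edx, ecx ; shl edx, 8 ; sub edx, ecx : quotient * 255 */
    int32_t multiple = quotient * 256 - quotient;

    /* sub eax, edx : tmp - (tmp / 255) * 255 */
    return tmp - multiple;
}
```

In this program tmp is i * user_pos, which is never negative and stays small (i is at most 19 and the username is at most 255 characters), so comparing mod255_like_asm(x) with x % 255 over that range is enough to confirm the reading.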
Finally, it XORs the length of the user string with c, and writes it to the second parameter.
/*
 * fcn.00001189+0x89  mov eax, dword [rbp - 4]
 * fcn.00001189+0x8c  mov edx, eax
 * fcn.00001189+0x8e  movzx eax, byte [rbp - 0xd]
 * fcn.00001189+0x92  xor eax, edx
 * fcn.00001189+0x94  mov byte [rbp - 0xd], al
 */
c ^= user_len;

/*
 * fcn.00001189+0x97  mov eax, dword [rbp - 8]
 * fcn.00001189+0x9a  movsxd rdx, eax
 * fcn.00001189+0x9d  mov rax, qword [rbp - 0x20]
 * fcn.00001189+0xa1  add rdx, rax
 * fcn.00001189+0xa4  movzx eax, byte [rbp - 0xd]
 * fcn.00001189+0xa8  mov byte [rdx], al
 */
real_key[i] = c;
We already know the rest: incrementing user_pos, making sure we are not reading outside of user, incrementing i and looping until we are done with all characters of real_key.
This is the final key generation function:
void generate_key(const char* user, char* real_key) {
    int user_len = strlen(user); // [rbp - 4]
    int user_pos = 0;            // [rbp - 0xc]
    int i;                       // [rbp - 8]

    for (i = 0; i < 20; i++) {
        unsigned char c = user[user_pos]; // [rbp - 0xd]

        c = (c << 4) | (c >> 4);
        c += (i * user_pos) % 0xFF;
        c ^= user_len;

        real_key[i] = c;

        user_pos++;
        if (user_pos >= user_len)
            user_pos = 0;
    }
}
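With the algorithm recovered, a keygen is just this function plus some way to print the 20 key bytes. The following is a minimal sketch, not the original program. The key_to_hex helper and the KEY_SIZE macro are hypothetical names, and printing the key as hexadecimal is an assumption: we never reversed get_key() at 0x12a7, so the format in which the target expects the key to be typed is still unknown.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define KEY_SIZE 20 /* 0x14, the size passed to memcmp in main */

/* Same steps as the reversed function, using fixed-width unsigned bytes. */
static void generate_key(const char *user, uint8_t *real_key)
{
    int user_len = (int)strlen(user);
    int user_pos = 0;

    for (int i = 0; i < KEY_SIZE; i++) {
        uint8_t c = (uint8_t)user[user_pos];

        c = (uint8_t)((c << 4) | (c >> 4));        /* swap nibbles */
        c = (uint8_t)(c + (i * user_pos) % 0xFF);  /* add (i * user_pos) % 255 */
        c = (uint8_t)(c ^ user_len);               /* XOR with the user length */

        real_key[i] = c;

        /* Cycle through the username, wrapping back to the start. */
        user_pos++;
        if (user_pos >= user_len)
            user_pos = 0;
    }
}

/* Hypothetical helper: writes the key as 40 hex digits plus a terminator.
 * The out buffer must hold at least 2 * KEY_SIZE + 1 bytes. */
static void key_to_hex(const uint8_t *key, char *out)
{
    for (int i = 0; i < KEY_SIZE; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)key[i]);
}
```

A main function would only need to read a username, call generate_key() and key_to_hex(), and print the result. For the one-character username "a", every key byte is 0x17, since user_pos stays at 0 and swap(0x61) ^ 1 is 0x17.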
5. Alternative: Decompiling with IDA or ghidra
This option is not always reliable or available, so it’s important to understand how the actual assembly works before jumping into the decompiler.
There are various decompilers, and everyone has different opinions about which one is the best. For me, even though I would rather use free and open-source tools, I find that the best decompiler is the IDA Pro one. Rizin (and therefore cutter) has its own ghidra plugin made in C++.
I will show a comparison between these two decompilers, but keep in mind that decompiling a single program doesn’t provide nearly enough data to judge the two decompilers.
Note: I formatted both outputs with my clang-format to make the outputs look as similar as possible.
5.1. IDA Pro
This is the C code generated by IDA Pro Version 7.7.220118 Windows x64 (x64 Decompiler Hex-Rays SA 7.7.0.220118).
size_t /* __fastcall */ sub_1189(const char* a1, __int64 a2) {
    size_t result; // rax
    int v3;        // [rsp+14h] [rbp-Ch]
    int i;         // [rsp+18h] [rbp-8h]
    int v5;        // [rsp+1Ch] [rbp-4h]

    result = strlen(a1);
    v5 = result;
    v3 = 0;
    for (i = 0; i <= 19; ++i) {
        *(a2 + i) = v5 ^ (v3 * i % 255 + ((16 * a1[v3]) | (a1[v3] >> 4)));
        result = ++v3;
        if (v3 >= v5)
            v3 = 0;
    }
    return result;
}
Note how IDA decides to translate (x << 4) to (x * 16), since they are equivalent and the second is more likely to be used.
5.2. Rizin’s version of ghidra
This is the C code generated by rizin 0.6.1 @ linux-x86-64.
[0x00001090]> pdg @ fcn.00001189
// WARNING: Variable defined which should be unmapped: var_ch
// WARNING: Could not reconcile some variable overlaps
// WARNING: [rz-ghidra] Detected overlap for variable var_10h
// WARNING: [rz-ghidra] Detected overlap for variable var_15h

void fcn.00001189(char* arg1, int64_t arg2) {
    int32_t iVar1;
    int64_t var_28h;
    char* s;
    int64_t var_14h;
    int64_t var_ch;

    iVar1 = strlen(arg1);
    var_14h._0_4_ = 0;
    for (var_14h._4_4_ = 0; var_14h._4_4_ < 0x14; var_14h._4_4_ = var_14h._4_4_ + 1) {
        *(var_14h._4_4_ + arg2) = (arg1[var_14h] >> 4 | arg1[var_14h] << 4) + (var_14h._4_4_ * var_14h) + ((var_14h._4_4_ * var_14h) / 0xff) ^ iVar1;
        var_14h._0_4_ = var_14h + 1;
        if (iVar1 <= var_14h) {
            var_14h._0_4_ = 0;
        }
    }
    return;
}
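I am not sure why it’s throwing all those warnings, and why it’s using var_14h all the time instead of using var_ch, for example. A likely explanation for the second part is that the analysis typed the local at stack - 0x14 as a single int64_t, so the two adjacent 4-byte locals (user_pos and i) show up as its low and high halves, var_14h._0_4_ and var_14h._4_4_.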
I manually removed the type casts from rizin’s output since I disabled them for IDA.
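One difference between the two outputs is worth a closer look: where IDA shows v3 * i % 255, the rz-ghidra output shows x + x / 0xff, with x being the same product. My reading, which is an assumption since the casts were removed, is that both are the same once the result is truncated to a byte: x % 255 equals x - 255 * (x / 255), and -255 is congruent to 1 modulo 256. The sketch below, with made-up function names, lets you check that for the values this program can produce.

```c
#include <stdint.h>

/* Byte added to c as IDA shows it: x % 255. */
static uint8_t addend_mod(int x)
{
    return (uint8_t)(x % 255);
}

/* Byte added to c as rz-ghidra shows it: x + x / 0xff, truncated to 8 bits. */
static uint8_t addend_div(int x)
{
    return (uint8_t)(x + x / 0xff);
}
```

Both helpers return the same byte for every non-negative x, so the two decompilers agree on the key even though the expressions look different.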