Overview
Challenge | Difficulty | Points | Category | Flag |
---|---|---|---|---|
Reverse100-1 | Easy | 100 | reversing | poctf{uwsp_1n_w1n3_7h3r3_15_7ru7h} |
Reverse100-2 | Easy | 100 | reversing | poctf{uwsp_d0_0r_d0_n07} |
Reverse100-3 | Easy | 100 | reversing | poctf{uwsp_br3v17y_15_7h3_50u1} |
Reverse200-1 | Easy | 200 | reversing | poctf{uwsp_4_7h1n6_0f_b34u7y} |
Reverse200-2 | Easy | 200 | reversing | poctf{uwsp_7h3_n16h7_15_d4rk} |
Reverse200-3 | Easy | 200 | reversing | poctf{uwsp_1_4m_7h3_0c34n} |
Reverse300-1 | Easy | 300 | reversing | poctf{uwsp_7h3_w0rld_15_4_57463} |
Reverse300-2 | Medium | 300 | reversing | poctf{uwsp_4b4nd0n_4ll_h0p3} |
Reverse300-3 | Medium | 300 | reversing | poctf{uwsp_7h3_g4m3_15_4f007} |
Reverse400-1 | Hard | 400 | reversing | poctf{uwsp_4ll_7h47_gl1773r5} |
Reverse100-1
Highlighted techniques
- how to patch files in ghidra
- how to retype strings to make them readable in ghidra
- small gdb tutorial on how to set a breakpoint and skip the function with jump
Learning the game
We are presented with a file called Reverse100-1. I run the file command against it:
Reverse100-1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9604fd8c1b649b4686112951a38c3b7280449fc5, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse100-1
Encoded flag: ΑΑ̠ȗ̠̍ʠȍȗ
The program seems to print the flag after applying some sort of transformation to it. Time to put on the gloves.
Playing the game
Alright, time to get a bit more serious. I’ll be showcasing two ways to solve this: static analysis and dynamic analysis.
For the sake of understanding we’ll start by looking at the program through the eyes of the dragon (ghidra, you guessed correctly).
We saw from the file command that this program is not stripped, so ghidra should have plenty of information to work with.
decompiled “main” function
undefined8 main(void)
{
  undefined8 local_38;
  undefined8 local_30;
  undefined8 local_28;
  undefined7 local_20;
  undefined4 uStack_19;

  local_38 = 0x77757b6674636f70;
  local_30 = 0x31775f6e315f7073;
  local_28 = 0x33723368375f336e;
  local_20 = 0x7572375f35315f;
  uStack_19 = 0x7d6837;
  obfuscate(&local_38);
  printf("Encoded flag: %s\n",&local_38);
  return 0;
}
Hmm… Okay, so let’s break it down:
- first of all we can see a bunch of weird variables with unrecognized types being defined, we’ll come back to them later
- after that we have a call to obfuscate() with a pointer to local_38 (one of our weird variables) being passed as an argument
- then we have the following call: printf("Encoded flag: %s\n",&local_38);
So we can safely guess that these last two calls obfuscate/encode the flag and then print it. The obfuscation is performed in place, since both the obfuscation function and the print function receive the same pointer.
Let’s solve the mystery of those weird variables. You may have already guessed that they are actually meant to be a single variable: a string, the flag. Why does ghidra show us a long string like this? Well, because the disassembly looks like this:
As you can see there are a bunch of weird values being moved into registers, ghidra sees each of these as a new variable, but we can solve that.
How to spoon feed a string to Ghidra
- Right click the first variable generated and look for the “Retype Variable” option (or do Ctrl+L)
- Retype it to char[n], where n is the number of characters in the desired string. To get that you can add up how many bytes each of the variables you want to merge is using, but in this case, since this string is our only variable, we can check the function prologue to know how many bytes we need (remember that values in the disassembly are in hex)
It’s not perfect but a lot more readable than before, with this we could already reconstruct the entire flag by hand.
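If rebuilding the string by hand feels error-prone, the same reconstruction can be sketched in a few lines of Python. The constants and their widths are copied from the decompiler output above (undefined8 is 8 bytes, undefined7 is 7, undefined4 is 4); the names CHUNKS and rebuild_string are made up for this example. x86-64 is little-endian, so the lowest byte of each constant is the first character.

```python
# (value, width in bytes) for each stack variable, in the order main() stores them.
CHUNKS = [
    (0x77757b6674636f70, 8),  # local_38
    (0x31775f6e315f7073, 8),  # local_30
    (0x33723368375f336e, 8),  # local_28
    (0x7572375f35315f, 7),    # local_20
    (0x7d6837, 4),            # uStack_19 (its top byte is the terminating NUL)
]


def rebuild_string(chunks):
    """Lay the constants out in memory order and cut at the first NUL byte."""
    raw = b"".join(value.to_bytes(width, "little") for value, width in chunks)
    return raw.split(b"\x00", 1)[0].decode("ascii")


flag = rebuild_string(CHUNKS)
print(flag)
```

The widths matter: treating local_20 as 8 bytes would insert a stray NUL in the middle of the string and cut the flag short.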
What if I don’t want to rebuild the flag like a LEGO?
Okay then you could create a script for ghidra that gets the variable and reconstructs it… Okay let’s just see how to patch the program to make the obfuscation never happen.
- We go to the call to obfuscate in the disassembly and look for the Patch Instruction option (or use Ctrl+Shift+G)
- We’ll patch it so it becomes a NOP instruction. I don’t know if there’s a NOP instruction that is long enough, but I just added two of them
After this we can export the program and run it.
- press O to export the program, select “Original File”, change its name and then click “Ok”
- after this you can run your patched file.
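The same patch can be sketched outside of ghidra. A near call with a 32-bit displacement is five bytes long (opcode 0xE8 plus four displacement bytes), so whatever NOP encodings are used, they have to cover five bytes in total. The helper name nop_out_call and the demo bytes below are made up for illustration; for the real binary you would need the file offset of the call, which this write-up does not list.

```python
NOP = 0x90         # single-byte x86 NOP
CALL_REL32 = 0xE8  # opcode of "call rel32"
CALL_LENGTH = 5    # opcode + 4-byte displacement


def nop_out_call(image: bytes, offset: int) -> bytes:
    """Return a copy of image with the 5-byte call at offset replaced by NOPs."""
    if image[offset] != CALL_REL32:
        raise ValueError(f"no call rel32 at offset {offset:#x}")
    patched = bytearray(image)
    patched[offset:offset + CALL_LENGTH] = bytes([NOP]) * CALL_LENGTH
    return bytes(patched)


# Synthetic demo bytes (not taken from the challenge binary):
# mov rdi, rax ; call <somewhere> ; mov rsi, rax
demo = bytes.fromhex("4889c7" "e8aabbccdd" "4889c6")
patched = nop_out_call(demo, 3)
print(patched.hex())
```

Checking the opcode before patching is a cheap guard against writing over the wrong bytes when the offset is off by a few.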
Extra: How to do it all from the terminal and feel like a superhero
Lastly I’ll show you how to solve this from a Linux terminal (bash) using gdb (dynamic analysis).
- run the program with gdb
gdb Reverse100-1
- set a breakpoint in the obfuscate function (we can do this since the program has symbols left on it)
- Run the program and let it hit the breakpoint, let them come to us.
- Now, enter the matrix by enabling the disassembly layout
layout asm
- From here we can see everything, and by everything I mean that we can see the instruction we are about to execute and what follows. Let’s show a bit more of our power and jump straight to the epilogue of the function, ignoring the rest of the instructions.
BOOM!!
Got ‘em, they don’t even know what hit them ;)
Flag
poctf{uwsp_1n_w1n3_7h3r3_15_7ru7h}
Reverse100-2
Highlighted techniques
- symbolic execution with angr
Learning the game
We are presented with a file called Reverse100-2. I run the file command against it:
Reverse100-2: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=48d49fdc0741aa19b895d5fec898bd484d2eb49e, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse100-2
Enter the password: asdasd
Access denied!
The program asks for a password and then displays the Access denied! text.
Playing the game
Okay, so this structure of challenge tempts me to go for an easy angr solve, but when I looked at the output of the strings command for the target text to match on stdout, I saw this:
...
poctf{uwH
sp_d0_0rH
p_d0_0r_H
d0_n07}
Enter the password:
%99s
Access granted!
Access denied!
...
Sooo, the flag seems to already be there. This is that string joined into a single line:
poctf{uwsp_d0_0rp_d0_0r_d0_n07}
Hm, looks kinda weird, we can probably guess the flag from here but let’s just do a simple angr script to make sure.
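import angr

# auto_load_libs=False skips loading the shared libraries; angr stubs the library calls instead
project = angr.Project("Reverse100-2", auto_load_libs=False)
simgr = project.factory.simgr()
# Explore until some state has printed "granted" on stdout (fd 1),
# then dump what that state read from stdin (fd 0)
print(simgr.explore(find=lambda state: b"granted" in state.posix.dumps(1)).found[0].posix.dumps(0))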
b'poctf{uwsp_d0_0r_d0_n07}\x00\x00\x00\x00...\x00'
It works!! (with some extra trailing \x00)
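The doubled p_d0_0r run in the strings output can also be explained without angr. A likely reading is that the compiler builds the string with 8-byte stores that partly overlap, and that the trailing H on each line is the first byte (0x48) of the next instruction rather than part of the flag. Under that assumption, a small sketch that drops the H and merges each chunk with the longest overlap it shares with the text so far recovers the same flag. The function name merge_overlapping is made up for this example, and the heuristic would be fooled by chunks that overlap by coincidence.

```python
RAW_CHUNKS = ["poctf{uwH", "sp_d0_0rH", "p_d0_0r_H", "d0_n07}"]


def strip_opcode_byte(chunk: str) -> str:
    """Drop the trailing 'H' that follows each full 8-character immediate."""
    return chunk[:-1] if len(chunk) == 9 and chunk.endswith("H") else chunk


def merge_overlapping(chunks):
    """Append each chunk, skipping the longest prefix already present at the end."""
    merged = ""
    for chunk in chunks:
        overlap = 0
        for size in range(min(len(merged), len(chunk)), 0, -1):
            if merged.endswith(chunk[:size]):
                overlap = size
                break
        merged += chunk[overlap:]
    return merged


cleaned = [strip_opcode_byte(chunk) for chunk in RAW_CHUNKS]
print(merge_overlapping(cleaned))
```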
Flag
poctf{uwsp_d0_0r_d0_n07}
Reverse100-3
Learning the game
We are presented with a file called Reverse100-3. I run the file command against it:
Reverse100-3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a372e4d793f897e5a6c3382036f11a4a12f029a2, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse100-3
Encoded flag: Flag after reverse step 0: 8e79a99cacd5c5c7917aa58ab88dc6815583a5597bb987b851697b58bb8bcd
Decode function not added yet!Decoded flag (plaintext in hex): Flag after reverse step 0: 8e79a99cacd5c5c7917aa58ab88dc6815583a5597bb987b851697b58bb8bcd
We are also provided with the following source code
#include <stdio.h>
#include <string.h>

// Convert each byte of the flag to hex and print it
void print_flag_hex(unsigned char *flag, int length, int step) {
    printf("Flag after reverse step %d: ", step);
    for (int i = 0; i < length; i++) {
        printf("%02x", flag[i]); // Print each byte in hexadecimal
    }
    printf("\n");
}

// Reverse the modification of the flag bytes based on the seed
void reverse_modify_flag(unsigned char *flag, unsigned int seed) {
    int length = strlen((char *)flag);
    for (int i = 0; i < length; i++) {
        flag[i] = (flag[i] - (seed % 10)) % 256; // Reverse each byte modification
        seed = seed / 10;
        if (seed == 0) {
            seed = 88974713; // Reset seed if it runs out
        }
    }
}

int main() {
    unsigned char encoded_flag[] = { 0x8e, 0x79, 0xa9, 0x9c, 0xac, 0xd5, 0xc5, 0xc7, 0x91, 0x7a, 0xa5, 0x8a, 0xb8, 0x8d, 0xc6, 0x81, 0x55, 0x83, 0xa5, 0x59, 0x7b, 0xb9, 0x87, 0xb8, 0x51, 0x69, 0x7b, 0x58, 0xbb, 0x8b, 0xcd};
    unsigned int seed = 88974713;
    int length = sizeof(encoded_flag);
    printf("Encoded flag: ");
    print_flag_hex(encoded_flag, length, 0);
    // Reverse the modifications 10 times (finish this!)
    printf("Decode function not added yet!");
    printf("Decoded flag (plaintext in hex): ");
    print_flag_hex(encoded_flag, length, 0); // Print final decoded flag in hex
    return 0;
}
Playing the game
Seems like we just need to modify the source code to call the reversing function in a loop
...
int main() {
    unsigned char encoded_flag[] = { 0x8e, 0x79, 0xa9, 0x9c, 0xac, 0xd5, 0xc5, 0xc7, 0x91, 0x7a, 0xa5, 0x8a, 0xb8, 0x8d, 0xc6, 0x81, 0x55, 0x83, 0xa5, 0x59, 0x7b, 0xb9, 0x87, 0xb8, 0x51, 0x69, 0x7b, 0x58, 0xbb, 0x8b, 0xcd};
    unsigned int seed = 88974713;
    int length = sizeof(encoded_flag);
    printf("Encoded flag: ");
    print_flag_hex(encoded_flag, length, 0);
    // Reverse the modifications 10 times
    for (int i = 0; i < 10; i++) {
        reverse_modify_flag(encoded_flag, seed);
    }
    printf("Decoded flag (plaintext in hex): ");
    print_flag_hex(encoded_flag, length, 10); // Print final decoded flag in hex
    return 0;
}
We compile and run this…
##.\a.exe
Encoded flag: Flag after reverse step 0: 8e79a99cacd5c5c7917aa58ab88dc6815583a5597bb987b851697b58bb8bcd
Decoded flag (plaintext in hex): Flag after reverse step 10: 706f6374667b757773705f627233763137795f31355f3768335f353075317dcf00000079a54d050a0000008000400010ff610054ff610088124000010000008815c1008822c100fdffffff020000000000000054ff6100cd88c5750070360084ff6100f512400001000000000000000000000000000000000000000000000000000000a97bd97500703600907bd975dcff6100cbc0cd770070360081430b0d000000000000000000703600000000000000000000000000000000000000000000000000000000000000000000000000
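Only the first 31 bytes (62 hex digits) belong to the flag, since the array is 31 bytes long; everything after that is whatever happened to sit in memory behind it. Copy those hex digits into CyberChef (From Hex) to get a string.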
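As a cross-check that does not depend on how far strlen() wanders past the unterminated C array, here is a sketch of the same decoding in Python. The byte values and the seed are copied from the provided source; the function name reverse_modify is made up for this example. Working on an explicit 31-byte buffer also means there is no trailing garbage to trim.

```python
ENCODED = bytes.fromhex(
    "8e79a99cacd5c5c7917aa58ab88dc681"
    "5583a5597bb987b851697b58bb8bcd"
)
SEED = 88974713


def reverse_modify(data: bytes, seed: int) -> bytes:
    """One pass of reverse_modify_flag(): subtract successive decimal digits of the seed."""
    out = bytearray(data)
    current = seed
    for i in range(len(out)):
        out[i] = (out[i] - current % 10) % 256
        current //= 10
        if current == 0:
            current = seed  # the C code resets to the same constant
    return bytes(out)


decoded = ENCODED
for _ in range(10):  # the comment in the source says "10 times"
    decoded = reverse_modify(decoded, SEED)

print(decoded.decode("ascii"))
```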
Flag
poctf{uwsp_br3v17y_15_7h3_50u1}
Reverse200-1
Highlighted techniques
- ghidralib for semi-automatic string extraction
Learning the game
We are presented with a file called Reverse. I run the file command against it:
Reverse200-1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=ee89e6f8d8bc723c2eabc56f150f344af85be5f3, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse200-1
Obfuscated Flag (Hex): 73 0b 22 21 1e 1c 11 22 21 73
seems to print the obfuscated flag
Playing the game
I opened the file in ghidra and modified it a bit to make it more readable.
The actual flag seems to be redacted, but there’s a call to an obfuscate() function and then every byte of the string is printed in hex.
This is the code of the obfuscate() function:
There’s also a deobfuscate() function, but it seems to not do anything.
Reading the assembly for this function we can see that it simply builds a string in a variable and then never uses it, which is why the decompiler does not produce any code for it.
Let’s first write the inverse of the obfuscate function and then come back to what the flag is. The obfuscate() function simply XORs each byte and adds a constant to the result, so we reverse that by subtracting the same constant and then performing the same XOR.
def main():
    # Hex dump of the obfuscated flag, two hex digits per byte (filled in below)
    encoded_flag = ""
    decoded_flag = ""
    for char_i in range(0, len(encoded_flag), 2):
        # Slicing never raises IndexError, so the last pair needs no special case
        current_char = encoded_flag[char_i: char_i + 2]
        # Undo the "+ 3" first, then the XOR with 90 (0x5a); the mask keeps the value in byte range
        decoded_char = chr(((int(current_char, 16) - 0x03) & 0xff) ^ 90)
        print(decoded_char)
        decoded_flag += decoded_char
    print(decoded_flag)


main()
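Before hunting for the real bytes, the inverse can be sanity-checked against the only data we have so far: the hex the program printed when we ran it. This is a small sketch with a made-up helper name (deobfuscate_hex) that takes the space-separated form the binary prints. If the subtract-then-XOR order is right, the output should be readable text.

```python
def deobfuscate_hex(hex_dump: str) -> str:
    """Undo obfuscate(): subtract 3, then XOR with 0x5a, for each printed byte."""
    decoded = []
    for token in hex_dump.split():
        value = (int(token, 16) - 0x03) & 0xFF
        decoded.append(chr(value ^ 0x5A))
    return "".join(decoded)


program_output = "73 0b 22 21 1e 1c 11 22 21 73"  # printed by ./Reverse200-1
print(deobfuscate_hex(program_output))
```

Getting the placeholder string back confirms both the constants and the order of operations.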
Now let’s fill the encoded_flag variable.
From the assembly we had before we can just read off the full string, but there’s an important thing to take into account: not all MOVs are created equal. The second to last MOV actually overwrites some of the characters from the one done before (at 0x0040121d).
To avoid any mistake when copying or transforming the string, it’s better to do it in an automated way, so for this I will be showcasing ghidralib.
You can go read the documentation for a more in-depth tutorial, but basically I will:
- Use the emulator to emulate the desired function from 0x004011c5 to 0x0040123d
- get the value from the stack
The variable that the function builds seems to be 0x48 bytes long, but values are only loaded up to RBP - 0x40.
We can put this result into our script anddddd…
Flag
poctf{uwsp_4_7h1n6_0f_b34u7y}
Reverse200-2
Highlighted techniques
- symbolic execution doesn’t work
- still using angr to get data dynamically
- reversing said data
Learning the game
We are presented with a file called Reverse200-2. I run the file command against it:
Reverse200-2: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=11b0463ea4dbbd923c4513ffb7da9e6d5bf1cfb0, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse200-2
Enter the correct input: asdfasdf
Incorrect input. Try again.
Asks for an input and then says it’s incorrect
Playing the game
I tried going for the simple angr technique, but it was not able to solve for the correct input, so I still used angr, just not to get the input directly.
Looking at the file in ghidra we find that there’s a function check_input() which does the following:
- creates a variable with a static value
- gets user input into another variable
- transforms user input with a call to transform()
- compares the transformed user input to the first variable
We quickly realize that the transformed user input has to match the static variable, which we already have, so the following steps are:
- reverse the transform() function
- apply the inverse of transform() to local_28
The transform() function is really simple: for every character, it XORs it with 0x3f and then adds 5.
I made a Python script to reverse this
def solve(encoded_flag: bytes):
    decoded_flag = ""
    for b in encoded_flag:
        # Undo the "+ 5" first, then the XOR; the mask keeps the value in byte range
        decoded_b = chr(((b - 5) & 0xff) ^ 0x3f)
        decoded_flag += decoded_b
    return decoded_flag
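Since we do not have the encoded bytes yet, a quick way to gain confidence in the inverse is a round trip: re-implement the forward transform as described (XOR with 0x3f, then add 5) and check that inverting it returns the input. This is a sketch; forward_transform and inverse_transform are made-up names, the sample text is arbitrary, and wrapping at 8 bits is an assumption about how the C code stores the result.

```python
def forward_transform(plain: str) -> bytes:
    """What transform() is described as doing: XOR each character with 0x3f, then add 5."""
    return bytes(((ord(ch) ^ 0x3F) + 5) & 0xFF for ch in plain)


def inverse_transform(encoded: bytes) -> str:
    """Undo the steps in reverse order: subtract 5, then XOR with 0x3f."""
    return "".join(chr(((b - 5) & 0xFF) ^ 0x3F) for b in encoded)


sample = "any_pr1ntable_text{}"
round_trip = inverse_transform(forward_transform(sample))
print(round_trip == sample)
```

The order matters: XORing first and subtracting afterwards would not undo the transform.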
Now we need to get the static byte string from the program, for this I will show how you can do it in an interesting way using angr.
Get values from memory using angr
First we need to get the assembly of the program, for that we will use objdump
objdump -d Reverse200-2 > dump.s
In the assembly we look for the memory address where the desired string is already stored in memory. To achieve this I looked for the call to strcmp so I can read the value from its argument
...
401236: 48 8d 45 c0 lea -0x40(%rbp),%rax
40123a: 48 89 c7 mov %rax,%rdi
40123d: e8 34 ff ff ff call 401176 <transform>
401242: 48 8d 55 e0 lea -0x20(%rbp),%rdx
401246: 48 8d 45 c0 lea -0x40(%rbp),%rax
40124a: 48 89 d6 mov %rdx,%rsi
40124d: 48 89 c7 mov %rax,%rdi
401250: e8 1b fe ff ff call 401070 <strcmp@plt>
401255: 85 c0 test %eax,%eax
...
We see that pointers to the arguments are stored in rsi and rdi. To know which is which, we can see a bit further up that the argument to transform() (which is the user input) is stored at rbp - 0x40; a pointer to this is later moved to rax and finally to rdi before strcmp(), meaning that the other argument, the one in rsi, is the static string local_28
import angr


def get_encoded_flag():
    project = angr.Project("Reverse200-2", auto_load_libs=False)
    simgr = project.factory.simgr()
    result = simgr.explore(find=0x401250)
    if result.found:
        state: angr.SimState = result.found[0]
        return state.solver.eval(state.memory.load(state.regs.get("rsi"), 29), cast_to=bytes)
I’ll explain the important lines of this script
project = angr.Project("Reverse200-2", auto_load_libs=False)
simgr = project.factory.simgr()
result = simgr.explore(find=0x401250)
The first two lines simply get the simulation manager as usual.
The next line looks for a state located at the call to strcmp()
if result.found:
    state: angr.SimState = result.found[0]
    return state.solver.eval(state.memory.load(state.regs.get("rsi"), 29), cast_to=bytes)
If a state is found, we get the value from memory. Let’s break up that final line a little bit
rsi_value = state.regs.get("rsi")
unsolved_value = state.memory.load(rsi_value, 29)
return state.solver.eval(unsolved_value, cast_to=bytes)
- the rsi register holds a pointer to the string we want, so by doing state.regs.get("rsi") we get that pointer as a memory address
- we load the value at that address into a variable with state.memory.load, which takes two arguments
  - a memory address (we use the pointer from rsi here)
  - an integer representing how many bytes we want to read
- state.solver.eval is used to solve the value, since state.memory.load returns a BV object (there doesn’t seem to be an easier way to read the value as a string even if it is solved)
Here’s the whole code
import angr


def get_encoded_flag():
    project = angr.Project("Reverse200-2", auto_load_libs=False)
    simgr = project.factory.simgr()
    simgr.use_technique(angr.exploration_techniques.DFS())
    result = simgr.explore(find=0x401250)
    if result.found:
        state: angr.SimState = result.found[0]
        return state.solver.eval(state.memory.load(state.regs.get("rsi"), 29), cast_to=bytes)


def solve(encoded_flag: bytes):
    decoded_flag = ""
    for b in encoded_flag:
        decoded_b = chr((b - 5) ^ 0x3f)
        decoded_flag += decoded_b
    return decoded_flag


flag = solve(get_encoded_flag())
print(flag)
Flag
poctf{uwsp_7h3_n16h7_15_d4rk}
Reverse200-3
Highlighted techniques
Learning the game
We are presented with a file called Reverse. I run the file command against it:
Reverse200-3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=932c681f4a9c2be824bce27856fe7ee8212bb7f1, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse200-3
Enter the correct input: sadfsa
Incorrect input! Try again.
Playing the game
To begin this challenge I used the strings command and found a few interesting things
...
H=@@@
*REDACTEH
optc{fwuH
ps1_4__mH
_4__mh7_H
3c043}n
Correct!
Enter the correct input:
%29s
Incorrect input! Try again.
;*3$"
...
So we have what looks like the flag but a bit scrambled. This is the joined string:
optc{fwups1_4__m_4__mh7_3c043}n
It looks like the characters are swapped in pairs. Another clue pointing to this is what we get when trying to solve this with angr:
It looks like the *REDACTED* string with its characters swapped, so let’s try that.
I got: poctf{uwsp_1_4m_4___hm_7c340}3n
Hmmm, so it looks like the flag starts fine, but then it kinda stops making sense.
What I realize is that the last n should switch places with the }, but it is not doing so. This gives me an idea: let’s swap half the characters starting from the left and the other half starting from the right.
poctf{uwsp_1_4m_7h3_0c34n}
Nicee, it worked. Here’s the final script used.
scrambled_flag = "optc{fwups1_4__m_4__mh7_3c043}n"


def unscramble(flag):
    unscrambled_left = ""
    unscrambled_right = ""
    for i in range(0, len(flag) // 2, 2):
        unscrambled_left += "".join(reversed(flag[i:i + 2]))
    for i in range(len(flag)+1, (len(flag) // 2) + 6, -2):
        unscrambled_right = flag[i:i-2:-1] + unscrambled_right
    return unscrambled_left + unscrambled_right


print(unscramble(scrambled_flag))
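There is another way to read the same data, sketched here under an assumption: the repeated _4__m run in the strings output is an artifact of overlapping 8-byte stores, not part of the scrambled flag. The store offsets 0, 8, 11 and 19 are inferred from where the chunks overlap; they are not read from the disassembly. The names STORES, rebuild_buffer and swap_pairs are made up for this example. Once the overlap is resolved, a plain swap of adjacent characters is enough.

```python
# (inferred offset, bytes) for each chunk seen in the strings output, without the trailing "H"
STORES = [
    (0, b"optc{fwu"),
    (8, b"ps1_4__m"),
    (11, b"_4__mh7_"),     # overlaps the previous store by five bytes
    (19, b"3c043}n\x00"),  # includes the terminating NUL
]


def rebuild_buffer(stores) -> str:
    """Replay the stores in order; later stores overwrite earlier bytes."""
    size = max(offset + len(data) for offset, data in stores)
    buffer = bytearray(size)
    for offset, data in stores:
        buffer[offset:offset + len(data)] = data
    return bytes(buffer).rstrip(b"\x00").decode("ascii")


def swap_pairs(text: str) -> str:
    """Swap every two adjacent characters: 'optc' -> 'poct'."""
    chars = list(text)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


scrambled = rebuild_buffer(STORES)
print(scrambled)
print(swap_pairs(scrambled))
```

This gives the same flag as the left/right script above without the hand-tuned loop bounds.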
Flag
poctf{uwsp_1_4m_7h3_0c34n}
Reverse300-1
Highlighted techniques
- symbolic execution with angr
Learning the game
We are presented with a file called Reverse300-1. I run the file command against it:
Reverse300-1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=cd45573f4bd7b1d2d713912994eec4d881dfb71f, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse300-1
Enter the key to decrypt the flag: sadfasdf
Incorrect key length. Key must be 22 characters long.
Playing the game
Let’s check the functions in an objdump disassembly.
...
00000000004012e0 <decrypt_flag>:
4012e0: 55 push %rbp
4012e1: 48 89 e5 mov %rsp,%rbp
4012e4: 48 83 ec 40 sub $0x40,%rsp
4012e8: 48 89 7d c8 mov %rdi,-0x38(%rbp)
4012ec: 48 b8 14 0c 0b 12 11 movabs $0x16302911120b0c14,%rax
4012f3: 29 30 16
4012f6: 48 ba 14 05 0f 7d 50 movabs $0x212f12507d0f0514,%rdx
...
We see that there’s a decrypt_flag() function which is probably called when the password is correct, so let’s use angr to look for a state at the start of this function and then get stdin as a string
import angr


def get_key():
    project = angr.Project("Reverse300-1", auto_load_libs=False)
    simgr = project.factory.simgr()
    result = simgr.explore(find=0x4012e0)
    if result.found:
        state: angr.SimState = result.found[0]
        return state.posix.dumps(0)


print(get_key())
This outputs: b'dchfwREaguPJ8!pV*^U&Ms'
Okay then, let’s use that as our password
# ./Reverse300-1
Enter the key to decrypt the flag: dchfwREaguPJ8!pV*^U&Ms
The flag is: poctf{uwsp_7h3_w0rld_15_4_57463}
Flag
poctf{uwsp_7h3_w0rld_15_4_57463}
Reverse300-2
Highlighted techniques
- procmon
- using burpsuite as proxy
Learning the game
We are presented with a file called Reverse. I run the file command against it:
Reverse300-2.exe: PE32+ executable (console) x86-64, for MS Windows, 10 sections
## ./Reverse300-2.exe
Fetching the flag from a secure source...
Success!
Playing the game
Okay, so I got kinda stuck at the beginning of this one, I’ll admit. I tried looking at the program in ghidra for the place where the Success! string was being printed, but I couldn’t find it.
Finally, I decided that the “secure source” from where the flag was being gotten had to be one of two:
- a hidden file created at runtime
- some remote server
So I looked at what the program does with procmon.
After applying a few filters to make it easier we can see that the program is clearly performing some http request, so I will use burpsuite to intercept the data for this request
Flag
poctf{uwsp_4b4nd0n_4ll_h0p3}
Reverse300-3
Highlighted techniques
- a bit harder ghidra reveng
Learning the game
We are presented with a file called Reverse300-3. I run the file command against it:
Reverse300-3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=769e50127c2c2acfc39695dd52a429b3b7f510be, for GNU/Linux 3.2.0, not stripped
it seems to be an ELF file, so I’ll be running it with my docker debian container.
## ./Reverse
Memory initialized. Encoded flag loaded.
Decoding the flag...
octf{
Done.
Playing the game
Initial static analysis
Okay so let’s open up the file in ghidra, all the code I show here has been modified a bit for better understanding
undefined8 main(void)
{
  char local_38 [32];

  local_38[0] = '\x01';
  local_38[1] = '\0';
  local_38[2] = '\x05';
  local_38[3] = '\x01';
  local_38[4] = '\x06';
  local_38[5] = '\0';
  local_38[6] = '\x01';
  local_38[7] = '\x01';
  local_38[8] = '\x05';
  local_38[9] = '\x02';
  local_38[10] = '\x06';
  local_38[0xb] = '\0';
  local_38[0xc] = '\x01';
  local_38[0xd] = '\x02';
  local_38[0xe] = '\x05';
  local_38[0xf] = '\x03';
  local_38[0x10] = '\x06';
  local_38[0x11] = '\0';
  local_38[0x12] = '\x01';
  local_38[0x13] = '\x03';
  local_38[0x14] = '\x05';
  local_38[0x15] = '\x04';
  local_38[0x16] = '\x06';
  local_38[0x17] = '\0';
  local_38[0x18] = '\x01';
  local_38[0x19] = '\x04';
  local_38[0x1a] = '\x05';
  local_38[0x1b] = '\x05';
  local_38[0x1c] = '\x06';
  local_38[0x1d] = '\0';
  local_38[0x1e] = -1;
  local_38[0x1f] = '\0';
  initialize_memory();
  execute_vm(local_38,32);
  return 0;
}
So in our main function we have:
- a byte string local_38
- a call to initialize_memory() without any arguments
- a call with the following arguments: execute_vm(local_38,32)
Let’s take a look at that initialize_memory() function
Okay, so it seems that it is just loading the encoded flag into a static region of memory, presumably to later be decoded (at least partially) by the execute_vm() function.
So now let’s see what the execute_vm() function does
void execute_vm(long instructions,ulong param_2)
{
  byte current_value;
  ulong i;
  long iplus1;
  byte operand;
  byte operator;

  i = 0;
  current_value = 0;
  puts("Decoding the flag...");
  while( true ) {
    if (param_2 <= i) {
      return;
    }
    iplus1 = i + 1;
    operator = *(byte *)(i + instructions);
    i = i + 2;
    operand = *(byte *)(iplus1 + instructions);
    if (6 < operator) break;
    switch(operator) {
    case 1:
      current_value = memory[(int)(uint)operand];
      break;
    case 2:
      memory[(int)(uint)operand] = current_value;
      break;
    case 3:
      current_value = current_value + memory[(int)(uint)operand];
      break;
    case 4:
      current_value = current_value - memory[(int)(uint)operand];
      break;
    case 5:
      current_value = current_value ^ memory[(int)(uint)operand];
      break;
    case 6:
      putchar((uint)current_value);
      break;
    default:
      goto switchD_004011f2_caseD_6;
    }
  }
  if (operator == 0xff) {
    puts("\nDone.");
    return;
  }
switchD_004011f2_caseD_6:
  printf("Unknown instruction: %02x\n",(ulong)operator);
  return;
}
Okay, so a lot is going on, but if we follow the function arguments we can realize a few things to begin with:
- param_2 is only used to know for how long to run the while loop
- the first parameter (renamed instructions) is iterated in batches of two bytes
- we can indeed see that the symbol memory (where the flag is stored) is being used in some way
After looking at it for a while we realize that we have a kind of custom instruction language. I’ll leave figuring out exactly how it works as an exercise for the reader ;)
Instruction set
Instructions come in pairs of bytes: an operator followed by an operand
- there is a single register which I’m calling current_value
- everything is relative to the memory symbol address
Operators
We have six different possibilities for an operator, plus 0xff, which ends the program
- 1 : copy the value at operand offset into current_value
- 2 : set the value at operand offset equal to current_value
- 3 : add the value at operand offset to current_value
- 4 : subtract the value at operand offset from current_value
- 5 : xor current_value with the value at operand offset
- 6 : print current_value
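To check this reading of the instruction set, here is a sketch of the same VM in Python. The program bytes are the ones main() writes into local_38; the memory contents are the encoded bytes copied out of ghidra later in this write-up. The names run_vm, PROGRAM and MEMORY are made up for this example. If the operators are understood correctly, the output should match what the binary prints.

```python
def run_vm(program: bytes, memory: bytearray) -> str:
    """Interpret (operator, operand) byte pairs the way execute_vm() does."""
    current_value = 0
    printed = []
    for pc in range(0, len(program) - 1, 2):
        operator, operand = program[pc], program[pc + 1]
        if operator == 1:
            current_value = memory[operand]
        elif operator == 2:
            memory[operand] = current_value
        elif operator == 3:
            current_value = (current_value + memory[operand]) & 0xFF
        elif operator == 4:
            current_value = (current_value - memory[operand]) & 0xFF
        elif operator == 5:
            current_value ^= memory[operand]
        elif operator == 6:
            printed.append(chr(current_value))
        elif operator == 0xFF:
            break  # end of program
        else:
            raise ValueError(f"Unknown instruction: {operator:02x}")
    return "".join(printed)


PROGRAM = bytes([
    1, 0, 5, 1, 6, 0,
    1, 1, 5, 2, 6, 0,
    1, 2, 5, 3, 6, 0,
    1, 3, 5, 4, 6, 0,
    1, 4, 5, 5, 6, 0,
    0xFF, 0,
])
MEMORY = bytes.fromhex(
    "701f7c086e156017" "64144b7c1427781f" "2b46752a1b2e7145" "2313231469"
)

print(run_vm(PROGRAM, bytearray(MEMORY)))
```

The built-in program only covers the first six bytes of memory, which is why the binary stops after five characters.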
How is the flag decoded?
Now that we have the knowledge of how this internal instruction set works, let’s figure out the flag’s encoding.
We know that execute_vm is getting a static set of instructions stored at local_38 in main(), so let’s analyse it and see what it does.
Here’s the full program:
local_38[0] = '\x01';
local_38[1] = '\0';
local_38[2] = '\x05';
local_38[3] = '\x01';
local_38[4] = '\x06';
local_38[5] = '\0';
local_38[6] = '\x01';
local_38[7] = '\x01';
local_38[8] = '\x05';
local_38[9] = '\x02';
local_38[10] = '\x06';
local_38[0xb] = '\0';
local_38[0xc] = '\x01';
local_38[0xd] = '\x02';
local_38[0xe] = '\x05';
local_38[0xf] = '\x03';
local_38[0x10] = '\x06';
local_38[0x11] = '\0';
local_38[0x12] = '\x01';
local_38[0x13] = '\x03';
local_38[0x14] = '\x05';
local_38[0x15] = '\x04';
local_38[0x16] = '\x06';
local_38[0x17] = '\0';
local_38[0x18] = '\x01';
local_38[0x19] = '\x04';
local_38[0x1a] = '\x05';
local_38[0x1b] = '\x05';
local_38[0x1c] = '\x06';
local_38[0x1d] = '\0';
local_38[0x1e] = -1;
local_38[0x1f] = '\0';
I’ll reformat it, so we can see the instructions with their operator and operand.
copy 0
xor 1
print 0
copy 1
xor 2
print 0
copy 2
xor 3
print 0
copy 3
xor 4
print 0
copy 4
xor 5
print 0
-1 0
a bit better, now I’ll “disassemble” this, so it looks even better, keep in mind that operands are always offsets of memory
We can see a pattern here, I’ll guide you through the first few lines:
- copy value at offset 0 into the register
- xor the value at the register with the value at offset 1
- print the value at the register
Then we do the same four more times, increasing both offsets by one each time.
So the pattern is pretty simple: each printed character is the XOR of two neighbouring bytes of the encoded flag.
Get the full encoded flag and reverse it
To get the string I copied it as “Python byte string” in ghidra
Then with this script we do the reversing
def decode_key(encoded_key: bytes):
    decoded_key = ""
    # Stop one byte early: the last byte has no right-hand neighbour to XOR with
    for i in range(len(encoded_key) - 1):
        decoded_key += chr(encoded_key[i] ^ encoded_key[i+1])
    return decoded_key


encoded_key = b'\x70\x1f\x7c\x08\x6e\x15\x60\x17\x64\x14\x4b\x7c\x14\x27\x78\x1f\x2b\x46\x75\x2a\x1b\x2e\x71\x45\x23\x13\x23\x14\x69'
print(decode_key(encoded_key))
We don’t get the first char: octf{uwsp_7h3_g4m3_15_4f007}. That is expected, since XORing 29 bytes with their neighbours only yields 28 characters.
But that’s okay, we know what it is after all
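The missing character can also be recovered from the data instead of guessed. The first encoded byte, 0x70, is already the ASCII code for p, which suggests (this is an inference from the bytes, not something confirmed in the binary) that the first character is stored as is and every later byte is chained onto the previous one with XOR. A sketch under that assumption, with a made-up function name:

```python
ENCODED_KEY = (
    b'\x70\x1f\x7c\x08\x6e\x15\x60\x17\x64\x14\x4b\x7c\x14\x27\x78\x1f'
    b'\x2b\x46\x75\x2a\x1b\x2e\x71\x45\x23\x13\x23\x14\x69'
)


def decode_chained_xor(encoded: bytes) -> str:
    """First byte is taken literally; each following char is encoded[i - 1] ^ encoded[i]."""
    first = chr(encoded[0])
    rest = "".join(chr(a ^ b) for a, b in zip(encoded, encoded[1:]))
    return first + rest


print(decode_chained_xor(ENCODED_KEY))
```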
Flag
poctf{uwsp_7h3_g4m3_15_4f007}
Reverse400-1
Highlighted techniques
Learning the game
We are presented with a file called Reverse. I run the file command against it:
Reverse400.exe: PE32+ executable (console) x86-64, for MS Windows, 6 sections
## ./Reverse400.exe
Encrypted flag: bd7e9dad4a5fe0e7911f93cb1bf5a321
From the icon we can figure out that this is a compiled Python program
Playing the game
Decompile Python
To do this there are actually two steps:
- extraction
- decompiling
Extracting the executable
To extract the .pyc files I used pyinstxtractor, but there’s a web version that lets you upload an executable and returns the extracted files if you don’t want to download stuff now.
This yielded a folder with a bunch of files; from those I chose the one that caught my attention.
Decompiling Reverse400.pyc
For the decompilation I used pycdc
pycdc Reverse400.pyc > Reverse400.pyc.py
# Source Generated with Decompyle++
# File: Reverse400.pyc (Python 3.11)
pyobfuscate = lambda getattr: getattr.items()()
Il = chr(114) + chr(101)
lI = '[^a-zA-Z0-9]'
lIl = chr(115) + chr(117) + chr(98)
__import__('sys').setrecursionlimit(100000000)
exec(
getattr(__import__("zlib"), "decompress")
(bytes.fromhex(
''.replace(
'\n', ''))).decode()
)
Reversing the decompiled code
The decompiled code might look a bit menacing at first, but actually all it does is:
- call the decompress function in the zlib module
- pipe the result into exec
So the actual code is compressed. Let’s remove the call to exec and move the result from decompress to another file
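A minimal sketch of that unpacking step, mirroring what the decompiled stub does minus the exec. The real hex payload is not reproduced in the listing above, so the demo builds a stand-in payload by compressing a harmless one-liner; unpack_payload and the sample names are made up for this example.

```python
import zlib


def unpack_payload(hex_blob: str) -> str:
    """Same chain as the stub: strip newlines, hex-decode, zlib-decompress, decode to text."""
    cleaned = hex_blob.replace("\n", "")
    return zlib.decompress(bytes.fromhex(cleaned)).decode()


# Stand-in payload (NOT the challenge data): compress a harmless line the same way.
sample_source = 'print("hello from the inner layer")'
sample_blob = zlib.compress(sample_source.encode()).hex()

recovered = unpack_payload(sample_blob)
print(recovered)
```

With the real payload pasted in as hex_blob, the recovered text can be written to a file (for example really_obfuscated_thing.py) and inspected without ever being executed.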
Reversing decompressed code from decompiled code (my head might start to hurt)
Well this is… something.
I tried to reverse this by renaming variables, but that was not a great idea, not only because it took like half an hour to refactor the name of a single variable, but also when it was finally done, the code was not executable anymore.
Oh btw, we can run this and get the same output as before
# python3 really_obfuscated_thing.py
Encrypted flag: bd7e9dad4a5fe0e7911f93cb1bf5a321
So my next thought was that this whole obfuscated mess must be building some code that is at least executable and then running it with exec or something like that.
So what I tried was hooking exec by adding the following at the start of the program:
oexec = exec
def exec(command):
    print(command)
    oexec(command)
and the output is…
lllllllllllllll,llllllllllllllI,lllllllllllllIl,lllllllllllllII,llllllllllllIll,llllllllllllIlI,llllllllllllIIl=bytearray,print,enumerate,chr,ord,len,bytes
from Crypto.Cipher import AES as IIIIlIIlIIllII
from Crypto.Util.Padding import pad as IIlIlIIllllIlI
def lIlIIIlllllIlllIll(IlIlIlIIlllIIlIlll):return''.join(lllllllllllllII(llllllllllllIll(A)^42)for A in IlIlIlIIlllIIlIlll)
def lllIIllllIIIIlllll(lIIllllllIlllIIIll,llIllIIlllIlIIllII):B=llIllIIlllIlIIllII;A=lIIllllllIlllIIIll;return A<<B&255|A>>8-B
def IllIlllIIlllIIIIlI(lIlIllIlllIlIlIIIl,IlllIIlIIIllIlIlII,lIIIIllIllllIIIlII):
    B=lIIIIllIllllIIIlII;A=IlllIIlIIIllIlIlII;C=lllllllllllllll();E=llllllllllllIlI(A);F=llllllllllllIlI(B)
    for(D,G)in lllllllllllllIl(lIlIllIlllIlIlIIIl):H=(A[D%E]+B[D%F])%8;C.append(lllIIllllIIIIlllll(G,H))
    return C
def IIIllIlIIIIlIlIlIl(IIIIlllIIIIIlIIIll,llIllIlllIIIlllIll,IIllIlllIIIlIIlIlI):A=IIIIlIIlIIllII.new(llIllIlllIIIlllIll,IIIIlIIlIIllII.MODE_CBC,IIllIlllIIIlIIlIlI);return A.encrypt(IIlIlIIllllIlI(IIIIlllIIIIIlIIIll,IIIIlIIlIIllII.block_size))
def lIlIlIllIlIlIlIIll(lIIIIlIlIIIIIIIIIl):return llllllllllllIIl.fromhex(lIIIIlIlIIIIIIIIIl)
IIllIllIlIllIIIIIl='[redacted]'
IIllIIlIllIlIIlIll='fa21c9c2596099915dbc7845c941c14e81594b5c4f69177cc4059da11e782e0b'
IlllIlIlllIIIIIIll='504f43544632303234'
llIllIIlllIlIIllII='437261636b3430302d58'
IllIllllllIIIllIIl=lIlIIIlllllIlllIll(IIllIllIlIllIIIIIl)
IIllllIlIlIIIIIIll=lIlIlIllIlIlIlIIll(IlllIlIlllIIIIIIll)
if llllllllllllIlI(IIllllIlIlIIIIIIll)<16:IIllllIlIlIIIIIIll=IIllllIlIlIIIIIIll.ljust(16,b'\x00')
IlllIIlIIIllIlIlII=llIllIIlllIlIIllII[:32]if llllllllllllIlI(llIllIIlllIlIIllII)>=32 else llIllIIlllIlIIllII.ljust(32,'0')
lIIIIllIllllIIIlII=llllllllllllIIl.fromhex(IlllIIlIIIllIlIlII)
lllllIIllllIlIllll=IllIlllIIlllIIIIlI(IllIllllllIIIllIIl.encode('utf-8'),IIllllIlIlIIIIIIll,lIIIIllIllllIIIlII)
IIIlIIllllIlllIlII=IIIllIlIIIIlIlIlIl(lllllIIllllIlIllll,IIllllIlIlIIIIIIll,lIIIIllIllllIIIlII)
llllllllllllllI('Encrypted flag:',IIIlIIllllIlllIlII.hex())
__import__('sys').exit()
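The one-argument hook was enough for this sample. A slightly more defensive sketch, assuming nothing about how the obfuscated script calls exec: run the script in its own namespace where the name `exec` is shadowed by a logger that also forwards optional globals and locals. `run_with_exec_logging` and the stand-in script are hypothetical names used only for illustration.

```python
import builtins


def run_with_exec_logging(source: str) -> list:
    """Run `source` with `exec` shadowed so every payload passed to it is recorded."""
    captured = []
    namespace = {"__name__": "__main__"}

    def logging_exec(code, globals_=None, locals_=None):
        captured.append(code)
        # With no explicit globals, fall back to the script's own namespace.
        target = namespace if globals_ is None else globals_
        return builtins.exec(code, target, locals_)

    namespace["exec"] = logging_exec
    try:
        builtins.exec(source, namespace)
    except SystemExit:
        # The real sample ends with sys.exit(); swallow it so the log survives.
        pass
    return captured


# Stand-in for the obfuscated script: one layer that exec()s the next.
layers = run_with_exec_logging("exec('answer = 6 * 7')")
for layer in layers:
    print(layer)
```

Shadowing the name inside a private namespace avoids touching `builtins.exec` for the rest of the interpreter.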
Reversing the deobfuscated code from the decompressed code from the decompiled code (my head 100% hurts now)
Okay, this I can at least work with. After a long session of renaming and refactoring I ended up with this:
# bytearray,print,enumerate,chr,ord,len,bytes=bytearray,print,enumerate,chr,ord,len,bytes
from Crypto.Cipher import AES as AES
from Crypto.Util.Padding import pad as pad

def xor42(data_string):
    return ''.join(chr(ord(A) ^ 42) for A in data_string)

def shifts_masks_sub(data,
                     param2):
    ret = data << param2 & 255 | data >> 8 - param2
    return ret

def three_way_encode(param1, key, iv):
    bytesarray = bytearray()
    param2_len = len(key)
    param3_len = len(iv)
    for (i, data) in enumerate(param1):
        weird_mod_8 = (key[i % param2_len] + iv[i % param3_len]) % 8
        bytesarray.append(shifts_masks_sub(data, weird_mod_8))
    return bytesarray

def encrypt_flag(encoded_flag, key, param3):
    cipher = AES.new(key,
                     AES.MODE_CBC,
                     param3)
    return cipher.encrypt(
        pad(encoded_flag, AES.block_size))

def bytesfromhex(hex_string): return bytes.fromhex(hex_string)

redacted = '[redacted]'
IIllIIlIllIlIIlIll = 'fa21c9c2596099915dbc7845c941c14e81594b5c4f69177cc4059da11e782e0b'
key = '504f43544632303234'  # POCTF2024
iv = '437261636b3430302d58'  # Crack400-X
redacted_xor42 = xor42(redacted)
key_bytes = bytesfromhex(key)
if len(key_bytes) < 16:
    key_bytes = key_bytes.ljust(16, b'\x00')
iv_32 = iv[:32] if len(iv) >= 32 else iv.ljust(32, '0')
iv_32_bytes = bytes.fromhex(iv_32)
encoded_flag = three_way_encode(redacted_xor42.encode('utf-8'), key_bytes, iv_32_bytes)
encrypted_flag = encrypt_flag(encoded_flag, key_bytes, iv_32_bytes)
print('Encrypted flag:', encrypted_flag.hex())
__import__('sys').exit()
Key things to notice
- the IIllIIlIllIlIIlIll variable is never used (probably the encrypted flag hidden in there)
- There is encryption, but the whole flag encryption process takes three steps:
  - Every character is XORed with decimal 42
  - Some form of encoding done in the three_way_encode() function
  - AES encryption
The first and last steps are easy to reverse since we already have the key and iv values for the encryption.
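As a quick check of the material we already hold, here is a small sketch that rebuilds the key and IV bytes the same way the refactored script does and confirms that XOR with 42 is its own inverse. The unconditional `ljust` calls are a simplification that is equivalent for these particular values, and the sample string is made up.

```python
key_hex = '504f43544632303234'
iv_hex = '437261636b3430302d58'

# Same padding rules as the script: key padded with NUL bytes to 16 bytes,
# IV hex string padded with '0' characters to 32 hex digits (16 bytes).
key = bytes.fromhex(key_hex).ljust(16, b'\x00')
iv = bytes.fromhex(iv_hex.ljust(32, '0'))


def xor42(text: str) -> str:
    """XOR every character with 42; applying it twice returns the input."""
    return ''.join(chr(ord(c) ^ 42) for c in text)


print(key)
print(iv)
print(xor42(xor42('poctf{example}')))
```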
How does the “three way encoding” work? (btw I only called it that because it takes three parameters)
def shifts_masks_sub(data,
                     param2):
    ret = data << param2 & 255 | data >> 8 - param2
    return ret

def three_way_encode(param1, key, iv):
    bytesarray = bytearray()
    param2_len = len(key)
    param3_len = len(iv)
    for (i, data) in enumerate(param1):
        weird_mod_8 = (key[i % param2_len] + iv[i % param3_len]) % 8
        bytesarray.append(shifts_masks_sub(data, weird_mod_8))
    return bytesarray
three_way_encode analysis:
- a for loop is initiated iterating over the flag, inside this loop:
  - the variable weird_mod_8 is constructed with data from the key and iv; since it does not depend on the flag, we can reconstruct it
  - shifts_masks_sub is called, passing it the data from the current index of the encoded flag and the weird_mod_8 value
  - the result of the previous call is appended to bytesarray
- the variable bytesarray is returned
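Since weird_mod_8 depends only on the key and iv, the whole sequence of shift amounts can be rebuilt up front. A minimal sketch, with `rotation_schedule` as a hypothetical helper name and the key and iv padded as in the script:

```python
key = bytes.fromhex('504f43544632303234').ljust(16, b'\x00')
iv = bytes.fromhex('437261636b3430302d58'.ljust(32, '0'))


def rotation_schedule(length: int, key: bytes, iv: bytes) -> list:
    """Shift amount applied to each byte position, independent of the flag."""
    return [(key[i % len(key)] + iv[i % len(iv)]) % 8 for i in range(length)]


schedule = rotation_schedule(32, key, iv)
print(schedule)
```

Because both values are 16 bytes long after padding, the schedule repeats every 16 positions, and positions where both padded bytes are zero get a shift of 0.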
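shifts_masks_sub analysis:
- param2 is a number between 0 and 7
- data (one byte) is split at the param2 bit index
- the left side is moved to the right
- the right side is moved to the left
- the mixed data is returned
In other words, this is an 8-bit rotate left by param2 bits, so rotating right by the same amount undoes it.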
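The split-and-swap described in this list can be sketched as a pair of 8-bit rotations. The names `rotl8` and `rotr8` are mine, not from the challenge; the first mirrors shifts_masks_sub and the second is its inverse.

```python
def rotl8(byte: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits (0 <= n <= 7)."""
    return (byte << n) & 0xFF | byte >> (8 - n)


def rotr8(byte: int, n: int) -> int:
    """Rotate an 8-bit value right by n bits; undoes rotl8 for the same n."""
    return (byte << (8 - n)) & 0xFF | byte >> n


sample = 0b10110001
rotated = rotl8(sample, 3)
print(format(rotated, '08b'))            # 10001101
print(format(rotr8(rotated, 3), '08b'))  # 10110001
```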
Finally, build a script to reverse this mess
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad

def decrypt_flag(encrypted_flag, key, iv):
    cipher = AES.new(key,
                     AES.MODE_CBC,
                     iv)
    return cipher.decrypt(pad(encrypted_flag, AES.block_size))

def undo_shifts_masks_sub(ret, param2):
    data = ret << (8 - param2) & 255 | ret >> param2
    return data

def three_way_decode(encoded_flag, key, iv):
    decoded_flag = ""
    key_len = len(key)
    iv_len = len(iv)
    for (i, data) in enumerate(encoded_flag):
        weird_mod_8 = (key[i % key_len] + iv[i % iv_len]) % 8
        decoded_flag += chr(undo_shifts_masks_sub(data, weird_mod_8))
    return decoded_flag

def xor42(data_string):
    return ''.join(chr(ord(A) ^ 42) for A in data_string)

if __name__ == "__main__":
    key = bytes.fromhex('504f43544632303234')
    iv_str = '437261636b3430302d58'
    if len(key) < 16:
        key = key.ljust(16, b'\x00')
    iv_32 = iv_str[:32] if len(iv_str) >= 32 else iv_str.ljust(32, '0')
    iv = bytes.fromhex(iv_32)
    encrypted_flag_hex = bytes.fromhex("fa21c9c2596099915dbc7845c941c14e81594b5c4f69177cc4059da11e782e0b")
    encoded_flag = decrypt_flag(encrypted_flag_hex, key, iv)
    flag_xor42 = three_way_decode(encoded_flag, key, iv)
    flag = xor42(flag_xor42)
    print(flag)
And the output of this is…
poctf{uwsp_4ll_7h47_gl1773r5})))ü©$¸Î¯k2ó"'
Very weird, but you know, the flag is there.
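The trailing junk appears to come from how the script handles padding: it calls pad on the ciphertext before decrypting, so one extra block gets decrypted, and the PKCS#7 bytes added during encryption are never stripped. A 29-character flag padded to 32 bytes ends in three 0x03 bytes, and if those positions get a shift of 0, XOR with 42 turns each one into `)`. A minimal pure-Python sketch of stripping that padding follows; `pkcs7_unpad` is a hypothetical helper, and the sample data is made up.

```python
def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding, checking that every padding byte is consistent."""
    if not data or len(data) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = data[-1]
    if not 1 <= pad_len <= block_size or data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-pad_len]


# 29 bytes of stand-in plaintext plus three 0x03 padding bytes.
padded = b"A" * 29 + b"\x03" * 3
print(pkcs7_unpad(padded))
print(chr(0x03 ^ 42))  # )
```

In the solve script the equivalent change would be to decrypt the ciphertext as-is and unpad the result, for example with PyCryptodome's `unpad(cipher.decrypt(encrypted_flag), AES.block_size)` from `Crypto.Util.Padding`.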
Flag
poctf{uwsp_4ll_7h47_gl1773r5}