Defcon Quals: Access Control (simple reverse engineer)

Hello all,

Today’s post will be another write-up from the Defcon CTF Qualifiers. This one will be the level called “Access Client”, or simply “client”, which was a one-point reverse engineering level. This post is going to be mostly about the process I use for reverse engineering crypto-style code - it’s a much different process than reversing higher level stuff, because each instruction matters and it’s often extremely hard to follow.

Having just finished another level (r0pbaby, I think), and having about an hour left in the competition, I wanted something I could finish quickly. There were two one-point reverse engineering challenges open that we hadn’t solved: one was 64-bit and written in C++, whereas this one was 32-bit and C and only had a few short functions. The choice was easy. :)

I downloaded the binary and had a look at its strings. Lots of text-based stuff, such as “list users”, “print key”, and “connection id:”, which I saw as a good sign!

Running it

If you wnat to follow along, I uploaded all my work to my Github page, including a program called server.rb that more or less simulates the server. It’s written in Ruby, obviously, and simulates all the responses. The real client can’t actually read the flag from it, though, and I can’t figure out why (and spent way too much time last night re-reversing the client binary before realizing it doesn’t matter).

Anyway, when you run the client, it asks for an ip address:

$ ./client
need IP

The competition gives you a target, so that’s easy (note that most of this is based on my own server.rb, not the real one, which I re-created from packet captures:

$ ./client 52.74.123.29
Socket created
Enter message : Hello
nope...Hello

If you look at a packet capture of this, you’ll see that a connection is made but nothing is sent or received. Local checks are best checks!

All right.. time for some reversing! I open up the client program in IDA, and go straight to the Strings tab (Shift-F12). I immediately see “Enter message :” so I double click it and end up here:

.rodata:080490F5</span> ; char aEnterMessage[]
.rodata:080490F5</span> aEnterMessage   db 'Enter message : ',0 ; DATA XREF: main+178o
.rodata:08049106</span> aHackTheWorld   db 'hack the world',0Ah,0 ; DATA XREF: main+1A7o
.rodata:08049116</span> ; char aNope_[]
.rodata:08049116</span> aNope___S       db 'nope...%s',0Ah,0    ; DATA XREF: main+1CAo

Could it really be that easy?

The answer, for a change, is yes:

$ ./client 52.74.123.29
Socket created
Enter message : hack the world
<< connection ID: nuc EW1A IQr^2&


*** Welcome to the ACME data retrieval service ***
what version is your client?

<< hello...who is this?
<<

<< enter user password

<< hello grumpy, what would you like to do?
<<

<< grumpy
mrvito
gynophage
selir
jymbolia
sirgoon
duchess
deadwood
hello grumpy, what would you like to do?

<< the key is not accessible from this account. your administrator has been notified.
<<
hello grumpy, what would you like to do?

Then it just sits there.

I logged the traffic with Wireshark and it looks like this (blue = incoming, red = outgoing, or you can just download my pcap):

connection ID: Je@/b9~A>Xa'R-


*** Welcome to the ACME data retrieval service ***
what version is your client?
version 3.11.54
hello...who is this?grumpy

enter user password
H0L31
hello grumpy, what would you like to do?
list users
grumpy
mrvito
gynophage
selir
jymbolia
sirgoon
duchess
deadwood
hello grumpy, what would you like to do?
print key
the key is not accessible from this account. your administrator has been notified.
hello grumpy, what would you like to do?

Connection IDs and passwords

I surmised, based on this, that the connection id was probably random (it looks random) and that the password is probably hashed (poorly) and not replay-able (that’d be too easy). Therefore, the password is probably based on the connection id.

To verify the first part, I ran a capture a second time:

connection ID: #2^1}P>JAqbsaj
[...]
hello...who is this?
grumpy
enter user password
V/%S:

Yup, it’s different!

I did some quick digging in IDA and found a function - sub_8048EAB - that was called with “grumpy” and “1” as parameters, as well as a buffer that would be sent to the server. It looked like it did some arithmetic on “grumpy” - which is presumably a password, and it touched a global variable - byte_804BC70 - that, when I investigated, turned out to be the connection id. The function was called from a second place, too, but we’ll get to that later!

So now we’ve found a function that looks at the password and the connection id. That sounds like the hashing function to me (and note that I’m using the word “hashing” in its literal sense, it’s obviously not a secure hash)! I could have used a debugger to verify that it was actually returning a hashed password, but the clock was ticking and I had to make some assumptions in order to keep moving - if the the assumptions turned out to be wrong, I wouldn’t have finished the level, but I wouldn’t have finished it either if I verified everything.

I wasn’t entirely sure what had to be done from here, but it seemed logical to me that reverse engineering the password-hashing function was something I’d eventually have to do. So I got to work, figuring it couldn’t hurt!

Reversing the hashing function

There are lots of ways to reverse engineer a function. Frequently, I take a higher level view of what libc/win32 functions it calls, but sub_8048EAB doesn’t call any functions. Sometimes I’ll try to understand the code, mentally, but I’m not super great at that. So I used a variation of this tried-and-true approach I often use for crypto code:

Reverse each line of assembly to exactly one line of C
Test it against the real version, preferably instrumented so I can automatically ensure that it's working properly
While the output of my code is different from the output of their code, use a debugger (on the binary) and printf statements (on your implementation) to figure out where the problem is - this usually takes the most of my time, because there are usually several mistakes
With the testing code still in place, simplify the C function as much as you can

Because I only had about an hour to reverse this, I had to cut corners. I reversed it to Ruby instead of C (so I wouldn’t have to deal with sockets in C), I didn’t set up proper instrumentation and instead used Wireshark, and I didn’t simplify anything till afterwards. In the end, I’m not sure whether this was faster or slower than doing it “right”, but it worked so I can’t really complain.

Version 1

As I said, the first thing I do is translate the code directly, line by line, to assembly. I had to be a little creative with loops and pointers because I can’t just use goto and cast everything to an integer like I would in C, but this is what it looked like. Note that I’ve fixed all the bugs that were in the original version - there were a bunch, but it didn’t occur to me to keep the buggy code - I did, however, leave in the printf-style statements I used for debugging!

# mode = 1 for passwords, 7 for keys
def hash_password(password, connection_id, mode)
# mov     eax, [ebp+password]
  eax = password

# mov     [ebp+var_2C], eax
  var_2c = eax

# mov     eax, [ebp+buffer]
  eax = ""

# mov     [ebp+var_30], eax
  var_30 = ""

# xor     eax, eax
  eax = 0

# mov     ecx, ds:g_connection_id_plus_7 ; 0x0000007d, but changes
  ecx = connection_id[7]
  #puts('%x' % ecx.ord)

# mov     edx, 55555556h
  edx = 0x55555556
# mov     eax, ecx
  eax = ecx
# imul    edx
  #puts("imul")
  #puts("%x" % eax.ord)
  #puts("%x" % edx)
  edx = ((eax.ord * edx) >> 32)
  #puts("%x" % edx)
# mov     eax, ecx
  eax = ecx
# sar     eax, 1Fh
  #puts("sar")
  #puts("%x" % eax.ord)
  eax = eax.ord >> 0x1F
  #puts("%x" % eax)
# mov     ebx, edx
  ebx = edx
# sub     ebx, eax
  ebx -= eax
  #puts("sub")
  #puts("%x" % ebx)
# mov     eax, ebx
  eax = ebx
# mov     [ebp+var_18], eax
  var_18 = eax
# mov     edx, [ebp+var_18]
  edx = var_18
# mov     eax, edx
  eax = edx
# add     eax, eax
  eax = eax * 2
# add     eax, edx
  eax = eax + edx

  #puts("")
  #puts("%x" % eax)
# mov     edx, ecx
  edx = ecx
# sub     edx, eax
  #puts()
  #puts("%x" % ecx.ord)
  #puts("%x" % edx.ord)
  edx = edx.ord - eax
  #puts("%x" % edx)
# mov     eax, edx
  eax = edx
# mov     [ebp+var_18], eax
  var_18 = eax
  #puts()
  #puts("%x" % var_18)
# mov     eax, dword_804B04C
  eax = mode
# add     [ebp+var_18], eax
  var_18 += eax
  #puts("%x" % eax)
# mov     edx, offset g_connection_id ; <--
  edx = connection_id
# mov     eax, [ebp+var_18]
  eax = var_18
# add     eax, edx
# mov     dword ptr [esp+8], 5 ; n
# mov     [esp+4], eax    ; src
# lea     eax, [ebp+dest]
# mov     [esp], eax      ; dest
# call    _strncpy
  dest = connection_id[var_18, 5]
  #puts(dest)
# mov     [ebp+var_1C], 0
  var_1c = 0

# jmp     short loc_8048F4A
# loc_8048F2A:                            ; CODE XREF: do_password+A3j
  0.upto(4) do |var_1c|
#   mov     eax, [ebp+var_1C]
    eax = var_1c
#   add     eax, [ebp+var_30]
    # XXX
#   lea     edx, [ebp+dest]
    edx = dest

#   add     edx, [ebp+var_1C]
#   movzx   ecx, byte ptr [edx]
    ecx = edx[var_1c]
#   mov     edx, [ebp+var_1C]
    edx = var_1c

#   add     edx, [ebp+var_2C]
#   movzx   edx, byte ptr [edx]
    edx = var_2c[var_1c]

#   xor     edx, ecx
    edx = edx.ord ^ ecx.ord
#   mov     [eax], dl
    edx &= 0x0FF
    var_30[var_1c] = (edx & 0x0FF).chr

#   add     [ebp+var_1C], 1
#
#   loc_8048F4A:                            ; CODE XREF: do_password+7Dj
#   cmp     [ebp+var_1C], 4
#   jle     short loc_8048F2A
  end

  #puts()

  return var_30
end

After I got it working and returning the same value as the real implementation, I had a problem! The value I returned - even though it matched the real program - wasn’t quite right! It had a few binary characters in it, whereas the value sent across the network never did. I looked around and found the function - sub_8048F67 - that actually sends the password to the server. It turns out, that function replaces all the low- and high-ASCII characters with proper ones (the added lines are in bold):

# mode = 1 for passwords, 7 for keys
def hash_password(password, connection_id, mode)
# mov     eax, [ebp+password]
  eax = password

# mov     [ebp+var_2C], eax
  var_2c = eax

# mov     eax, [ebp+buffer]
  eax = ""

# mov     [ebp+var_30], eax
  var_30 = ""

# xor     eax, eax
  eax = 0

# mov     ecx, ds:g_connection_id_plus_7 ; 0x0000007d, but changes
  ecx = connection_id[7]
  #puts('%x' % ecx.ord)

# mov     edx, 55555556h
  edx = 0x55555556
# mov     eax, ecx
  eax = ecx
# imul    edx
  #puts("imul")
  #puts("%x" % eax.ord)
  #puts("%x" % edx)
  edx = ((eax.ord * edx) >> 32)
  #puts("%x" % edx)
# mov     eax, ecx
  eax = ecx
# sar     eax, 1Fh
  #puts("sar")
  #puts("%x" % eax.ord)
  eax = eax.ord >> 0x1F
  #puts("%x" % eax)
# mov     ebx, edx
  ebx = edx
# sub     ebx, eax
  ebx -= eax
  #puts("sub")
  #puts("%x" % ebx)
# mov     eax, ebx
  eax = ebx
# mov     [ebp+var_18], eax
  var_18 = eax
# mov     edx, [ebp+var_18]
  edx = var_18
# mov     eax, edx
  eax = edx
# add     eax, eax
  eax = eax * 2
# add     eax, edx
  eax = eax + edx

  #puts("")
  #puts("%x" % eax)
# mov     edx, ecx
  edx = ecx
# sub     edx, eax
  #puts()
  #puts("%x" % ecx.ord)
  #puts("%x" % edx.ord)
  edx = edx.ord - eax
  #puts("%x" % edx)
# mov     eax, edx
  eax = edx
# mov     [ebp+var_18], eax
  var_18 = eax
  #puts()
  #puts("%x" % var_18)
# mov     eax, dword_804B04C
  eax = mode
# add     [ebp+var_18], eax
  var_18 += eax
  #puts("%x" % eax)
# mov     edx, offset g_connection_id ; <--
  edx = connection_id
# mov     eax, [ebp+var_18]
  eax = var_18
# add     eax, edx
# mov     dword ptr [esp+8], 5 ; n
# mov     [esp+4], eax    ; src
# lea     eax, [ebp+dest]
# mov     [esp], eax      ; dest
# call    _strncpy
  dest = connection_id[var_18, 5]
  #puts(dest)
# mov     [ebp+var_1C], 0
  var_1c = 0

# jmp     short loc_8048F4A
# loc_8048F2A:                            ; CODE XREF: do_password+A3j
  0.upto(4) do |var_1c|
#   mov     eax, [ebp+var_1C]
    eax = var_1c
#   add     eax, [ebp+var_30]
    # XXX
#   lea     edx, [ebp+dest]
    edx = dest

#   add     edx, [ebp+var_1C]
#   movzx   ecx, byte ptr [edx]
    ecx = edx[var_1c]
#   mov     edx, [ebp+var_1C]
    edx = var_1c

#   add     edx, [ebp+var_2C]
#   movzx   edx, byte ptr [edx]
    edx = var_2c[var_1c]

#   xor     edx, ecx
    edx = edx.ord ^ ecx.ord
#   mov     [eax], dl
    edx &= 0x0FF

    #puts("before edx = %x" % edx)
    if(edx < 0x1f)
      #puts("a")
      edx += 0x20
    elsif(edx > 0x7F)
      edx = edx - 0x7E + 0x20
    end
    #puts("after edx = %x" % edx)

    var_30[var_1c] = (edx & 0x0FF).chr

#   add     [ebp+var_1C], 1
#
#   loc_8048F4A:                            ; CODE XREF: do_password+7Dj
#   cmp     [ebp+var_1C], 4
#   jle     short loc_8048F2A
  end

  #puts()

  return var_30
end

As you can see, it’s quite long and difficult to follow. But, now that the bugs were fixed, it was outputting the same thing as the real version! I set it up to log in with the username ‘grumpy’ and the password ‘grumpy’ and it worked great!

Cleaning it up

I didn’t actually clean up the code until after the competition, but here’s the step-by-step cleanup that I did, just so I could blog about it.

First, I removed all the comments:

def hash_password_phase2(password, connection_id, mode)
  eax = password
  var_2c = eax
  eax = ""
  var_30 = ""
  eax = 0
  ecx = connection_id[7]
  edx = 0x55555556
  eax = ecx
  edx = ((eax.ord * edx) >> 32)
  eax = ecx
  eax = eax.ord >> 0x1F
  ebx = edx
  ebx -= eax
  eax = ebx
  var_18 = eax
  edx = var_18
  eax = edx
  eax = eax * 2
  eax = eax + edx

  edx = ecx
  edx = edx.ord - eax
  eax = edx
  var_18 = eax
  eax = mode
  var_18 += eax
  edx = connection_id
  eax = var_18
  dest = connection_id[var_18, 5]
  var_1c = 0

  0.upto(4) do |var_1c|
    eax = var_1c
    edx = dest
    ecx = edx[var_1c]
    edx = var_1c
    edx = var_2c[var_1c]
    edx = edx.ord ^ ecx.ord
    edx &= 0x0FF
    if(edx < 0x1f)
      edx += 0x20
    elsif(edx > 0x7F)
      edx = edx - 0x7E + 0x20
    end
    var_30[var_1c] = (edx & 0x0FF).chr
  end
  return var_30
end

Then I started eliminating redundant statements:

def hash_password_phase3(password, connection_id, mode)
  ecx = connection_id[7]
  eax = ecx
  edx = ((eax.ord * 0x55555556) >> 32)
  eax = ecx
  eax = eax.ord >> 0x1F
  eax = ((edx - (eax.ord >> 0x1F)) * 2) + edx

  edx = ecx
  edx = edx.ord - eax
  eax = edx
  var_18 = eax
  var_18 += mode
  edx = connection_id
  eax = var_18
  dest = connection_id[var_18, 5]

  result = ""
  0.upto(4) do |i|
    eax = i
    edx = dest
    ecx = edx[i]
    edx = password[i]
    edx = edx.ord ^ ecx.ord
    edx &= 0x0FF
    if(edx < 0x1f)
      edx += 0x20
    elsif(edx > 0x7F)
      edx = edx - 0x7E + 0x20
    end
    result << (edx & 0x0FF).chr
  end

  return result
end

Removed some more redundancy:

def hash_password_phase4(password, connection_id, mode)
  char_7 = connection_id[7].ord
  edx = ((char_7 * 0x55555556) >> 32)
  eax = ((edx - (char_7 >> 0x1F >> 0x1F)) * 2) + edx

  result = ""
  0.upto(4) do |i|
    edx = (password[i].ord ^ connection_id[char_7 - eax + mode + i].ord) & 0xFF

    if(edx < 0x1f)
      edx += 0x20
    elsif(edx > 0x7F)
      edx = edx - 0x7E + 0x20
    end
    result << (edx & 0x0FF).chr
  end

  return result
end

And a final cleanup pass where I eliminated the “bad paths” - things that I know can’t possibly happen:

def hash_password_phase5(password, connection_id, mode)
  char_7 = connection_id[7].ord

  result = ""
  0.upto(4) do |i|
    edx = password[i].ord ^ connection_id[i + char_7 - (((char_7 * 0x55555556) >> 32) * 3) + mode].ord
    if(edx < 0x1f)
      edx += 0x20
    elsif(edx > 0x7F)
      edx = edx - 0x7E + 0x20
    end
    result << edx.chr
  end

  return result
end

And that’s the final product! Remember, at each step of the way I was testing and re-testing to make sure it worked for a few dozen test strings. That’s important because it’s really, really easy to miss stuff.

The rest of the level

Now, getting back to the level…

As we saw above, after logging in, the real client sends “list users” then “print key”. “print key” fails because the user doesn’t have administrative rights, so presumably one of the users printed out on the “list users” page does.

I went through and manually entered each user into the program, with the same username as password (seemed like the thing to do, since grumpy’s password was “grumpy”) until I reached the user “duchess”. When I tried “duchess”, I got the prompt:

challenge: /\&[$
answer?

When I was initially reversing the password hashing, I noticed that the hash_password() function was called a second time near the strings “challenge:” and “answer?”! The difference was that instead of passing the integer 1 as the mode, it passed 7. So I tried calling hash_password(‘/\&[$’, connection_id, 7) and got the response, “<=}-^”.

I sent that, and the key came back! Here’s the full session:

connection ID: Tk8)k)e3a[vzN^


*** Welcome to the ACME data retrieval service ***
what version is your client?
version 3.11.54
hello...who is this?
duchess
enter user password
/MJ#L
hello duchess, what would you like to do?
print key
challenge: /\&[$
answer?
<=}-^
the key is: The only easy day was yesterday. 44564

I submitted the key with literally three minutes to go. I was never really sure if I was doing the right thing at each step of the way, but it worked!

An alternate solution

If I’d had the presence of mind to realize that the username would always be the password, there’s another obvious solution to the problem that probably would have been a whole lot easier.

The string “grumpy” (as both the username and the password) is only read in three different places in the binary. It would have been fairly trivial to:

Find a place in the binary where there's some room (right on top of the old "grumpy" would be fine)
Put the string "duchess" in this location (and the other potential usernames if you don't yet know which one has administrative access)
Patch the three references to "grumpy" to point to the new string instead of the old one - unfortunately, using a new location instead of just overwriting the strings is necessary because "duchess" is longer than "grumpy" so there's no room
Run the program and let it get the key itself

That would have been quicker and easier, but I wasn’t confident enough that the usernames and passwords would be the same, and I didn’t want to risk going down the wrong path with almost no time left, so I decided against trying that.

Conclusion

This wasn’t the most exciting level I’ve ever done, but it was quick and gave me the opportunity to do some mildly interesting reverse engineering.

The main idea was to show off my process - translate line by line, instrument it, debug till it works, then refactor and reduce and clean up the code!