16 August 2016

Running the Labyrenth: Unit 42 CTF

At least once a year I try to publish my work process for a Capture The Flag (CTF) event. If you're not familiar with CTFs, they're a timed challenge of very difficult or obscure challenges to gain a "flag" to submit for points. Some enjoy these, some feel them a waste of time. At the very least, they're exercises to keep your mind sharp and your skills prepared for the unexpected.

This year, Palo Alto Networks (notably their threat research team Unit 42), put together a great CTF that was open to the public for one month. Uniquely, they offered sizable cash prizes to the first person to win each category of challenges and to the top winners overall.

Categories of challenges were separated between: Windows, Unix, Documents, Mobile, Random, and Threat. While some of these are apparent, and Random was a cool assortment of off-the-wall stuff, Threat was unique for being very abstract problems of pattern analysis and writing YARA rules. Overall, nearly 40 challenges that were woven through the narrative of the excellent film starring David Bowie: Labyrinth.

While a small number were able to complete each and every one of these challenges, I was excited to just do a handful, about 32. What I'll provide here are just a few of those where I feel like I did something unique or profoundly stupid to obtain the answer.




Windows Challenge #1


The first challenge is a 32-bit executable named AntiD.exe. That wants a typed-in key and compares it to an encoded value to give a Go/NoGo:



 A quick glance in IDA Pro shows a single subroutine, no strings, no realistic imports, leading one to believe that this is a packed executable. A hex view of the file shows section names of ".ups0" and ".ups1". A little farther down, at the beginning of the first section, is the text "3.08" "UPX!":

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

000003D0   00 00 00 00 00 00 00 00  00 00 00 33 2E 30 38 00              3.08
000003E0   55 50 58 21 0D 09 02 07  1C B5 B1 22 80 D7 23 65   UPX!     µ±"€×#e
000003F0   81 66 00 00 EE 0F 00 00  00 28 00 00 26 00 00 ED    f  î    (  &  í

Obviously, we think this is UPX packing :) However, UPX fails to unpack it.

Now, just from guessing, I was able to solve this issue within five seconds and move on. Others debugged it manually until it unpacked in memory. The "official" method should've been to note the error message: "CantUnpackException: file is modified/hacked/protected; take care!!!" and review the UPX source code. Look for the error message and work backwards from there:

    if (memcmp(isection[0].name,"UPX",3) == 0)
    {
        ...
    }
    if (is_packed && ih.entry < isection[2].vaddr)
    {
        ...
            if (offset >= 0 && find(buf + offset + 1, sizeof(buf) - offset - 1, magic, 7) >= 0)
                x = true;
        } catch (...) {
            //x = true;
        }
        if (x)
            throwCantUnpack("file is modified/hacked/protected; take care!!!");
        else
            throwCantUnpack("file is possibly modified/hacked/protected; take care!");
        return false;   // not reached
    }


Note here that the routine starts with a check to see if the first three bytes of the section name are literally "UPX". If not, start throwing errors. By simply changing "ups" to "UPX", the file can instantly be unpacked for analysis. This is the guessing part that I did in five seconds.



Once you're in the executable, it's a straightforward matter of finding the data and the encoding routine. This routine, below, takes the input values, does math on it, and checks each byte against an encoded set of bytes.



Each byte is XOR by 0x33, added to by 0x44, XOR by 0x55, subtracted by 0x66, and finally XOR again by an incremental value. That last part makes it extremely difficult to reverse the routine, as we'd have to brute force that value. Since the routine is simple, the easiest method is to just brute force from the beginning. We can do this byte-by-byte:


#!/usr/bin/env python
key = (0x8C, 0xF1, 0x53, 0xA3, 0x08, 0xD7, 0xDC, 0x48, 
       0xDB, 0x0C, 0x3A, 0xEE, 0x15, 0x22, 0xC4, 0xE5, 
       0xC9, 0xA0, 0xA5, 0x0C, 0xD3, 0xDC, 0x51, 0xC7, 
       0x39, 0xFD, 0xD0, 0xF8, 0x3B, 0xE8, 0xCC, 0x03, 
       0x06, 0x43, 0xF7, 0xDA, 0x7E, 0x65, 0xAE, 0x80)

def crack(guess):
    v2 = 0
    for i in range(0, len(guess)):
        v4 = ord(guess[i]) ^ 0x33 & 0xFF
        v5 = v4 + 0x44 & 0xFF
        v6 = v5 ^ 0x55 & 0xFF
        v7 = v6 - 0x66 & 0xFF
        v8 = v7 ^ v2 & 0xFF
        if not v8 == key[i]:
            return False
        v2 += v8 
    return True
    
code = "PAN{"
for pos in range(4, len(key)):
    for char in range(32, 127):
        tmp = crack(code + chr(char))
        if tmp:
            code = code + chr(char)
            break
print(code)

This will print out the flag:

PAN{C0nf1agul4ti0ns_0n_4_J08_W3LL_D0N3!}

Windows Challenge #4


This challenge gave quite a few people issues, and frustrated me for awhile. This is a GUI application that asks for input and gives a Go/NoGo.



Opening up the file in IDA Pro, we look for DialogFunc(), look for typed input to be read with GetDlgItemText(), and work from there. This routine starts with loading a whole set of encoded values into an array using obscure movdqa routines. It then does some obtuse methods for determining the correct length of the input, 32 bytes:

        v25 = -1;
        do
          ++v25;
        while ( GUESS[v25] );
        if ( v25 >= 32 )

If so, the typed value is parsed to ensure that each byte is either a 1, 2, or 3:

        while ( GUESS[pos2] - 0x31 <= 2 )
        {
          ...
        }


The value is then passed into a validation routine. This routine does something with every two bytes, something I spent too much time trying to understand, before I just skipped it entirely. It's just an initial Go/NoGo on the typed value, so we'll force it to Go each time. Once done, it'll add up each number of the value and create a total sum:

        do
        {
          ++pos;
          sum_of_serial += GUESS[pos] - 0x30;
        } while (pos < guess_len);

From there, we see that each number of the typed value is XOR by the sum_of_serial, and XOR again by the respective byte of the array of encoded values. It doesn't even check the values. If the earlier validation routine is correct, it'll print whatever here, even wrong answers.

So we don't know the valid serial number to pass that earlier routine, we don't want to figure out that earlier routine, and we don't know the valid sum of digits. That latter part we can work out easily, though, by working through the expected flag of "PAN{}"

"P" (0x50) XOR "encoded_byte0" (0x27) = 0x77
That first byte can then be either a 1, 2, or 3. So XOR 0x77 by each:
0x31 (1) XOR 0x77 = 0x46 (70)
0x32 (2) XOR 0x77 = 0x45 (69)
0x33 (3) XOR 0x77 = 0x44 (68)

Our sum is one of those three values. By doing the same down the line ("A" against the second encoded byte, "N" against the third), we see only one sum value working: 0x44 (68). Yay!

Now what? We know the sum, but we still don't know the encoded serial number. All we know is that each value is a 1, 2, or 3. So... guess?

I wrote a program to just guess the byte for each and print to an array. From there, pick out the values that make sense.


#!/usr/bin/env python
"""
000000000025EEB0  27 00 00 00 34 00 00 00 38 00 00 00 0E 00 00 00  '...4...8.......  
000000000025EEC0  36 00 00 00 47 00 00 00 19 00 00 00 11 00 00 00  6...G...........  
000000000025EED0  07 00 00 00 43 00 00 00 40 00 00 00 03 00 00 00  ....C...@.......  
000000000025EEE0  1A 00 00 00 14 00 00 00 23 00 00 00 47 00 00 00  ........#...G...  
000000000025EEF0  1A 00 00 00 19 00 00 00 04 00 00 00 29 00 00 00  ............)...  
000000000025EF00  17 00 00 00 02 00 00 00 13 00 00 00 12 00 00 00  ................  
000000000025EF10  0F 00 00 00 2A 00 00 00 0E 00 00 00 46 00 00 00  ....*.......F...  
000000000025EF20  20 00 00 00 01 00 00 00 44 00 00 00 29 00 00 00   .......D...)...  
000000000025EF30  04 00 00 00 1A 00 00 00 1A 00 00 00 03 00 00 00  ................  
000000000025EF40  10 00 00 00 13 00 00 00 28 00 00 00 02 00 00 00  ........(.......  
000000000025EF50  1D 00 00 00 12 00 00 00 28 00 00 00 04 00 00 00  ........(.......  
000000000025EF60  13 00 00 00 41 00 00 00 1B 00 00 00 29 00 00 00  ....A.......)...  
000000000025EF70  2A 00 00 00 07 00 00 00 05 00 00 00 39 00 00 00  *...........9...  
000000000025EF80  17 00 00 00 3B 00 00 00 44 00 00 00 3B 00 00 00  ....;...D...;...  
000000000025EF90  0B 00 00 00 00 00 00 00  
"""
keys = (0x27, 0x34, 0x38, 0x0E, 0x36, 0x47, 0x19, 0x11, 
        0x07, 0x43, 0x40, 0x03, 0x1A, 0x14, 0x23, 0x47, 
        0x1A, 0x19, 0x04, 0x29, 0x17, 0x02, 0x13, 0x12, 
        0x0F, 0x2A, 0x0E, 0x46, 0x20, 0x01, 0x44, 0x29, 
        0x04, 0x1A, 0x1A, 0x03, 0x10, 0x13, 0x28, 0x02, 
        0x1D, 0x12, 0x28, 0x04, 0x13, 0x41, 0x1B, 0x29, 
        0x2A, 0x07, 0x05, 0x39, 0x17, 0x3B, 0x44, 0x3B, 
        0x0B)

sum_value = 0x44
result1 = list()
result2 = list()
result3 = list()
print('\n' + hex(sum_value) + ' ' + '='*50)
for i in keys:
    result1.append(i ^ sum_value ^ (0x31))
    result2.append(i ^ sum_value ^ (0x32))
    result3.append(i ^ sum_value ^ (0x33))

print '1',
for i in result1: print chr(i),
print '\n2',
for i in result2: print chr(i),
print '\n3',
for i in result3: print chr(i),

This resulted in:

1 RAM{C2ldr65voaV2olq\bwfgz_{3Ut1\qoovef]whg]qf4n\_rpLbN1N~
2 QBNx@1ogq56ulbU1lor_atedy\x0Vw2_rllufe^tkd^re7m_\qsOaM2M}
3 PCOyA0nfp47tmcT0mns^`udex]y1Wv3^smmtgd_uje_sd6l^]prN`L3L|

By process of elimination, we can start throwing out invalid characters, keep in spaces ("_") where appropriate, and try to piece out words. That's right, it's time to play!





 After ten minutes, my first attempt worked awesome:

 AM{C2ldr65voaV2olq bwfgz_ 3Ut1 qoovef whg qf4n\_rpLbN1N 
  N @1ogq56ulbU1lor_atedy x0Vw2_rllufe tkd re7m_\qsOaM2M}
P   A0nfp47tmcT0mns  udex y1Wv3 smmtgd_uje_sd6l ]prN`L3L 

PAN{C0ngr47ulaT1ons_buddy_y0Uv3_rooted_the_se4m__pr0aN3L}
312113321332213213321332213213322113133213332122


Congratulations buddy you've rooted the seam .. proanel?



Something's not quite right. Then you realize that the serial is shorter than the flag; it'll reiterate over it. The first part is obvious, so all we need is the first 32 bytes of the serial: 31211332133221321332133221321332

Input that, and you get the whole flag.

... I was close ...




Docs Challenge #5


As much as you'd gather from the name, the Docs track was about various document formats. Most centered around obtaining pseudo-encryption buried within a document, reversing it, and getting the decrypted key out. Challenge 5, however, was a bit more fun.


Running the standard oletools suite against the file showed very little of interest. Just a workbook with three sheets:


E:\CTF\PaloAlto\Docs\5\Calc>oledir.py calc.xls
oledir 0.50 - http://decalage.info/python/oletools
OLE directory entries in file calc.xls:
----+------+-------+----------------------+-----+-----+-----+--------+------
id  |Status|Type   |Name                  |Left |Right|Child|1st Sect|Size
----+------+-------+----------------------+-----+-----+-----+--------+------
0   |<Used>|Root   |Root Entry            |-    |-    |2    |4C      |12096
1   |<Used>|Stream |Workbook              |18   |-    |-    |2       |36567
2   |<Used>|Storage|_VBA_PROJECT_CUR      |1    |16   |14   |0       |0
3   |<Used>|Storage|VBA                   |-    |-    |7    |0       |0
4   |<Used>|Stream |dir                   |-    |-    |-    |0       |551
5   |<Used>|Stream |Sheet1                |4    |6    |-    |9       |991
6   |<Used>|Stream |Sheet2                |-    |-    |-    |19      |991
7   |<Used>|Stream |Sheet3                |5    |9    |-    |29      |991
8   |<Used>|Stream |__SRP_0               |-    |-    |-    |39      |1677
9   |<Used>|Stream |__SRP_1               |8    |11   |-    |54      |198
10  |<Used>|Stream |__SRP_2               |-    |-    |-    |58      |762
11  |<Used>|Stream |__SRP_3               |10   |12   |-    |64      |156
12  |<Used>|Stream |ThisWorkbook          |-    |13   |-    |67      |1408
13  |<Used>|Stream |_VBA_PROJECT          |-    |-    |-    |7D      |2682
14  |<Used>|Stream |PROJECT               |3    |15   |-    |A7      |545
15  |<Used>|Stream |PROJECTwm             |-    |-    |-    |B0      |104
16  |<Used>|Stream |\x05SummaryInformation|-    |17   |-    |B2      |224
17  |<Used>|Stream |\x05DocumentSummaryInf|-    |-    |-    |B6      |308
    |      |       |ormation              |     |     |     |        |
18  |<Used>|Stream |\x01CompObj           |-    |-    |-    |BB      |107
19  |unused|Empty  |                      |-    |-    |-    |0       |0


Searching for VBA macros showed only one script, as minor as it was:

E:\CTF\PaloAlto\Docs\5\Calc>olevba.py calc.xls --decode
olevba 0.50 - http://decalage.info/python/oletools
Flags        Filename
-----------  -----------------------------------------------------------------
OLE:M-S-H--- calc.xls
===============================================================================
FILE: calc.xls
Type: OLE
-------------------------------------------------------------------------------
VBA MACRO ThisWorkbook.cls
in file: calc.xls - OLE stream: u'_VBA_PROJECT_CUR/VBA/ThisWorkbook'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Sub excelulate()
    Application.Quit
End Sub


There's really not that much here from a static perspective. Opening it up shows three sheets: two blank and one with a simple flag verifier:



I then spent hours poking at this document, which eventually leads into throwing random tools at it, which eventually leads to just randomly clicking around Excel. This leads to the discovery of a hidden sheet named "secret". While this page looks blank, it contains various fields with Excel formulas with white-on-white text. Changing this shows a formula-based "script" that prompts the document to open an incrementing number of calculators for each failed attempt:



Well, there's still no flag functionality. There's also a reference to a cell of "supersecret!F13". If there's one hidden page, what about another? Using VBE, we can iterate through the sheets and make them visible:




Upon doing so, another sheet appears: supersecret

This sheet contains our sought after character decoding routines, strewn about in various cells:



Each of these fields has a long stretch of text and a four-letter 'key'. The position of that key, as an ASCII char, is the decoded byte. The final cell, F13, takes the concatenation of all these characters and compares them to the input text field on the first sheet. First off, let's see what these fields actually show by turning off formulas:



Well, there's our flag spread about. We could try to work with the formula to reverse the key:

=RETURN(EXACT(CONCATENATE(D7,A5,C5,B4,E20,B6,A8,B8,A12,B10,E10,C9,B13,D12,C11,B16,A25,A18,B19,C20,B21,B2,D23,B24,E4,B26,D16,A21,C14,A16),Sheet1!B3))

However, it's just as easy to modify it. Remove the "RETURN" and "EXACT" operands and you're left with a single CONCATENATE that shows the flag:



Bit Bucket


There are a few other cool stuff I'll just call out with little context:


Windows Challenge #8

This challenge was a Windows device driver that was my nemesis. Mostly because I tried for too long to dynamically solve it and, in a drunken state, installed WinDbg on my Windows 10 host. This caused numerous memory issues, crashed VMWare every time it started, and left me spending two days trying to restore my system back to normal. Then another two weeks trying to figure out why hitting Prt Scrn would instantly freeze my system (it was due to the debugger).

It had a number of great messages, including the flag!




Unix Challenge #3

This was one of those weird challenges that was solved within minutes, only because that morning a coworker had sent a link to a cool obfuscation method which I thought would be great for a CTF. And here it was. REpsych.

REpsych is a cool tool that creates an executable where the graph view in IDA Pro creates a message. The limitation here is that it will only work in IDA Pro, which is kind of a cheap shot for some. At least you can use the free IDA to get it.

Others found a trace to "repsych.asm" in the file, which was the major clue. I found it by working through the error messages:



Seeing that, I kept expanding the upper limit in IDA to show more nodes.



Once reached, the answer popped out:



Unix Challenge #6

In this challenge you were given a client and server app compiled in Go. One issue that others noted afterward was that IDA refused to decompile it because of the unknown type sizes with Go:




Windows Challenge #7

In this challenge you were given a program that reads the contents of an arbitrary "file.txt", encrypts it, and transmits it over the network. You receive the program and the PCAP and told to determine the flag that was transmitted. This one was fairly straight forward. After retrieving each byte out per frame, the flag was Base64 encoded with a custom alphabet, and then run through an encryption routine:

  v13 = 0xBADA55;
  delta = 0x9E3769B9;
  v12 = 0x4913092;
  v11 = 0x12345678;
  v10 = 0xDEADBEEF;
  v8 = *data;
  v7 = *(data + 4);
  sum = 0;
  for ( i = 0; i < num_rounds; ++i )
  {
    v8 += (*(key[sum & 3]) + sum) ^ (v7 + ((v7 >> 5) ^ 16 * v7));
    v13 += 4092;
    for ( j = 8; j < 32; ++j )
    {
      v11 *= 8;
      v13 -= 64;
      v12 -= 8;
    }
    v10 = 64;
    sum += delta + 0x1000;
    v7 += (*(key[sum >> 11] & 3)) + sum) ^ (v8 + ((v8 >> 5) ^ 16 * v8));
  }


As with any unique encryption routines, focus on the hardcoded seed values. However, these show up with very few results. The value of 0x9E3769B9 refers to a thread on the Rune Server forum where there's a discussion of modifying an encryption routine to allow unique clients. If you take that function name (method248()) and search for the rest of the reversed Rune Server source code, you find that this is eXtended Tiny Encryption Algorithm (XTEA) with the value changed from the standard 0x9E3779B9.

Now that we know it's what appears to be a custom XTEA encryption**
** Update: As my good friend Dan Dash pointed out, the seed of 0x9E3769B9 is 0x1000 short of the expected delta, which is added back in at the bottom. This is standard XTEA with just a small change so keep automated tools from seeing it.

Overall, great fun and a great set of challenges by Palo Alto Networks Unit 42! Thanks!

And to David Bowie:



4 comments:

  1. Great write up. I too got stuck with windows#4, and I had the same assumptions that you did. Now that the CTF is closed, I can't wait to read other great write-ups about it to see how people attacked the different issues they encountered. Plus, I want to get all the binaries from each tracks. :)

    ReplyDelete
  2. Replies
    1. The code blocks in black are my code in Python, yes. I've added shebangs to each to make that stand out. I should've clarified that up front.

      Delete
  3. Nice writeup! I really enjoyed reading this, thanks :)

    In the AntiD challenge (Windows #1), here is the code that prints the flag without bruteforcing, since the value of v2 is easy to calculate:

    https://gist.github.com/levisre/eef06125a0a2bf5dfb7c3972a10f8040

    ReplyDelete