08 February 2012

Malicious PDF Analysis: Reverse code obfuscation

I normally don't find the time to analyze malware at home, unless it is somehow targeted at me (like the prior write-up of an infection on this site). This past week I received a very suspicious PDF in an email that made it through Gmail's spam filters and grabbed my attention.

The email arrived in my Google Mail inbox and was easily accessible at first, but within two days Google flagged the attachment as a virus and blocked downloading it. The email had one attachment, 92247.pdf, which could still be recovered as Base64 by viewing the email in its raw form.
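For anyone wanting to repeat that recovery step, here is a minimal Node.js sketch of turning the Base64 body from the raw email back into bytes. The function names `decodeAttachment` and `looksLikePdf` are my own, purely illustrative, and the sketch assumes the Base64 text has already been copied out from between the MIME boundaries by hand.

```javascript
// Hypothetical helper: decode a Base64 attachment body copied out of a raw email.
function decodeAttachment(base64Body) {
  // MIME wraps Base64 across many short lines; strip all whitespace before decoding.
  const compact = base64Body.replace(/\s+/g, "");
  return Buffer.from(compact, "base64");
}

// Sanity check: a PDF file normally begins with the "%PDF-" header.
function looksLikePdf(bytes) {
  return bytes.subarray(0, 5).toString("latin1") === "%PDF-";
}
```

The resulting buffer can be written to disk with `fs.writeFileSync`, ideally inside an isolated analysis machine and under a name that nothing will open automatically.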

A quick look in a hex editor showed that the file, only 13,205 bytes in size, included no obvious dropper, decoy, or even displayable PDF data. There was just one object of note, which contained an XML subform with embedded JavaScript. Boring...
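The same first pass can be roughly scripted instead of eyeballed in a hex editor. The sketch below counts a few markers in the raw bytes. The function name `triagePdf` and the choice of markers are my assumptions, not a complete list, and a file that obfuscates PDF names (for example with `#xx` hex escapes) or hides the script inside a compressed stream would slip past plain matching like this.

```javascript
// Hypothetical triage helper: count a few markers of interest in raw PDF bytes.
function triagePdf(bytes) {
  // latin1 maps every byte to exactly one character, so nothing is lost or merged.
  const text = bytes.toString("latin1");
  const count = (re) => (text.match(re) || []).length;
  return {
    size: bytes.length,
    objects: count(/\b\d+\s+\d+\s+obj\b/g),      // "12 0 obj" style object headers
    streams: count(/\bstream\r?\n/g),            // stream starts ("endstream" is not counted)
    javascript: count(/\/(?:JavaScript|JS)\b/g), // classic JavaScript action names
    xfa: count(/\/XFA\b/g),                      // XML forms, where subform scripts live
    scriptTags: count(/<script\b/gi),            // script elements in uncompressed XML
  };
}
```

A small file with a handful of objects, no page content, and a script-bearing form fits the "nothing here but the attack" picture described above.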

Upon examining the JavaScript, I saw a large block of data of the kind that would normally hold the shellcode, or further JavaScript, used to attack the victim's system. However, this example proved odd. The block (abbreviated below) consisted entirely of integers between 0 and 74. This is not standard shellcode.

    arr='0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';
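A quick way to confirm that observation is to split the string on `@` and summarize the values. This is only a sketch run against the abbreviated sample above, and the helper name `summarize` is mine, not something from the malicious script.

```javascript
// The abbreviated sample from above; the full array in the PDF is much longer.
const arr = '0@1@2@3@4@1@5@5@6@7@8@9@0@1@2@3@10@10@10@11@3@12@12@12@11@3@5@5@5@11@9';

// Hypothetical helper: split the '@'-delimited string and summarize its values.
function summarize(encoded) {
  const values = encoded.split("@").map(Number);
  return {
    count: values.length,
    allIntegers: values.every(Number.isInteger),
    min: Math.min(...values),
    max: Math.max(...values),
    distinct: new Set(values).size,
  };
}

console.log(summarize(arr));
```

The abbreviated sample tops out at 12, while the full block reached 74. Either way, at most 75 distinct values is far too few for raw bytes, which can take any of 256 values, so the numbers are unlikely to be shellcode as they stand.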

So I started looking at the surrounding code: