Ghetto Forensics: Is Google Scanning Malware Email Attachments Between Researchers

14 February 2014

Disclaimer: This post is based on my experiences when sending malware via GMail (Google Mail). I'm documenting them here for others to disprove, debate, confirm, or downplay.

Update:

In the comments below, a member of Google's AntiVirus Infrastructure team provided insight into this issue. A third-party AV engine used by GMail was designed by its vendor to automatically open ZIP files with a password of 'infected'. I want to thank Google for their attention to the matter, as it shows that there was no ill intent or deliberate scanning.

As a professional malware analyst and a security researcher, a sizable portion of my work is spent collaborating with other researchers over attack trends and tactics. If I hit a hurdle in my analysis, it's common to just send the binary sample to another researcher with an offset location and say "What does this mean to you?"

That was the case on Valentine's Day, 14 Feb 2014. While working on a malware static analysis blog post, to accompany my dynamic analysis blog post on the same sample, I reached out to a colleague to see if he had any advice on an easy way to write an IDAPython script (for IDA Pro) to decrypt a set of encrypted strings.

There is a simple, yet standard, practice for doing this type of exchange. Compress the malware sample within a ZIP file and give it a password of 'infected'. We know we're sending malware samples, but need to do it in a way that:

a. an ordinary person cannot obtain the file and accidentally run it;
b. an automated antivirus system cannot detect the malware and prevent it from being sent.

However, on that fateful day, the process failed. After I compressed a malware sample, password-protected it, and attached it to an email, GMail raised a Virus Alert on the attachment.

Stunned, I tried again and got the same result. My first thought was that I had forgotten to password-protect the file. I erased the ZIP, recreated it, and received the same result. I tried a different password: same result. I used a 24-character password... still flagged as malicious.

The immediate implications of this initial test were staggering: was Google cracking the password of each and every ZIP file it received, and could it really break 24-character passwords?! No, but close.

Google already opens any standard ZIP file that is attached to an email. The ZIP contents are extracted and scanned for malware, and a detection blocks the attachment. This is why we password protect them. However, Google is now attempting to guess the password to ZIP files, using the password 'infected'. If it succeeds, it extracts the contents and scans them for malware.
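The behavior described above can be modeled in a few lines of Python. This is a hypothetical sketch of what such a scanner might do, not Google's or any vendor's actual code; the function names `has_encrypted_members` and `guess_zip_password` are made up for illustration. It also assumes legacy ZIP encryption, which is the only kind Python's `zipfile` module can decrypt.

```python
import io
import zipfile
import zlib


def has_encrypted_members(zip_bytes):
    """True if any member has the 'encrypted' bit set in its flags."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return any(info.flag_bits & 0x1 for info in zf.infolist())


def guess_zip_password(zip_bytes, candidates=("infected",)):
    """Return the first candidate that decrypts every member, else None.

    Hypothetical model of a scanner trying well-known passwords.
    """
    if not has_encrypted_members(zip_bytes):
        return None  # nothing to guess
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for candidate in candidates:
            try:
                for info in zf.infolist():
                    zf.read(info, pwd=candidate.encode("utf-8"))
            except (RuntimeError, zipfile.BadZipFile, zlib.error):
                # Wrong password, failed CRC check, or unsupported encryption
                continue
            return candidate
    return None
```

A scanner built this way never cracks anything: it only succeeds when the sender happened to pick a password that is already on its list.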

Google is attempting to unzip password-protected archives by guessing at the passwords. To what extent? We don't know. But we can try to find out.

I obtained the list of the 25 most common passwords and integrated them (with 'infected') into a simple Python script below:

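import subprocess

# The list of common passwords, plus 'infected'
pws = ["123456","password","12345678","qwerty","abc123","123456789","111111","1234567","iloveyou","adobe123","123123","sunshine","1234567890","letmein","photoshop","1234","monkey","shadow","sunshine","12345","password1","princess","azerty","trustno1","000000","infected"]
for pw in pws:
    # Build <password>.zip containing the sample, encrypted with that password.
    # Passing an argument list (not a single string) works on every platform.
    cmdline = ["7z", "a", "-p%s" % pw, "%s.zip" % pw, "malware.livebin"]
    subprocess.call(cmdline)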

This script simply compressed a known malware sample (malware.livebin) into one ZIP file per password, each named after its password. I then repeated these steps to create the corresponding 7zip archives.
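The 7zip variant of the script is not shown in this post. A minimal sketch of how both archive types could be generated follows; the helper name `archive_command` and the optional `encrypt_names` flag (7-Zip's `-mhe=on` switch, which also hides the file names inside a .7z archive) are assumptions for illustration and were not necessarily part of the original test.

```python
def archive_command(password, fmt="zip", sample="malware.livebin",
                    encrypt_names=False):
    """Return the 7z argument list for one password-protected archive."""
    cmd = ["7z", "a", "-t%s" % fmt, "-p%s" % password]
    if encrypt_names and fmt == "7z":
        # -mhe=on encrypts the archive headers, so file names are hidden too
        cmd.append("-mhe=on")
    cmd += ["%s.%s" % (password, fmt), sample]
    return cmd


# Print the commands rather than running them; pass each list to
# subprocess.call() to actually build the archives.
for fmt in ("zip", "7z"):
    print(" ".join(archive_command("infected", fmt=fmt)))
```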

I then created a new email and simply attached all of the files.

Of all the files created, all password protected and each containing the exact same malware, only the ZIP file with a password of 'infected' was scanned. This suggests that Google likely isn't using a sizable word list, but it is clearly targeting the password 'infected'.

To compensate, researchers should now move to a new password scheme, or the use of 7zip archives instead of ZIP.

Further testing was performed to determine why subsequent files were flagged as malicious, even with a complex password. As soon as Google detects a malicious attachment, it flags that attachment's file name and prevents your account from attaching any file with the same name for what appears to be five minutes. Therefore, even if I recreated infected.zip with a 50-character password, it would still be flagged. Even if I created infected.zip as an ASCII text document full of spaces, it would still be flagged.

In my layman's experience, this is a very scary grey area for active monitoring of malware. In the realm of spear phishing it is common to password protect an email attachment (DOC/PDF/ZIP/EXE) and provide the password in the body to bypass AV scanners. However, I've never seen any attack foolish enough to use a red flag word like "infected", which would scare any common computer user away (... unless someone made a new game called Infected? ... or a malicious leaked album set from Infected Mushroom?)

Regardless of the email contents, if they are sent from one consenting adult to another, in a password-protected container, there is an expectation of privacy between the two that is violated upon attempting to guess passwords en masse.

And why is such activity targeted towards the malware community, who uses this process to help build defenses against such attacks?

Notes:

Emails were sent from my Google Apps (GAFYD) account.
Tests were also made using non-descript filenames (e.g. a.txt).
Additional tests were made to alter the CRC32 hash within the ZIPs (appending random bytes to the end of each file), and any other metadata that could be targeted.
The password "infected" was not contained in the subject nor body during the process.
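The CRC32 test in the notes above can be sketched as follows. A ZIP archive records the CRC32 of each uncompressed member in its directory, and with traditional ZIP encryption that value can be read without the password, so padding the sample changes what an outside observer sees. The helper names and the placeholder sample bytes are assumptions for illustration, and the demo archives here are built unencrypted because Python's `zipfile` cannot write encrypted ones.

```python
import io
import os
import zipfile
import zlib


def pad_with_random_bytes(data, count=16):
    """Return data with `count` random bytes appended."""
    return data + os.urandom(count)


def stored_crc32s(zip_bytes):
    """Map member name -> CRC32 recorded in the ZIP directory.

    No password is passed in: only the directory is read,
    not the member data.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}


def make_zip(name, data):
    """Build a small in-memory ZIP holding a single member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


sample = b"placeholder standing in for malware.livebin"
padded = pad_with_random_bytes(sample)

original_crc = stored_crc32s(make_zip("a.txt", sample))["a.txt"]
padded_crc = stored_crc32s(make_zip("a.txt", padded))["a.txt"]
print("original: %08x  padded: %08x" % (original_crc, padded_crc))
```

The two printed values will almost certainly differ, which is why padding rules out a match on the stored CRC32 alone.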

Updates:
There was earlier speculation that the samples may have been automatically sent to VirusTotal for scanning. As shown in the comments below, Bernardo Quintero from VirusTotal has denied that this is occurring. I've removed the content from this post to avoid any future confusion.

Others have come forward to say that they've seen this behavior for some time. However, I was able to happily send files around until late last week. This suggests that the feature is not evenly deployed to all GMail users.

A member of Google's team replied below noting that this activity was due to a third-party antivirus engine used by Google.

The owner of VirusShare.com, inspired by this exchange, attempted to locate what engine this could be by uploading choice samples to VirusTotal. His uploads showed one commonality: NANO-Antivirus.

My own tests also showed positive hits from NANO-Antivirus.

At the very least, this shows how one minor, well-meaning feature in an obscure antivirus engine can cause waves of doubt and frustration for anyone who decides to use it without thorough testing.

27 comments:

DH, 16 February 2014, 13:24
Do not like. I wonder if Google has evidence of adversary groups using the same practice for malware exchange. I have dozens of samples sitting in one of my GMail accounts that use this exchange method. Some of them I've downloaded previously with no issues. I wonder if I'll still be able to retrieve them. I also wonder what connection, if any, this has to VirusTotal. For new samples, this certainly gives some interesting intel.

Jamie, 16 February 2014, 14:42
Instead of zipping how about openssl aes-256-cbc -in malware -out encryptedmalware

UUDDLRLABA_, 16 February 2014, 15:00
Yep, I've been experiencing this for a while when sending malware and research to friends. It's amazing that Google has the audacity to actually try to crack password protected archives and see what's inside. The fact that infected is on the password list seems to me an indicator that people like us are part of the target that they're trying to intercept. Not like the bad guys sending spear phishing are going to use 'infected' as their password of choice. Only us researchers do that....

Unknown, 16 February 2014, 21:57
Brian, interesting find. Thanks for doing such thorough research and sharing your findings. I hope we learn more about Google's rationale. Did you include the word "infected" anywhere else in the initial email you sent which was flagged? If you scan that unencrypted sample via VirusTotal do any of the engines detect it as malicious?

Dave Lassalle, 17 February 2014, 10:26
So if Google is passing it to VT, and you send a sample through Gmail that VT has no record of, I wonder if you can later search VT for it and have it show up?

Kyhwana, 17 February 2014, 15:21
Why not PGP encrypt it to the other person using their public key?

Bernardo.Quintero, 18 February 2014, 03:45
Hi Brian,

This is Bernardo Quintero, VirusTotal's manager. Google is not using VT for scanning all emails for malware, we have nothing to do with what you mentioned. Could you update your post to clarify it? and let me know if you need more info about VirusTotal (I have no idea how Gmail scans for malware, but it's not related to VT).

Thanks,
Bernardo

Anonymous, 18 February 2014, 08:02
Try the same thing with RAR + encrypt file list. (It's not just a Google thing; lots of AV can detect malware in a password-protected ZIP.)

Anonymous, 18 February 2014, 12:09
I imagine Google is just scanning the body of the e-mail and isolating keywords to unzip the file for scanning. It's still ominous even if this is something that is being beta tested on the power users of malware (researchers), but as far as the technique used, I think that's the best explanation as many researchers will zip and include the password in the body. For example, try the contagio convention for testing.

Unknown, 18 February 2014, 15:12
Trying to replicate the issue on my side with a standard VirusShare zipped sample download which uses the 'infected' password, but the filename is simply a hash of the file with no extension and also made a point of not having the password appear anywhere in the email. Sent to my GMail account and the sample has been sitting there waiting for me to download it for the last 23 minutes.

Will continue to monitor things on my side and see if I can determine if/when it is detected.

the JoshMeister, 18 February 2014, 17:28
My theory is that a back-end antivirus that Google's using may have signatures for certain specific .zip files that have the password "infected". In other words, the hash of that exact encrypted .zip file got added to a malware signature database at some point, probably by mistake.

Antivirus software shouldn't flag .zip files with the password "infected" as infected—but I *have* seen it happen before, which makes me wonder if that's part of what's going on here.

(Google's further step of blocking any attachment with the same file name within 5 minutes is obviously something Google is doing to try to foil automated or weak attempts to send someone malware on purpose. If my theory is correct, then Google doesn't actually know you're using the password "infected"; all it knows is that their back-end AV identifies the file as malicious.)

You could easily test this theory by uploading the same encrypted .zip files that Gmail flags as malicious to VirusTotal, and see whether it gets any hits. I wouldn't be surprised if it does. You might even be able to figure out which antivirus Google is using.

I'm also curious whether you'd see the same results with the same files when trying to e-mail them from the same Gmail account over SMTP. It's clear from your screenshot that you were testing this with the Gmail site, not an e-mail application.

Anonymous, 18 February 2014, 19:20
Upload your encrypted ZIPs to virustotal.com. You'll be surprised. I don't think this is a Google trick.

The only obvious conclusion is to stop pretending ZIP is secure.

argv, 18 February 2014, 23:38
Please note: these are my personal comments, not the comments of my employer.

I've played with this quite a bit because it drives me bonkers when Gmail blocks me trying to share samples with people. One thing to keep in mind is that depending on the utility, and how you're creating the archive, a password-protected ZIP may still have a readable table of contents (file listing).

To recreate this, on my Mac using the standard zip util:

# Make archive, password infected
argv-macbookair2:~ argv$ zip -e protected.zip suspicious.exe
Enter password:
...

# Use hex editor to see that the filename is visible
argv-macbookair2:~ argv$ xxd protected.zip | grep -i suspicious
00000a0: 0073 7573 7069 6369 6f75 732e 6578 6555 .suspicious.exeU

My theory is that Gmail looks at the archive's table of contents. Gmail will by default reject any attempt to attach a .exe directly, and so I suspect it also rejects a ZIP archive with an .exe inside it as long as it can tell it has one. Would be interesting to see what happened if you could get an archive without that listing in it, and whether Gmail would pass/deny it. My testing showed it would go through.

Another interesting test, also pointed out by others here, is that A/V scanners are pretty dumb about signaturing, and ZIP's compression algorithms are highly predictable. Here's an "encrypted" archive of eicar.com:

https://www.virustotal.com/en/file/3365fbb7a0c847f38fcbd4cc1f4a5126e63e2992c1cfaeeb9d07c230807291e4/analysis/1379967790/

That is, the ZIP compression is predictable enough to write an A/V signature based on the expected compressed representation of eicar.com. So depending on what sample you're transferring around, it's possible some malware researcher wrote a signature not only for a section of code in the malware, but also what that section might look like when archived in a zip file.

--Heather Adkins (Googler writing in her free time)

Unknown, 19 February 2014, 13:26
Hey - to protect our users from downloading malicious files, we use a combination of third party antivirus software and internal virus scanning solutions to detect whether or not attachments or other downloadable files may be harmful. Your post alerted us to the fact that one of our third party software components was checking for encryption using 'infected.' as a password.

As a result, it decrypted a limited set of zipped payloads in attempts to search for malware. We're currently working on disabling that feature and appreciate you bringing it to our attention.

- Alex Petit-Bianco, Google Antivirus Infrastructure.

Vess, 20 February 2014, 05:36
Self-respecting anti-virus researchers do NOT send malware by e-mail in ZIP archives. Not even password-protected ZIP archives. Not only is the encryption used by ZIP archives insecure and easily broken - one does not even need to break it, in order to detect that the archive contains some kinds of malware.

A ZIP archive (as well as most other kinds of archives) contains, among other things, the CRC-32 of each uncompressed (and unencrypted) file. If the file contains static malware (i.e., not a program infected by a parasitic virus or a self-modifying Trojan horse), its CRC will be the same, no matter what password is used to encrypt it. An external program can detect it in the archive without even having to bother with the encryption.

In addition, as far as I know, McAfee's scanner automatically tries the password "infected" when scanning password-protected ZIP archives; probably other scanners can do it too. The reason for this is not anything nefarious - it is because the developers of the scanner use their own product as a tool when examining incoming virus samples and these are often contained in ZIP archives protected with the password "infected". There are simply way too many people who are ignorant about cryptography and don't take sufficient precautions when sending malware by e-mail, alas.

That's why professional anti-virus researchers always use PGP when sending malware to others. It is way more secure and responsible. It also ensures that only the intended recipient can decrypt the sample - not just about anyone who knows (or can guess) the password. Sure, this method is not suitable for samples that are made publicly available (e.g., on a web or ftp site) - but responsible anti-virus researchers don't do that.

Anonymous, 22 February 2014, 20:15
Wow such all-encompassing intrusion and monitoring.

So, WTF can't Google and 99.99% of the AV vendors find Sefnit related malware on a system when scanned?!!

Aggregate Obscurity, 23 March 2014, 15:40
Also confirmed on my end. I recently worked a support case where a ".jar" file needed to be emailed to a Google Apps user. The ".jar" file wasn't malicious, but Google blocked it anyway. Only password-protecting the zip enabled it to make it through, and we only knew that after reading your post, so thanks for the help!

Makes me curious, though. I'm sure Google is scanning for signature-based recognition, but are they also using heuristics or extension-based scanning? Those are the only two other methods I'm aware of that would've flagged this support file as malicious.

Anonymous, 02 November 2016, 01:44
You can get the list of files in a ZIP file even when it has a password.
So I guess Google won't crack your ZIP file. It just tries to list the file names in the ZIP file and checks whether there are any "executable" files (.exe, .jar, etc.) in it. So you can rename the file. For instance, you can rename a.exe to a.old and zip it with a password. This ZIP file is acceptable to Gmail.