02 June 2011

Geolocational Log Analysis: Think Globally, Act Locally


In many network environments the administrators and security engineers have an understanding of the full geographical scope and reach of their network. While some corporations have a global audience and expect traffic from the far reaches of the world, others are more localized and target a specific small region.

A health care provider for Alaska would monitor its network to ensure that connections are limited to its main source of users, i.e. those in Alaska. An insurance company in St. Louis will see traffic mostly from IP addresses in Missouri, but also from Illinois, because the city sits on the state line.

Occasionally, administrators may notice connections being made from Hawaii, Bermuda, or Italy, signifying users who are on vacation but are still wired in to their work. However, a long-term series of connections from a subscriber of Eircom, Ireland’s largest ISP, should spark the interest of the network administrator of a Seattle tax firm.

While anonymous web connections from global addresses are common, specific attention should be paid to such addresses being used to access password-protected areas of a corporation. This could include remote file access, VPN and web-based corporate email.

In such cases the logs from these applications, usually supplied in plain text or W3C format, contain details about each transaction, including the remote IP address and the account name being authorized. In reviewing logs from various incident responses, cmdLabs has found that a short daily log review could help smaller corporations quickly determine whether a user account was compromised and accessed from a remote location.

For example, the log sample below from a Cisco ASA tracks VPN connections. The user “cmdLabs\bbaskin” was accessed from the IP address 159.134.100.100 on 2 April 2011, an IP that was traced back to Ireland. A few hours later the same account was accessed from an IP address in Austria.



Apr  2 21:53:37 192.168.1.1 Apr 02 2011 21: 53:08: %ASA-6-302013: Built outbound TCP connection 7823 for inside:10.10.10.50/389 (10.10.10.50/389) to NP Identity Ifc:192.168.1.1/1047 (192.168.1.1/1047)
Apr  2 21:53:37 192.168.1.1 Apr 02 2011 21: 53:08: %ASA-6-104: AAA user authentication Successful : server =  10.10.10.50 : user = cmdLabs\bbaskin
Apr  2 21:53:37 192.168.1.1 Apr 02 2011 21: 53:08: %ASA-6-113009: AAA retrieved default group policy (DfltGrpPolicy) for user = cmdLabs\bbaskin
Apr  2 21:53:37 192.168.1.1 Apr 02 2011 21: 53:08: %ASA-6-113008: AAA transaction status ACCEPT : user = cmdLabs\bbaskin
Apr  2 21:53:37 192.168.1.1 Apr 02 2011 21: 53:08: %ASA-6-734001: DAP: User cmdLabs\bbaskin, Addr 159.134.100.100, Connection Clientless: The following DAP records were selected for this connection: DfltAccessPolicy


For this small set of data it is trivial to query each IP address to determine its country of origin, netblock owner, and other details that would highlight unauthorized access. The problem arises when you have hundreds of thousands of such transactions in your daily log files.




One service that cmdLabs uses regularly is the IP to ASN WHOIS server run by Team Cymru. This server provides quick and easy access to country codes for a given IP address. However, it has two limitations. First, it requires Internet access, which is not readily available from a forensic workstation. Second, to process a large batch of IPs you have to use their Netcat process, which only returns ASNs and not country codes. To overcome these limitations I've developed a simple solution that can process hundreds of thousands of IP addresses to determine country codes.

This solution is a small Python script called IP2CC that takes an IP address as input and outputs the originating country code for that IP. This solution requires three components:
  1. The free country code database located at http://www.maxmind.com/app/geolitecountry (updated monthly)
  2. The Python API module used to access this database, located at https://github.com/appliedsec/pygeoip
  3. The ip2cc.py script, downloadable at the end of this blog post.
The script allows for input to be given via the command line, stdin, or an input file. In normal use it will simply output the country code. With the -c or -t option the output will contain both the IP and country code in either comma-separated (CSV) or tab-separated (TSV) form, respectively.


python ip2cc.py [-i <IP address>] [-f <input file>] [-c] [-t]

> python ip2cc.py -i 11.11.11.11
US
> python ip2cc.py -i 22.22.22.22 -c
22.22.22.22,US
> echo 33.33.33.33 | python ip2cc.py
US
> python ip2cc.py -f IP.txt -c
14.48.7.101,AU
12.51.21.19,US
10.61.14.9,Internal
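
To make the behavior above concrete, here is a minimal sketch of the lookup logic. It is not the ip2cc.py source. It assumes that "Internal" is reported for private address ranges, uses a hypothetical "Unknown" label when the database has no answer, and takes the lookup function as a parameter so the logic can be exercised without the database. The pygeoip calls (`pygeoip.GeoIP` and `country_code_by_addr`) are the library's own; the function names here are illustrative only.

```python
import ipaddress


def country_code(ip, lookup):
    """Return 'Internal' for private addresses, else the code from lookup(ip)."""
    addr = ipaddress.ip_address(ip.strip())
    if addr.is_private:
        # Assumption: RFC 1918 and similar ranges are what the post labels "Internal".
        return "Internal"
    # "Unknown" is a hypothetical fallback for addresses the database cannot place.
    return lookup(str(addr)) or "Unknown"


def format_row(ip, code, sep=None):
    """Bare code by default; 'ip<sep>code' when a separator (',' or a tab) is given."""
    return code if sep is None else f"{ip}{sep}{code}"


def make_pygeoip_lookup(db_path):
    """Build a lookup backed by pygeoip (imported lazily; third-party module)."""
    import pygeoip

    geo = pygeoip.GeoIP(db_path)
    return geo.country_code_by_addr
```

With pygeoip installed and the MaxMind database downloaded, `country_code("12.51.21.19", make_pygeoip_lookup("GeoIP.dat"))` would perform a real lookup; the database file name is an assumption and depends on what you downloaded.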

In one use, we'll eliminate known intranet/extranet IP addresses and run the resulting list through IP2CC to produce a master list of foreign accesses. This script will run in Linux and OS X in conjunction with the native OS command line tools. For a Windows environment you will find additional capabilities by installing the necessary GnuWin32 components. For example, when reviewing an NCSA-formatted log with the IP address in the first field:

D:\> type in051611.log | egrep -v "^192" | gawk "{print $1}" | python ip2cc.py -t | egrep -v "US|Internal" | gawk -F\t "{print $1}" | sort | uniq > DailyForeignIPs.txt

D:\> for /F %i in (DailyForeignIPs.txt) do grep "%i" in051611.log >> DailyForeignConnections.txt

The first command above saves a simple text listing of all unique foreign IP addresses into a file for processing. The second command takes each IP address from that file and compares it back against the log to extract every line that contains it. The resulting DailyForeignConnections.txt can then be quickly reviewed to determine if any accounts were accessed from a foreign IP address.
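
For readers without the GnuWin32 tools, the same two steps can be sketched in Python. This is an illustration rather than part of IP2CC: `code_for` is a hypothetical callable that maps an IP address to a country code, and the "192" prefix filter simply mirrors the egrep in the command above. One difference to be aware of is that this sketch matches on the first field only, whereas grep matches the address anywhere in the line.

```python
def foreign_ips(log_lines, code_for, home=("US", "Internal"), skip_prefix="192"):
    """Collect the unique first-field IPs whose country code is not in `home`."""
    found = set()
    for line in log_lines:
        fields = line.split()
        # Skip blank lines and addresses excluded by prefix, as egrep -v "^192" does.
        if not fields or fields[0].startswith(skip_prefix):
            continue
        if code_for(fields[0]) not in home:
            found.add(fields[0])
    return sorted(found)


def lines_from(log_lines, ips):
    """Return every log line whose first field is one of `ips`."""
    wanted = set(ips)
    hits = []
    for line in log_lines:
        fields = line.split()
        if fields and fields[0] in wanted:
            hits.append(line)
    return hits
```

Passing the result of `foreign_ips` to `lines_from` over the same log yields the equivalent of DailyForeignConnections.txt.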

Dealing with the VPN logs shown earlier, we'll change our command line a bit. Using the standard Cisco log file index as a source we can see that the log id of 734001 will show us the remote IP address of a user login. We'll search the log for that id and then parse out the IP address in the 15th field. An additional hindrance is that the IP address is appended with a comma, which we’ll remove with the ‘tr’ command.

D:\> type asavpn-051611.log | findstr "734001" | gawk "$15 !~ /^192/ {print $15}" | tr -d "," | python ip2cc.py -t | egrep -v "US|Internal" | sort | uniq > DailyVPNForeignIPs.txt
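
If counting fields feels fragile, the 734001 record can also be parsed by pattern. The sketch below is an assumption-laden alternative, not the method used in the command above: the regular expression is written against the single sample line shown earlier in this post and may need adjusting for other ASA versions or connection types. The names `DAP_RE` and `vpn_logins` are illustrative.

```python
import re

# Pattern derived from the sample 734001 line in this post; an assumption, not a spec.
DAP_RE = re.compile(
    r"%ASA-6-734001: DAP: User (?P<user>[^,]+), Addr (?P<addr>[^,\s]+),"
)


def vpn_logins(log_lines):
    """Yield (user, remote_ip) for each 734001 record found in the log."""
    for line in log_lines:
        match = DAP_RE.search(line)
        if match:
            yield match.group("user"), match.group("addr")
```

Keeping the account name next to the address means that, once the address is resolved to a country, you already know which account to investigate.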

This is ultimately just a very simple Python script. In-house, we use it as a mere function within larger processes, but its simplicity allows it to be used in a variety of result-tuning processes. Customization is easy. At times I'll make an offshoot of the script to process input from the `uniq` command with the `-c` count option. `uniq -c` adds a new column that specifies the total number of instances of each IP address, which is useful when evaluating the persistence of a single IP amongst thousands. A few small changes to the Python will allow you to read this count and add it to the CSV output for easy integration into a spreadsheet.
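
As a rough illustration of that offshoot, the sketch below reads `uniq -c` style lines (a count, then the IP address) and emits CSV rows. It is not the modified script itself: `code_for` is a hypothetical callable that returns the country code for an IP, and the column order of IP, country code, count is an assumption.

```python
import csv
import io


def counted_csv(uniq_lines, code_for):
    """Turn 'count ip' lines from uniq -c into 'ip,country,count' CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in uniq_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        count, ip = parts
        writer.writerow([ip, code_for(ip), int(count)])
    return out.getvalue()
```

Sorting the resulting spreadsheet by the count column brings the most persistent addresses to the top.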

Using a tool like IP2CC is a first step toward opening an administrator's eyes to traffic beyond their network. A good administrator or security engineer should monitor not only the traffic that flows across their network but also the perceived traffic that flows from a network’s outer nodes to the Internet. Monitoring for your company’s presence in spam blacklists, a malware rating on services like Web of Trust, and other indicators can give clues that an infection or intrusion may be underway within your network.

Downloads:
IP2CC Python Source Code v1.0
