Secret Archives of Execution Evidence: CCM_RecentlyUsedApps

UPDATE 2017-04-03: Unicode strings are used when needed. See the update post.

I seem to be running into more and more systems that have Windows Prefetch disabled for one reason or another. It is especially frustrating for me as a consultant since I cannot make the changes necessary to enforce the creation of the trace files nor can I implement any kind of central logging. Without this digital forensic artifact, it becomes increasingly difficult to build out a timeline of events across all the systems involved in an incident response.

One of the evidence sources that has shown itself over and over comes from a connection with a Microsoft SCCM server. SCCM can collect inventory data from many sources, and tracking executable launches is one of them. The SCCM server does not collect this data by default; however, the logging occurs on the endpoints regardless of the settings configured on the server.

If you search for CCM_RecentlyUsedApps, you will find tons of articles about configuring SCCM to collect this data or how to perform queries to extract the collected data. If you have the ability to push this in your organization, I say do it! If you can’t, then read on so I can show you how to take advantage of this data anyway.

Data Source

The records holding the information behind CCM_RecentlyUsedApps are stored in the collection of files that make up the database behind WMI. The locations are consistent from Windows XP through Windows 10, and you will find them here:
c:\windows\system32\wbem\repository\
c:\windows\system32\wbem\repository\fs\

I have even seen some systems that have what appears to be an old version of the WMI database. It seems to roll like the Windows Registry controlset keys. When the rebuild process kicks off, a new version of the database is built and it does not carry the previous information with it. I have seen up to 003, but it would likely go further. The previous versions look like this:
c:\windows\system32\wbem\repository.001\
c:\windows\system32\wbem\repository.001\fs\

This specific artifact was a very critical piece in a previous case. It allowed us to narrow the time window of the compromise considerably. Even a single day of exposure can make a big difference in the fines against the victim company during a PCI Forensic Investigation (PFI).

You will see a handful of files in these locations. They all have to be linked together to properly parse the records. The team at FireEye did some work on reverse engineering this database and released a Python script to extract all of the available classes and namespaces. You can find their tool here:
https://github.com/fireeye/flare-wmi/tree/master/python-cim

Using this script, you can extract this data using these parameters:
Namespace: root\ccm\SoftwareMeteringAgent
Class: CCM_RecentlyUsedApps

This script was very helpful to me in a number of previous cases, although I have to mention that it is a bit of a pain to get installed properly. The other trouble that I ran into with this script, through no fault of the FireEye team, is that it can only parse the namespaces from the database if the data is not ‘corrupted’. I have found that imaging a live system causes this ‘corruption’ almost half of the time. It is frustrating to know that there are Indicator of Compromise (IOC) hits inside that data blob, but the data won’t allow for parsing.

Different Approach

As I manually looked over those seemingly lost IOC hits, I started to recognize patterns surrounding the hits. The fields holding all the property data seemed to be in the same order for all of the records of a certain system that I was reviewing at the time. I then pulled up a few systems with different OS’s from previous cases and found the same structure. YES!! The perfect setup for carving. Time to reverse engineer the record format.

The index uses a hash value in tracking and sorting structures that I won’t bore you with here. I mention it, though, because this hash is the piece that we will use to find these records. WinXP uses MD5, and newer versions use SHA256. The hash in these records is generated from the class name CCM_RecentlyUsedApps, upper cased as CCM_RECENTLYUSEDAPPS and then converted to Unicode: C\x00C\x00M\x00_\x00R\x00… (you get the point).
WinXP MD5:
6FA62F462BEF740F820D72D9250D743C
WinVista+ SHA256:
7C261551B264D35E30A7FA29C75283DAE04BBA71DBE8F5E553F7AD381B406DD8

These hashes mark the start of the records. They are themselves stored as Unicode text, for some reason: 128 bytes for the SHA256 and 64 bytes for the MD5.
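This derivation is easy to reproduce. Here is a quick Python sketch that should print the two values above, assuming the hash really is taken over the upper-cased, UTF-16LE encoded class name as described:

import hashlib

# Upper-cased class name, encoded as Unicode (UTF-16LE): C\x00C\x00M\x00...
name = "CCM_RecentlyUsedApps".upper().encode("utf-16-le")

print(hashlib.md5(name).hexdigest().upper())     # WinXP marker
print(hashlib.sha256(name).hexdigest().upper())  # Vista+ marker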

The next 16 bytes following the hash are two 8-byte FILETIMEs.

After that come 2 bytes that tell you the size of the data portion of this record. I have not seen any records using more than 2 bytes, and the maximum of 2 bytes is either 65,535 unsigned or 32,767 signed. Either of those provides plenty of space for this data, so I wouldn’t expect it to expand for size purposes. The data portion of the record includes these 2 bytes.

You can see on the right in the screenshot above that the size of the data is 432. You can then see at the bottom that I have highlighted 432 bytes (Sel 432 [1B0h]). You can also see another ‘7C261…’ starting immediately after my selection, although don’t let this fool you into thinking that these records will always be contiguous.

From here, the data is broken into 2 sections. The first section consists of various 4-byte fields, some being offsets and others being property values. The second section contains all the string-based property values, separated by double 0x00 bytes.

There are three values we can extract from the number section that are helpful.
Filesize
Offsets: Vista 178d (128+16+34), XP 114d (64+16+34)

ProductLanguage
Offsets: Vista 194d (128+16+50), XP 130d (64+16+50)

LaunchCount
Offsets: Vista 202d (128+16+58), XP 138d (64+16+58)

The string section always starts with ‘CCM_RecentlyUsedApps’ and is followed by the double 0x00 separator. If there are 4 bytes of 0x00 following, then the next string field is null. If there are 6 bytes of 0x00, then the next 2 string fields are null. Follow the pattern?

The string properties are listed in the following order:
ClassName (always “CCM_RecentlyUsedApps”)
AdditionalProductCodes
CompanyName
ExplorerFilename
FileDescription
FilePropertiesHash
FileVersion
FolderPath
LastUsedTime
LastUsername
MsiDisplayName
MsiPublisher
MsiVersion
OriginalFilename
ProductCode
ProductName
ProductVersion
SoftwarePropertiesHash

There will only be a single 0x00 at the very end of the record. Wasn’t that easy?
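To make the layout concrete, here is a rough Python carving sketch that follows the structure described above for Vista+ records. It is illustrative only; the offsets come straight from the description, the string handling ignores the Unicode cases mentioned in the update at the top, and there is no error handling, so treat the EnScript and CCM_RUA_Finder.py as the real tools:

import struct
import sys
from datetime import datetime, timedelta

# The SHA256 class-name hash, stored as Unicode text (128 bytes), marks each record.
MARKER = "7C261551B264D35E30A7FA29C75283DAE04BBA71DBE8F5E553F7AD381B406DD8".encode("utf-16-le")

def filetime(raw):
    # FILETIME: 100-nanosecond intervals since 1601-01-01
    return datetime(1601, 1, 1) + timedelta(microseconds=struct.unpack("<Q", raw)[0] / 10)

def carve(blob):
    pos = blob.find(MARKER)
    while pos != -1:
        rec = blob[pos:]
        times = (filetime(rec[128:136]), filetime(rec[136:144]))   # two 8-byte FILETIMEs
        size = struct.unpack("<H", rec[144:146])[0]                # data size, includes these 2 bytes
        data = rec[144:144 + size]
        numbers = {
            "FileSize":        struct.unpack("<I", data[34:38])[0],   # record offset 178
            "ProductLanguage": struct.unpack("<I", data[50:54])[0],   # record offset 194
            "LaunchCount":     struct.unpack("<I", data[58:62])[0],   # record offset 202
        }
        # String section starts at the class name; fields are separated by double 0x00 bytes.
        strings = data[data.find(b"CCM_RecentlyUsedApps"):].split(b"\x00\x00")
        yield times, numbers, strings
        pos = blob.find(MARKER, pos + len(MARKER))

if __name__ == "__main__":
    for record in carve(open(sys.argv[1], "rb").read()):
        print(record)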

New Python Tool

After I determined these structures, I was chatting with Willi Ballenthin since he was involved in the research of the database structure. He said something like “that tool sounds pretty neat” and then followed up saying “possibly similar to this” and pointed me to a blog post by David Pany at FireEye.
https://www.fireeye.com/blog/threat-research/2016/12/do_you_see_what_icc.html

Sure enough, David beat me to it with a python script to search for the classname hashes and parse the record structure. The good news is that we arrived at the same basic approach and record structures. Validation is always nice. His python script is on GitHub here:
https://github.com/davidpany/WMI_Forensics/blob/master/CCM_RUA_Finder.py

I have had some trouble running this python script against my systems, but I haven’t spent the time to determine the cause. The output is a CSV file, but I don’t have any screenshots to show because of the errors I ran into.

New EnScript Tool

I decided to write this approach in EnScript. My cases have involved upwards of 500 systems for analysis. Using a Python-based approach would force me to either extract all those files or use a mounting or parsing solution to expose them. By using EnScript in EnCase v7 or v8, I can run the EnScript over all system images in one pass. I was able to do this successfully in testing on a recent case with 73 systems. EnCase proved to be a powerful tool in this specific scenario.

The EnScript starts off with a GUI to give you the option of running against all files in the case or a smaller subset designated by a blue check or tag selection.

I found records existing in OBJECTS.DATA and INDEX.BTR files. Some seem to be in areas of the file that have been deallocated from the active records of the database. Additionally, I have found quite a large number of records in the PAGEFILE.SYS file as well. You will see a selection option in the GUI for these common filenames.

The output of this EnScript is a CSV file. It includes a few columns in addition to the properties that were parsed from the records: evidence filename to indicate the system source, item path to show which file it was found in, and file offset to manually validate the data later if needed.

I encourage you to use Excel’s data deduplication function since I ran into a number of bugs in EnCase trying to make this EnScript work. There are some hacky workarounds in the code currently. Dedupe on all columns except item path and file offset. This will remove dupes that are found in both pagefile.sys and objects.data files.
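If you would rather script that dedupe step than do it in Excel, something along these lines works on the CSV. The column names here are hypothetical, so adjust them to match the actual EnScript output:

import pandas as pd

df = pd.read_csv("ccm_rua_output.csv")
# Dedupe on every column except the ones that differ per hit location.
keep = [c for c in df.columns if c not in ("Item Path", "File Offset")]
df.drop_duplicates(subset=keep).to_csv("ccm_rua_deduped.csv", index=False)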

I suspect we might be able to pull some of these records from unallocated clusters, but I haven’t found any there yet. Please let me know if you do!

You can grab the latest version of the EnScript on GitHub:
https://github.com/JamesHabben/ccm-rua-enscript

See the follow-up post about the forensic meaning of this data.

James Habben
@JamesHabben

Know Your Network

Do you know what is on your network?  Do you have a record of truth like DHCP logs for connected devices?  How do you monitor for unauthorized devices?  What happens if none of this information is currently available?

Nathan Crews (@crewsnw1) and Tanner Payne (@payneman) presented Simplifying Home Security with CHIVE at the Security Onion Conference 2016, which will definitely help those with Security Onion deployed answer these questions.  Well worth the watch: https://youtu.be/zBDAjNnRiQI

My objective is to create a Python script that helps with the identification of devices on the network using Nmap with limited configuration.  I want to be able to drop a virtual machine or Raspberry Pi onto a network segment that will perform the discovery scans every minute using a cron job, generating output that can be easily consumed by a SIEM for monitoring.
I use the netifaces package to determine the network address that was assigned to the device performing the discovery scans.
I use the netaddr package to generate the network CIDR format that the Nmap syntax uses for scanning subnet ranges.
The script will be executed from cron, thus running as the root account, so it is important to provide absolute paths.  Nmap also needs permission to listen for network responses, which is available at this permission level.
I take the multi-line native Nmap output and consolidate it down to single lines.  The derived fields use equals (=) for the labels and pipes (|) to separate the values.  I parse out the scan start date, scanner IP address, identified device IP address, identified device MAC address, and the vendor associated with the MAC address.
I ship the export.txt file to Loggly (https://www.loggly.com) for parsing and alerting, as that allows me to focus on the analysis, not the administration.
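A stripped-down sketch of that approach looks like the following. The interface name, output path, and nmap options are assumptions on my part, so grab the full script from GitHub for the working version.

#!/usr/bin/python
import subprocess
import netifaces
from netaddr import IPNetwork

IFACE = "eth0"              # assumption: adjust to the scanning interface
NMAP = "/usr/bin/nmap"      # absolute paths because this runs from cron as root

# Determine the address assigned to this device, then build the CIDR for the subnet.
addr = netifaces.ifaddresses(IFACE)[netifaces.AF_INET][0]
cidr = str(IPNetwork("{0}/{1}".format(addr["addr"], addr["netmask"])).cidr)

# -sn performs host discovery only, no port scan.
scan = subprocess.check_output([NMAP, "-sn", cidr])

with open("/opt/discover/export.txt", "a") as out:
    host = {}
    for line in scan.splitlines():
        line = line.decode()
        if line.startswith("Nmap scan report for"):
            host = {"ip": line.split()[-1].strip("()")}
        elif line.startswith("MAC Address:"):
            parts = line.split(" ", 3)
            # Consolidate the multi-line nmap output into one pipe-delimited line.
            out.write("scanner={0}|ip={1}|mac={2}|vendor={3}\n".format(
                addr["addr"], host.get("ip", ""), parts[2], parts[3].strip("()")))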

The full script can be found on GitHub:  https://gist.github.com/jblukach/c67c8695033ad276b4836bea58669958

John Lukach
@jblukach

GUIs are Hard – Python to the Rescue – Part 1

I consider myself an equal opportunity user of tools, but in the same respect I am also an equal opportunity critic of tools. There are both commercial and open source digital forensic and security tools that do a lot of things well, and a lot of things not so well. What makes for a good DFIR examiner is the ability to sort through the marketing fluff to learn what these tools can truly do and also figure out what they can do very well.

One of the things that I find limiting in many of the tools is the Graphical User Interface (GUI). We deal with a huge amount of data, and we sometimes analyze it in ways we couldn’t have predicted ourselves. GUI tools make a lot of tasks easy, but they can also make some of the simplest tasks seem impossible.

My recommendation? Every tool should offer output in a few formats: CSV, JSON, SQLite. Give me the ability to go primal!

Tool of the Day

I have had a number of cases lately that have started as ‘malware’ cases. Evil traffic tripped an alarm, and that means there must be malware on the disk. It shouldn’t be surprising to you, as a DFIR examiner, that this is not always the case. Sometimes there is a bonehead user browsing stupid websites.

Internet Evidence Finder (IEF) is the best tool I have available right now to parse and carve the broad variety of internet activity artifacts from drive images. It does a pretty good job searching over the disk to find browser, web app, and website artifacts (though I don’t know exactly which versions are supported because of the documentation, but I will save that digression for a different post).

Let me first cover some of IEF’s basic storage structure that I have worked out. The artifacts are stored in a SQLite database named ‘IEFv6.db’ in the folder that you designate. Every artifact type that is found creates at least one table in the DB, and because each artifact has different properties, each of the tables has a different schema. The dev team at Magnet did settle on some conventions that allow for a bit of consistency: if a column holds URL data, then the column name has ‘URL’ in it, and similarly for dates, the column name will have ‘date’ in it.

IEF provides a search function that allows you to do some basic string searching, or you can get a bit more advanced by providing a RegEx (GREP) pattern for the search. When you kick off this search, IEF creates a new SQLite db file named ‘Search.db’ and stores it in the same folder. You can only have one completed search at a time, since kicking off a new search will cause IEF to overwrite any previous db that was created. The search db, from what I can tell anyway, has an identical schema to the main db, only filtered down to the records that match the keywords or patterns you provided.

There is another feature called filter, and I will admit that I have only recently found it. It allows you to apply various criteria to the dataset, with the major one being a date range. There are other things you can filter on, but I haven’t needed to explore those just yet. When you kick off this process, you end up with yet another SQLite database holding a reduced number of records based on the criteria, and again it seems identical in schema to the main db. This one is named ‘filter.db’, which indicates that the dev team doesn’t have much creativity. 😉

Problem of the Month

The major issue I have with the tool is in the way it presents data to me. The interface has a great way of digging into forensic artifacts as they are categorized and divided by the artifact type. You can dig into the nitty gritty details of each browser artifact. For the cases that I have used IEF for lately, and I suspect many of you in your Incident Response cases as well, I really don’t actually care *which* browser was the source of the traffic. I just need to know if that URL was browsed by bonehead so I can get him fired and move on. Too harsh? 🙂

IEF doesn’t give you the ability to have a view where all of the URLs are consolidated. You have to click, click, click down through the many artifacts, and look through tons of duplicate URLs. The problem behind this lies in the design of the artifact storage: multiple tables with different schemas in a relational database. A document-based database, such as MongoDB, would have allowed an easier search approach, but there are trade-offs that I don’t need to tangent on here. I will just say that there is no 100% clear winner.

To perform a search over multiple tables in a SQL-based DB, you have to implement it in some kind of program code, because a single SQL query is almost impossible to construct. SQLite makes it even more difficult with its reduced list of native functions and its lack of stored procedures (user-defined functions have to be registered by the host application). It just wasn’t meant for that. IEF handles this task for the search and filter process in C# code, and creates those new DB files as a sort of cache mechanism.

Solution of the Year

Alright, I am sensationalizing my own work a bit too much, but it is easy to do when it makes your work so much easier. That is the case with the python script I am showing you here. It was born out of necessity and tweaked to meet my needs for different cases. This one has saved me a lot of time, and I want to share it with you.

This Python script can take as input (-i) any of the three IEF database files that I mentioned above, since they share schema structures. The output (-o) is another SQLite database file (I know, like you need yet another one) in the location of your choosing.

The search (-s) parameter allows you to provide a string to filter the records on, based upon that string being present in one of the URL fields of the record being transferred. I added this one because the search function of IEF doesn’t allow me to direct the keyword at a URL field. I had results from my keywords that were hitting on several other metadata fields that I had no interest in.

The limit (-l) parameter was added because of a bug I found in IEF with some of the artifacts. I think it was mainly in the carved artifacts, so I really can’t fault them too much, but it was causing a size and time issue for me. The bug is that the URL field for a number of records was pushing over 3 million characters long. Let me remind you that each character in ASCII is a byte, and having 3 million of those creates a URL that is 3 megabytes in size. Keep in mind that URLs are allowed to be Unicode, so go ahead and x2 that. I found that most browsers start choking if you give them a URL over 2,000 characters, so I decided to cut off the URL field at 4,000 by default to give just a little wiggle room. Magnet is aware of this and will hopefully solve the issue in an upcoming version.

This Python script will open the IEF DB file and work its way through each of the tables to look for any columns that have ‘URL’ in the name. If one is found, it will grab the type of the artifact and the value of the URL to create a new record in the new DB file. Some of the records in the IEF artifacts have multiple URL fields, and this will take each one of them into the new file as a simple URL value. The source column is the name of the table (artifact type) and the name of the column where that value came from.
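The heart of that logic is small enough to sketch out here. The table and column names are discovered at runtime, so the only assumptions are the output schema and defaults I chose for the sketch; the real ief-find-url.py may differ in the details:

import sqlite3

def extract_urls(ief_db, out_db, search=None, limit=4000):
    src = sqlite3.connect(ief_db)
    dst = sqlite3.connect(out_db)
    dst.execute("CREATE TABLE IF NOT EXISTS urls (source TEXT, url TEXT)")

    tables = [r[0] for r in src.execute("SELECT name FROM sqlite_master WHERE type='table'")]
    for table in tables:
        cols = [r[1] for r in src.execute('PRAGMA table_info("{0}")'.format(table))]
        for col in [c for c in cols if "url" in c.lower()]:
            for (value,) in src.execute('SELECT "{0}" FROM "{1}"'.format(col, table)):
                if not value:
                    continue
                value = str(value)[:limit]      # -l: work around the multi-megabyte URL bug
                if search and search.lower() not in value.lower():
                    continue                    # -s: only keep URLs containing the string
                dst.execute("INSERT INTO urls VALUES (?, ?)",
                            ("{0}.{1}".format(table, col), value))
    dst.commit()

# usage: extract_urls("IEFv6.db", "urls.db", search="example.com")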

This post has gotten rather long, so this will be the end of part 1. In part 2, I will go through the new DB structure to explain the SQL views that are created and then walk through some of the code in Python to see how things are done.

In the meantime, you can download the ief-find-url.py Python script and take a look for yourself. You will have to supply your own IEF.

James Habben
@JamesHabben

MatchMeta.Info

Filenames are trivial to change. It is still important to know which ones are common during your investigation. You can’t remember every filename, as there are already twenty-four million plus in the NSRL data set alone. MatchMeta.Info is my way of automating these comparisons into the analysis process. Not all investigators have Internet access on their lab machines, so I wanted to share the steps to build your own internal site.
Server Specifications
Twisted Python Installation
I prefer using Ubuntu, but feel free to use whatever operating system you’re most comfortable with.  The installation process has become very simple!!
                                 
apt-get install python-dev python-pip
pip install service_identity twisted
Twisted Python Validation

NSRL Filenames
I download the NSRL data set directly from NIST, then parse out the filenames with a Python script that I have hosted on the GitHub project site.
Or feel free to download the already precompiled list of filenames that I have posted here. 
meow://storage.bhs1.cloud.ovh.net/v1/AUTH_bfbb205b09774544bb79dd7bf8c3a1d8/MatchMetaInfo/nsrl251.txt.zip
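If you would rather build the list yourself, the extraction boils down to pulling the FileName column out of NSRLFile.txt. A rough sketch, with the column position based on the published RDS layout and the lowercasing being my own normalization choice:

import csv

filenames = set()
with open("NSRLFile.txt", "r", errors="ignore") as nsrl:
    reader = csv.reader(nsrl)
    next(reader)                        # skip the header row
    for row in reader:
        filenames.add(row[3].lower())   # FileName column in the RDS format

with open("nsrl251.txt", "w") as out:
    out.write("\n".join(sorted(filenames)))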
MatchMeta.Info Setup
First, create a folder that will contain the mmi.py file from the GitHub site and the uncompressed nsrl251.txt file from the previous section.  For example, a www folder can be created in the /opt directory for these files.
/opt/www/mmi.py
/opt/www/nsrl251.txt
Second, make the two files read-only to limit permissions.
chmod 400 mmi.py nsrl251.txt
Third, make the two files owned by the webserver user and group.
chown www-data:www-data mmi.py nsrl251.txt
Fourth, make only the www folder capable of executing the Twisted Python script.
chmod 500 www
Fifth, make the www folder owned by the webserver user and group.
chown www-data:www-data www
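For context, here is a bare-bones reconstruction of what a Twisted resource like mmi.py can look like, returning OK for a known filename and NA otherwise. This is only my own minimal sketch; use the actual mmi.py from the GitHub site for the real behavior.

from twisted.web import server, resource
from twisted.internet import reactor

# Load the NSRL filename list into memory once at startup.
with open("/opt/www/nsrl251.txt") as f:
    KNOWN = set(line.strip().lower() for line in f)

class MatchMeta(resource.Resource):
    isLeaf = True

    def render_GET(self, request):
        # The filename to check is passed as the URL path, e.g. /cmd.exe
        name = request.path.decode().lstrip("/").lower()
        return b"OK" if name in KNOWN else b"NA"

reactor.listenTCP(8080, server.Site(MatchMeta()))
reactor.run()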
MatchMeta.Info Service
Upstart on Ubuntu will allow the Twisted Python script to be run as a service by creating the /etc/init/mmi.conf file.  Paste these commands into the newly created file.  It’s critical to use exact absolute paths in the mmi.py and mmi.conf files or the service will not start.
start on runlevel [2345]
stop on runlevel [016]

setuid www-data
setgid www-data

exec /usr/bin/python /opt/www/mmi.py
respawn
MatchMeta.Info Port Forwarding
Port 80 is privileged and we don’t want to run the service as root, so port forwarding can be used.  This will allow us to run the Python service as the www-data user by appending the following to the bottom of the /etc/ufw/before.rules file.
*nat
-F
:PREROUTING ACCEPT [0:0]
-A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
COMMIT
Thanks to @awhitehatter  for the tip on their GitHub site.
Configure Firewall
Please set up the firewall rules to meet your environment’s requirements.  Ports 80 and 8080 are currently set up to be used for the MatchMeta.Info service.  Don’t forget SSH for system access.
ufw allow 80/tcp
ufw allow 8080/tcp
ufw allow ssh
ufw enable
MatchMeta.Info Validation
Finally, all set to start the MatchMeta.Info Service!!
start mmi
Browsing to these sites should return the word OK on the website.
Browsing to these sites should return the phrase NA on the website.       
I plan to keep moving MatchMeta.Info features from the command line version into the web interface in the future.  A morph for James Habben’s Evolve project, a web interface for Volatility, has already been submitted to incorporate the analysis process.
John Lukach
@jblukach

Building Python Packages, By a Novice

I am excited to see that Evolve has been getting used by more and more people. It has gained enough use and attention to even catch the eye of SANS. They want to include Evolve in their SIFT workstation build. This is by no means an endorsement by SANS, but it means a lot to an open source developer to know that their tools are being used and are helpful.

The requirement for Evolve making it into SIFT is that it needs to be installable from the Python Package Index (PyPI). This is a very reasonable requirement since it makes the maintenance of SIFT a much more manageable project. A project, may I add, that is also available for free and maintained in the free time of volunteers. Thanks!

I started reading about Python packages and distribution, and found that it is very capable and very flexible, almost too much. It has been a challenge for me to squeeze in the reading and testing between the demands of my full time job, but I finally stumbled my way through it to a final version. There are many tutorials available that explain the basics of Python packages, but I had a little different problem than what most Python projects face, I guess. I wanted to write this all down to share my experience for any of you that might benefit.

El Problemo

First of all, I am not a full time developer, nor am I a master of Python. I think this may be the root of my problem! I started Python, like many others, by assembling pieces of others’ scripts to make a solution to the problem I was facing at the time. Because of my background with so many other languages, it was a fairly short phase of learning the environment and intricacies of Python before I was able to start from scratch and StackOverflow my way through. I took on Evolve as a way to expand further, and to solve another problem. That is the mother of invention, after all. It has been a fun and rewarding experience, with many thanks to all of you supporting it!

Put me aside now, and let’s talk about the technical problem with packaging Evolve. A typical package available from PyPI is Python code. It gets a little more exotic when the author includes some code written in C or C++ for a more efficient function. That code requires compiling, but PyPI can handle it, and does it well (except on Windows). Where Evolve presented a problem is in the HTML, CSS, JS, and images used in the web interface. These aren’t considered code by the Python packager, so it was a challenge for me to get them included.

Basic Building

I don’t want to rehash the basic build process since there are already many very well written tutorials out there to explain that. Instead, I will include a short list of some links that were helpful during my journey here.

Creating the setup.py file to start it all off

https://docs.python.org/2/distutils/setupscript.html
This tutorial was very helpful in building the basics of the setup.py file. It explains most of the properties well. The part that I found lacking was in the sections for including extra non-python code in the package.

The trouble with including non-python files

http://blog.codekills.net/2011/07/15/lies,-more-lies-and-python-packaging-documentation-on--package_data-

C/O
http://stackoverflow.com/questions/7522250/how-to-include-package-data-with-setuptools-distribute

This was a great help in truly understanding what I thought I understood after reading the Python docs. It also helped keep me sane and moving forward!

Including some other files

http://stackoverflow.com/questions/9654694/where-are-package-data-files
This helped me somewhat. I was able to get the folder of HTML files included, but it was a rather manual process. I knew I could fall back on this, but I figured that there must be a better way. We are doing programmer things, after all.

More on including other files

https://wiki.python.org/moin/Distutils/Tutorial
Another good tutorial on the packaging process, but it didn’t fully register with what I needed.

Yet another on the process of building

https://www.digitalocean.com/community/tutorials/how-to-package-and-distribute-python-applications

This is the article that finally made things fit together. In fairness to the others, I think it just took some time to sink in.

Highlighted Points

Here are some points I thought I would share. Some of these may be obvious to you already, but I had trouble getting a full grasp of the exact requirements. These are not listed in any particular order of importance.

sdist and bdist use different properties

As stated in one of the articles, there are different ways to package your project. You can distribute the source code with sdist, or you can build it into a binary with bdist. Each of these methods uses different properties from inside the setup.py file. Be aware of which method you are using to distribute, and which properties are associated with each.

__init__.py is a critical file, even if it’s blank

In building the Morph feature of Evolve, I found that I had to have an __init__.py file in the morphs directory for Python to function properly. You can look at both of these files in the project (project root and the morphs folder), and you will see they are both essentially blank. They serve a purpose just by existing, and they can provide more functionality if you place code inside. I don’t need that functionality though, so they remain empty. In the process of making this package, I found that the __init__.py file is also needed for the build process to recognize that the folder includes other Python code files that need to be included in the package.

I am using classes in Evolve, but only for the Morphs. It’s something I want to address in the future, but moving the main code into classes will require time in refactoring and testing that I just don’t have quite yet.

MANIFEST is not MANIFEST.in

Bonehead move on my part, but this was the final blocker to me getting this venture to work. I read through many articles talking about modifying the MANIFEST.in file to search for other files to include. I made the bad assumption that MANIFEST was the file being specified. WRONG. The MANIFEST.in file is a template that is used during the build process. The MANIFEST file is written by the build process, as a log of the files included in the distribution package. To make me look even dumber, the first line in the file says ‘# file GENERATED by distutils, do NOT edit’…
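To tie those pieces together, here is roughly the shape of the combination that worked for me. The package and folder names are illustrative rather than Evolve’s actual layout; the point is the pairing of MANIFEST.in with package_data.

# setup.py (abridged sketch; names are illustrative)
#
# Paired with a MANIFEST.in template containing something like:
#   recursive-include mytool/web *.html *.css *.js *.png
# so the sdist picks up the web interface files as well.
from distutils.core import setup

setup(
    name="mytool",
    version="1.0",
    packages=["mytool", "mytool.morphs"],   # each folder needs an __init__.py, even if blank
    package_data={"mytool": ["web/*.html", "web/*.css", "web/*.js", "web/*.png"]},
)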

Use distutils instead of setuptools

There are a few shortcomings in setuptools that I read about at the start of this project that were addressed in distutils. In typical Python and open source fashion, a problematic library was fixed, but moved to be named differently. Rants aside, just use the newer distutils and things will be much smoother.

That’s It

Again, enough tutorials out there already to explain this process. They do a pretty good job, but I ran into troubles including the extra files in Evolve. I hope this helps one of you to not pull your hair out when trying to build your Python package. Share any other tips you have below in the comments!

James Habben
@JamesHabben

Critical Stack Intel Feed Consumption

Critical Stack provides a free threat intelligence aggregation feed through their Intel Market for consumption by the Bro network security monitoring platform. This is a fantastic service that is provided for free!! Special thanks to those who have contributed their feeds for all to take advantage of the benefits!! Installation is beyond the scope of this post as it is super easy with decent documentation available on their website. The feed updates run roughly hourly by default into a tab delimited file available on disk.

My goal was to make the IP address, domain, and hash values accessible through a web interface for consumption by other tools in your security stack. Additionally, I didn’t want to create another database structure, but rather read the values into memory for comparison on script restarts. I decided to use Twisted Python by Twisted Matrix Labs to create the web server. Twisted is an event-driven networking engine written in Python. The script provides a basic foundation without entering into the format debate between STIX and JSON.  Kept it simple…

Twisted Python Installation

The following installation steps work on Ubuntu 14.04 as that is my preference.

apt-get install build-essential python-setuptools python-dev python-pip

pip install service_identity

wget https://pypi.python.org/packages/source/T/Twisted/Twisted-15.5.0.tar.bz2

bzip2 -d Twisted-15.5.0.tar.bz2

tar -xvf Twisted-15.5.0.tar

cd Twisted-15.5.0/

python setup.py install

The PIP package installation allows for the future usage of SSL and SSH capabilities in Twisted.

TwistedIntel.py Script

The script boils down to four pieces:

  •       The default installation file and path containing the Critical Stack Intel Feed artifacts.
  •       The field separator on each line that gets loaded into the Python list in memory.
  •       The output that gets displayed on the dynamically generated web page based on user input.
  •       The port that the web server runs on for the end-user to access the web page.
TwistedIntel.py Usage

The TwistedIntel.py script can be used after execution by browsing to the website with an IP address, domain, or hash value provided in the path.  If the result returns FOUND, that means it is part of the Critical Stack Intel Feed.
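Here is a minimal reconstruction of what that looks like end to end. The feed path and the FOUND/NOT FOUND responses are assumptions on my part, so use the gist below for the working version.

from twisted.web import server, resource
from twisted.internet import reactor

# Assumed default location of the tab-delimited Critical Stack feed file.
FEED = "/opt/critical-stack/frameworks/intel/master-public.bro.dat"

# Load the indicator column into memory; restart the script to pick up feed updates.
indicators = set()
with open(FEED) as feed:
    for line in feed:
        if not line.startswith("#"):
            indicators.add(line.split("\t")[0].strip())

class IntelCheck(resource.Resource):
    isLeaf = True

    def render_GET(self, request):
        # The IP address, domain, or hash to check is the URL path.
        value = request.path.decode().lstrip("/")
        return b"FOUND" if value in indicators else b"NOT FOUND"

reactor.listenTCP(8080, server.Site(IntelCheck()))
reactor.run()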

Download TwistedIntel.py

https://gist.github.com/jblukach/00f68e560dac78e6bd29

Feel free to change the code to meet your needs; I really appreciate any contributions back to the DFIR community.

 Happy Coding!!
John Lukach

Updated 12/15/2015

  •       TwistedIntel2.py displays the feed that an IP address, domain, or hash originated.
  •       Upstart configuration file for running the Twisted Python script at startup.
  •       Crontab configuration that restarts the script hourly after Critical Stack Intel updates.

 

Unified We Stand

Big news happened at #OSDFcon this week. Volatility version 2.5 was dropped. There are quite a number of features that you can read about, but I wanted to take a few minutes to talk about one feature in particular. There have been a number of output options in the past versions of Volatility, but this release makes the different outputs so much easier to work with. The feature is called Unified Output.

This post is not intended to be a ‘How To’ of creating a Volatility plugin. Maybe another day. I just wanted to show the ease of using the unified output in these plugins. If you are feeling like taking on a challenge, take a look through the existing plugins and find out which of them do not yet use the unified output. Give yourself a task to jump into open source development and contribute to a project that has likely helped you to solve some of your cases!

Let me give you a quick rundown of a basic plugin for Volatility

Skeleton in the Plugin

The framework does all the hard work of mapping out the address space, so the code in the plugin has an easier job of discovery and breakdown of the targeted artifacts. It is similar to writing a script to parse data from a PDF file versus having to write code into your script that reads the MBR, VBR, $MFT, etc.

To make a plugin, you have to follow a few structural rules. You can find more details in this document, but here is a quick rundown.

  1. You need to create a class that inherits from the base plugin class. This gives your plugin structure that Volatility knows about. It molds it into a shape that fits into the framework.
  2. You need a function called calculate. This is the main function that the framework is going to call. You can certainly create many more functions and name them however you wish, but Volatility is not going to call them since it won’t even know about them.
  3. You need to generate output.

Number 3 above is where the big change is for version 2.5. In the past versions, you would have to build a function for each format of output.

For example, you would have a render_text to have the results of your plugin output basic text to stdout (console). Here is the render_text function from the iehistory.py plugin file. The formatting of the data has to be handled by the code in the plugin.

If you want to allow CSV output from that same plugin, then you have to create another function that formats the output accordingly. Again, the formatting has to be handled in the plugin code.
For any other format, such as JSON or SQLite, you would have to create a function with code to handle each one.

Output without the Work

With the unified output, you define the column headers and then fill the columns with values. Similar to creating a database table, and then filling the rows with data. The framework then knows how to translate this data into each of the output formats that it supports. You can find the official list on the wiki, but I will reprint the table for a quick glance while you are reading here.
There are requirements for using this output format, in similar fashion to building a plugin in the first place.
  1. You need to have a function called unified_output which defines the columns
  2. You need to have a function called generator which fills the rows with data

Work the Frame

The first step in using the unified output is setting up your columns by naming the headers. Here is the unified_output function from the same iehistory.py plugin.
Then you define a function to fill each of those columns with the data for each record that you have discovered. There is no requirement on how you fill these columns; they just need the data.
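To make that concrete, here is a minimal sketch of the pair of functions, modeled loosely on the pslist output. The column set is made up for illustration, so look at the plugins shipped with version 2.5 for real examples:

import volatility.plugins.common as common
import volatility.utils as utils
import volatility.win32.tasks as tasks
from volatility.renderers import TreeGrid
from volatility.renderers.basic import Address

class ExampleProcs(common.AbstractWindowsCommand):
    """Example plugin using unified output"""

    def calculate(self):
        addr_space = utils.load_as(self._config)
        for task in tasks.pslist(addr_space):
            yield task

    def unified_output(self, data):
        # Define the columns once; the framework handles text, json, sqlite, html, etc.
        return TreeGrid([("Offset", Address),
                         ("Name", str),
                         ("PID", int)],
                        self.generator(data))

    def generator(self, data):
        # Fill one row per record; the first element of each tuple is the tree depth.
        for task in data:
            yield (0, [Address(task.obj_offset),
                       str(task.ImageFileName),
                       int(task.UniqueProcessId)])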
The other benefit of this unified output is that a new output format can be easily added. You can see the existing modules, and add to them by writing code of your own. How about a MySQL dump file format? Again, dig in and do some open source dev work!

Experience the Difference

Allow me to pick on the guys that won 1st place in the recent 2015 Volatility plugin contest for a minute. Especially since I got 2nd place behind them. Nope, I am not bitter… All in fun! They did some great research and made a great plugin.

When you run their plugin against a memory image, you will get the default output to stdout.

If you try to change that output format to something like JSON, you will get an error message.

The reason for this is that they used the previous style of rendering. The nice part is that if they change the code to add JSON output through unified output, they would also get SQLite and XLS or any other rendering format provided by the framework. Thanks to the FireEye guys for being good sports!

Now, I will use one of the standard plugins to display a couple different formats. PSList gives us a basic list of all the processes running on the computer at the time the memory image was acquired.

Here is the standard text output.

Here is JSON output. I added the --output=json to change it. It doesn’t look that great in the console, but it would be great in a file to import into some other tool.
Here is HTML output. Again, the change is with --output=html.

Hear My Plea

Allow me a shameless plug for my own project now. Evolve is a web-based front-end GUI that I created to interface with Volatility. In order for it to work with the plugins, they have to support SQLite output. The easiest way of supporting SQLite is to use the new unified output feature. The best part is that it works for a ton of other functions as well, with all the different formats that are supported.

If you have written, or are writing, a plugin for Volatility, make it loads better by using the unified format. We have to rely on automation with the amount of data that we get in our cases today, so let’s all do our part!

Thanks to the Volatility team for all of their hard work in the past, now, and in the future to come. It is hard to support a framework like this without being a for-profit organization. We all appreciate it!

James
@JamesHabben