Unified We Stand

Big news happened at #OSDFcon this week. Volatility version 2.5 was dropped. There are quite a number of features that you can read about, but I wanted to take a few minutes to talk about one feature in particular. There have been a number of output options in the past versions of Volatility, but this release makes the different outputs so much easier to work with. The feature is called Unified Output.

This post is not intended to be a ‘How To’ of creating a Volatility plugin. Maybe another day. I just wanted to show the ease of using the unified output in these plugins. If you are feeling like taking on a challenge, take a look through the existing plugins and find out which of them do not yet use the unified output. Give yourself a task to jump into open source development and contribute to a project that has likely helped you to solve some of your cases!

Let me give you a quick rundown of a basic plugin for Volatility.

Skeleton in the Plugin

The framework does all the hard work of mapping out the address space, so the code in the plugin has an easier job of discovery and breakdown of the targeted artifacts. It is similar to writing a script to parse data from a PDF file versus having to write code into your script that reads the MBR, VBR, $MFT, etc.

To make a plugin, you have to follow a few structural rules. You can find more details in this document, but here is a quick rundown.

  1. You need to create a class that inherits from the base plugin class. This gives your plugin structure that Volatility knows about. It molds it into a shape that fits into the framework.
  2. You need a function called calculate. This is the main function that the framework is going to call. You can certainly create many more functions and name them however you wish, but Volatility is not going to call them since it won’t even know about them.
  3. You need to generate output.

Number 3 above is where the big change is for version 2.5. In the past versions, you would have to build a function for each format of output.

For example, you would have a render_text to have the results of your plugin output basic text to stdout (console). Here is the render_text function from the iehistory.py plugin file. The formatting of the data has to be handled by the code in the plugin.

If you want to allow CSV output from that same plugin, then you have to create another function that formats the output into that format. Again, the formatting has to be handled in the plugin code.
For any other format, such as JSON or SQLite, you would have to create a function with code to handle each one.

Output without the Work

With the unified output, you define the column headers and then fill the columns with values. Similar to creating a database table, and then filling the rows with data. The framework then knows how to translate this data into each of the output formats that it supports. You can find the official list on the wiki, but I will reprint the table for a quick glance while you are reading here.
There is a requirement in using this output format, and it is similar to what it takes to build a plugin in the first place.
  1. You need to have a function called unified_output which defines the columns
  2. You need to have a function called generator which fills the rows with data
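To make the idea concrete, here is a minimal, framework-free sketch of the pattern. It does not import Volatility, and `ToyPlugin` and `render` are hypothetical names made up for illustration; the real base classes and renderers in the framework look different. The point is only that the plugin describes columns and rows once, and a single piece of rendering code can then produce every format.

```python
import csv
import io
import json


class ToyPlugin:
    """Framework-free stand-in for a plugin; not Volatility's real base class."""

    def calculate(self):
        # Pretend these records were discovered in a memory image.
        return [
            {"pid": 4, "name": "System"},
            {"pid": 1234, "name": "iexplore.exe"},
        ]

    def unified_output(self, data):
        # Define the column headers once, then hand over the row generator.
        columns = [("PID", int), ("Name", str)]
        return columns, self.generator(data)

    def generator(self, data):
        # Fill one row per discovered record, in column order.
        for record in data:
            yield [record["pid"], record["name"]]


def render(plugin, fmt):
    """Hypothetical renderer: the 'framework' side, written once for all plugins."""
    columns, rows = plugin.unified_output(plugin.calculate())
    headers = [name for name, _type in columns]
    rows = list(rows)
    if fmt == "json":
        return json.dumps({"columns": headers, "rows": rows})
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(headers)
        writer.writerows(rows)
        return buf.getvalue()
    if fmt == "text":
        lines = ["\t".join(headers)]
        lines += ["\t".join(str(value) for value in row) for row in rows]
        return "\n".join(lines) + "\n"
    raise ValueError("unsupported format: " + fmt)


if __name__ == "__main__":
    for fmt in ("text", "csv", "json"):
        print(render(ToyPlugin(), fmt))
```

Adding a new format in this sketch means adding one branch to `render`; the plugin itself does not change.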

Work the Frame

The first step in using the unified output is setting up your columns by naming the headers. Here is the unified_output function from the same iehistory.py plugin.
Then you define a function to fill each of those columns with the data for each record that you have discovered. There is no requirement on how you fill these columns, they just need the data.
The other benefit from this unified output is that a new output format can be easily added. You can see the existing modules, and add to it by writing code of your own. How about a MySql dump file format? Again, dig in and do some open source dev work!

Experience the Difference

Allow me to pick on the guys that won 1st place in the recent 2015 Volatility plugin contest for a minute. Especially since I got 2nd place behind them. Nope, I am not bitter… All in fun! They did some great research and made a great plugin.

When you run their plugin against a memory image, you will get the default output to stdout.

If you try to change that output format to something like JSON, you will get an error message.

The reason is that they used the previous style of rendering. The nice part is that if they changed the code to use the unified output for JSON, they would also get SQLite, XLS, and any other rendering format provided by the framework. Thanks to the FireEye guys for being good sports!

Now, I will use one of the standard plugins to display a couple different formats. PSList gives us a basic list of all the processes running on the computer at the time the memory image was acquired.

Here is the standard text output.

Here is JSON output. I added --output=json to change it. It doesn’t look that great in the console, but it would be great in a file to import into some other tool.
Here is HTML output. Again, the change is made with --output=html.

Hear My Plea

Allow me to get a shameless plug for my own project now. Evolve is a web based front end GUI that I created to interface with Volatility. In order for it to work with the plugins, they have to support SQLite output. The easiest way of supporting SQLite is to use the new unified output feature. The best part is that it works for a ton of other functions as well, with all the different formats that are supported.

If you have written, or are writing, a plugin for Volatility, make it loads better by using the unified format. We have to rely on automation with the amount of data that we get in our cases today, so let’s all do our part!

Thanks to the Volatility team for all of their hard work in the past, now, and in the future to come. It is hard to support a framework like this without being a for-profit organization. We all appreciate it!

James
@JamesHabben

Say Uncle

You have all run into this dreaded case, I’m sure. Maybe a few times. I am talking about that case where you are asked to prove or disprove a given action of a user, but the evidence just doesn’t give in. You poke, you prod, you turn it inside-out. Nothing you do seems to be getting you any closer to a conclusion. These are the cases that haunt me for a long time. I still have some rumbling around in my head from over 10 years ago.

What am I missing?
What haven’t I thought of?
Are my tools parsing this data correctly?
Am I using my tools correctly?
Does this user know about forensics and cleaning up?
Did this user really clean up that well?
How can I call myself a forensic examiner?
Should I have eaten that entire tub of ice cream with gummy bear and potato chip toppings while pulling my hair out and questioning my mere existence?

Building Myself

I tend to form an attachment to my cases. Maybe you do too? I take it on as a personal challenge. A challenge to prove that I am capable. A challenge to learn something new. A challenge to use a new tool, or an old tool in a new way. A challenge to take a step towards being a better investigator.
When I run into these cases that present no artifacts of activity, I end up spending a ton more time on them. I try to find something in the obscure areas. Sometimes it works and my perseverance pays off. I finally find that one little hint, and it unravels the rest of my case. I put that hint into my bank of things to look for in the future, and I have just improved my skills as an investigator.
Other times, I have to give in and say “uncle”. Because of the attachment I form with the case, it can sometimes drag me down. It feels like a failure to me. I should be able to find something, even the tiniest thing, which can give some kind of hint. But when have I spent enough time on it?

Evidence of Logic

There is a phrase that has been passed around the DFIR circles for many years. As much as it may seem, it didn’t start in our industry. It is a logic based thought and is discussed in probability as a way of forming a hypothesis. Fortunately, it fits for us too, so here it is:
The absence of evidence is not evidence of absence
In the original context, the usage of the word evidence is referring to an event. We base our investigations on the existence of an artifact on disk as a result of a user action (event). If artifact X exists in this data, then event Y must have taken place on this computer. It is a very sound approach, and one that many investigators use, digital or not. The finding of that artifact allows us to build a case with a reasonable certainty about the actions taken on this computer. We can’t always extend that onto the person on the opposite side of the table from us, but that is a whole ‘nother topic.
If we scour the drive and find that it is lacking evidence of artifact X, does that mean that event Y did not take place on this computer? Of course not! Because of the intimate knowledge we, as forensic experts, have about the programs and systems of computers, we could perform any number of actions and wipe the disk of any artifacts, if given enough time. In that scenario, event Y did happen. We know because we carried out the actions ourselves. Looking back at our phrase, we find it to ring pretty true.
This is what makes our investigations so difficult. Science is able to say things so simply: If x then y; if !y then !x. We cannot use this logic structure because there are too many variables at play. The courts accept our tools and findings based on them being a scientific process, but some argue that digital forensics is not a science while others say it is a combination of art and science. No matter what your opinion is, we can agree on the phrase I highlighted above. Just because I don’t find a deleted file, it doesn’t mean that file didn’t exist.

Going the Distance

How far, then, do you go? How much time do you spend?
Some of you may have the luxury of spending as much time on a case as you need in order to be satisfied with the conclusion. Most of you, however, have someone asking you when you will have this case wrapped up.
I used to have more freedom in the time I spent on my cases. It’s not my decision anymore. I have someone paying for every hour I put into my cases, and that can get expensive. I absolutely am doing my best to find even the tiniest little artifact and I am determined to break the case open. My investigation is a direct cost, though. They will get an invoice. The invoice will get passed around to various departments as it gets processed. My forensic report will get passed around various departments as well. Some of those departments will be analyzing the cost of getting this report. If my report cost them $50,000 and simply says ‘no findings’, you can bet there will be some unhappy people.
So, how much time do you spend?

Releasing the Burden

My approach to this, in both past and present, is to take the burden of cost off of my shoulders. When a case is starting to head towards the dreaded ‘no findings’, I start preparing myself for it. In preparing, I start documenting my actions instead of just my findings. I prepare to deliver the bad news of ‘no findings’ to the person requesting the case.
When I get to a point of being comfortable with having exhausted all reasonable artifacts, I present my work to the customer. I take a different approach because I don’t have a list of artifacts proving the case. I explain that I don’t have any findings yet. Then I spend some time in education with the customer. I explain the standard processes that we use in forensics. I explain the approach that I have taken and the reasons behind it. I want to make sure the customer understands that they are paying for the work I am doing and not just the result. I want them to feel good about spending the money, and that they are also getting a very thorough review of the evidence.
Then, I give the options. I explain the techniques that I feel might give results, and I am honest about my expectations. I don’t ever want a customer to come back to me, unhappy, because I talked them into paying for a technique that didn’t pan out. This usually results in the customer having an internal chat about moving forward. Sometimes they really want that forensic answer, but other times they are satisfied with knowing the proposed scenario of the case was highly improbable based on the lack of evidence to support that the actions were performed. It’s a balance of cost vs benefit.

Letting Go

This is the hardest part for me. If the customer decides that the ‘no findings’ are enough, then I have to move on. There are more cases lined up and waiting for my time. I want resolution in every case, but it just isn’t reasonable. I can only do my best with the time that I am allowed.
Finding no evidence of a proposed action does not make you a substandard examiner. If you can stand up proud with your report in your hand, then you have done all you can. If you can defend your findings to other examiners who constructively ask if you tried technique A or technique B, then you have proven your skills.
Be proud of your work!
James

Crashing into a Hint

I haven’t spent much time looking at crash dumps, but I ran into one recently. I had a case that presented me with little in the way of sources for artifacts. It forced me to get creative, and I ended up finding a user-mode crash dump that gave me the extra information I needed. It wasn’t anything earth-shattering, but it could be helpful for some of you in the future.

The case was a crypto ransomware infection. Anti-virus scans had already wiped out most of the infection, but the question was how it got onto the box in the first place. I determined to a reasonable certainty that a malicious advertisement exploited an out of date flash plugin. The crash dump gave me the URL that loaded the flash exploit. Let’s get the basics out of the way first.

Crash Dump Files

There are several different types of user-mode crash dumps that can be created. The default, starting in Windows Vista, is to create a mini dump, but true to Windows, you can change those defaults in the registry.
The default location drops them in the user profile folder for all applications. The path looks something like this:
C:\Users\[user]\AppData\Local\CrashDumps
The filename convention is pretty simple. It starts with the name of the process. Then includes the process identifier (PID). Last is the extension of dmp. All together, it looks like this:

process.exe.3456.dmp
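
If you have a folder full of these, the naming convention is easy to split apart in a script. This is a small sketch that assumes only the convention described above (process name, PID, then the dmp extension); `parse_dump_name` is a name I made up for illustration.

```python
import re

# Pattern for the naming convention described above: <process name>.<PID>.dmp
DUMP_NAME = re.compile(r"^(?P<process>.+)\.(?P<pid>\d+)\.dmp$", re.IGNORECASE)


def parse_dump_name(filename):
    """Return (process name, PID), or None if the name does not fit the convention."""
    match = DUMP_NAME.match(filename)
    if match is None:
        return None
    return match.group("process"), int(match.group("pid"))


if __name__ == "__main__":
    print(parse_dump_name("process.exe.3456.dmp"))
```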

Forcing a Crash Dump

You can go look in that directory on your computer, and very likely find a few of those crash dumps hanging out. Any process that crashes will cause a crash dump file to be created, so it can be a great sign of user activity on the box in absence of other indicators.
If you see no files in that folder, or an absence of that folder altogether, RUN! Go get yourself a lottery ticket! It just means you haven’t used your computer for more than reading this blog post. Don’t worry though because Sysinternals has another useful tool that you can employ for giving you sample crashes.
ProcDump is a command line tool that allows you to customize the dumps and set up all kinds of triggers on when to create the dumps. Great functionality if you are a developer tracking down that pesky bug in code, but doesn’t do much for us in forensics. Best part about this tool is that it creates the crash dumps without having to actually crash the program.
Here are the commands I used while researching this further and dropping some dummy data to use in screen shots. The number is the PID of the process that I wanted. You can use the name, but you have to make sure there is only one process with that name or it bails out. I used the PID because I was going against Internet Explorer and even a single tab creates at least two processes.
Procdump 1234
Procdump -ma 1234
The -ma switch creates a crash dump with the full process memory. Otherwise, the only difference in the dump file when using ProcDump is that the naming convention uses the current date and time instead of PID.

Analyzing the Dump File

The dump file in my case originally stuck out to me because of the timeline I built. I had a pretty specific time based on a couple other artifacts, so I crashed into it by working backwards in time. Exploit code can often crash a process when encountering unexpected versions or configurations. The flash player was probably older than expected by the code.
Fact #1: IE crashed during relevant time window.
I started by looking around for tools to analyze these dump files. I used IDA Pro and WinDbg, but decided they weren’t giving me any useful information. I ended up going old school forensics on it.
I have some screen shots to show you the goods, but of course the names have been changed to protect the innocent. Here we go with sample data. I search for ‘SWF’ and locate several hits. One type of hit is on some properties of some sort.
Another type of hit is a structure that contains a URL of a SWF file that was loaded. Alongside that URL is another for the page that loaded the SWF URL.
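A search like this can also be scripted. The sketch below pulls anything that looks like a URL ending in .swf out of a blob of bytes. I am assuming the strings could be stored either as plain ASCII or as UTF-16LE, so it checks both; `find_swf_urls` is a hypothetical helper, and it stops at the first ".swf", so a query string after the file name would be cut off.

```python
import re

# ASCII form: http(s)://... ending in .swf, printable non-space characters only.
ASCII_URL = re.compile(rb"https?://[\x21-\x7e]{1,2048}?\.swf", re.IGNORECASE)

# UTF-16LE form: each ASCII character is followed by a 0x00 byte.
UTF16_URL = re.compile(
    rb"h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00"
    rb"(?:[\x21-\x7e]\x00){1,2048}?\.\x00s\x00w\x00f\x00",
    re.IGNORECASE,
)


def find_swf_urls(data):
    """Return a sorted list of unique SWF URLs found in a bytes object."""
    urls = set()
    for match in ASCII_URL.finditer(data):
        urls.add(match.group().decode("ascii"))
    for match in UTF16_URL.finditer(data):
        urls.add(match.group().decode("utf-16-le"))
    return sorted(urls)


if __name__ == "__main__":
    sample = b"junk http://ads.example.test/banner.swf\x00more"
    print(find_swf_urls(sample))
```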
At this point in my case, I took all of the URLs that I found associated with these SWF hits, and I loaded them up in VirusTotal for evaluation. Sure enough, I had one that was caught by Google. The others showed it as being clear, but this was only a few days after the infection. I checked it again a couple days later and more were detecting it as malicious.
Fact #2: Internet Explorer loaded a SWF file from a URL that has been detected as malicious.

Dump Containers

The default dump type is mini dump, which contains heap and stack data. These files contain parts of the process memory, obviously, so maybe they contain actual flash files? I wanted to find that out to confirm the infection.
The CPU does not have direct access to hard disk data, so it relies on the memory manager to map those files into memory. This means that, most of the time, files sit in memory just like they do on disk. Here is the spec for SWF files (PDF). With this information I can prepare a search for the header. I will print a snippet of the spec here for reference.
Field | Type | Comment
Signature | UI8 | Signature byte: “F” indicates uncompressed; “C” indicates a zlib compressed SWF (SWF 6 and later only); “Z” indicates a LZMA compressed SWF (SWF 13 and later only)
Signature | UI8 | Signature byte always “W”
Signature | UI8 | Signature byte always “S”
Version | UI8 | Single byte file version (for example, 0x06 for SWF 6)
FileLength | UI32 | Length of entire file in bytes
I compare that with several SWF files that I have available, and find them to all use CWS as the header.
This means they are all compressed, and rightfully so since they are being sent over the internet. In the dump file, however, I come up with zero hits for CWS. I do locate 3 hits for FWS in the dump file.
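The header search above can be sketched in a few lines. This assumes the dump has been read into a bytes object and that the FileLength field is stored little-endian, as multi-byte integers are in the SWF format; `find_swf_headers` is a name made up for this example. Three-byte signatures will produce false positives in real data, so treat each hit as a candidate to review.

```python
import struct

# The three possible signatures from the spec snippet above.
SIGNATURES = (b"FWS", b"CWS", b"ZWS")


def find_swf_headers(data):
    """Return sorted (offset, signature, version, file_length) tuples for candidate headers."""
    hits = []
    for sig in SIGNATURES:
        start = 0
        while True:
            offset = data.find(sig, start)
            if offset == -1:
                break
            # Only report hits that leave room for the full 8 byte header.
            if offset + 8 <= len(data):
                version = data[offset + 3]
                (file_length,) = struct.unpack_from("<I", data, offset + 4)
                hits.append((offset, sig.decode("ascii"), version, file_length))
            start = offset + 1
    return sorted(hits)


if __name__ == "__main__":
    demo = b"\x00" * 5 + b"FWS" + bytes([10]) + struct.pack("<I", 6919)
    print(find_swf_headers(demo))
```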
Aha! Nailed this sucker! Now I just need to determine which of these contains the exploit code.

Carving SWF Files

Now I want to get these SWF files out of here. There is a small potential for files to be fragmented in memory, but the chances are generally pretty slim for small files like this. I start working on the headers that I have located. Fortunately, the F in FWS means that it is uncompressed. This means that the value for the file size field will be accurate since it always shows the size of the uncompressed version of the file.
The size field tells the size of the file from the start of the header. So I start my selection at the FWS and go for a length of 6919 in this file pictured.
As a confirmation of my file likely being in one piece, I see the FWS header for another SWF file following immediately after my selection. After selecting this file, I export it. I can now hash it and take the easy way out.
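I did the carving by hand in a hex editor, but the same steps can be sketched in code. This assumes the file is uncompressed (FWS), sits in memory in one piece, and that the FileLength field can be trusted; `carve_uncompressed_swf` and `sha256_hex` are hypothetical helper names.

```python
import hashlib
import struct


def carve_uncompressed_swf(data, offset):
    """Carve one FWS file starting at offset, trusting the FileLength header field."""
    if data[offset:offset + 3] != b"FWS":
        raise ValueError("no FWS signature at offset %d" % offset)
    if offset + 8 > len(data):
        raise ValueError("header is truncated")
    # FileLength counts from the start of the header, so it is also the carve length.
    (file_length,) = struct.unpack_from("<I", data, offset + 4)
    end = offset + file_length
    if file_length < 8 or end > len(data):
        raise ValueError("FileLength does not fit inside the data")
    return data[offset:end]


def sha256_hex(blob):
    """Hash the carved bytes so the value can be looked up elsewhere."""
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    body = b"FWS" + bytes([10]) + struct.pack("<I", 20) + b"A" * 12
    dump = b"\x11" * 7 + body + b"FWS" + b"\x00" * 10
    print(sha256_hex(carve_uncompressed_swf(dump, 7)))
```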
Now the easy way.
Dang! I do the same process for the other 2 hits, and I get the same result for all 3 of these. I have 3 SWF files loaded in my crash dump, but none of them are malicious. My curiosity is now getting the better of me, and I need to know more!
I looked on a different computer and found a crash dump from Chrome. This computer even has a newer flash plugin installed than my sacrificial VM. Now I perform the same search on this crash dump, and I find an interesting result.
The Chrome dump:
The Internet Explorer Dump:
The hits have the same version numbers and file sizes. I did some poking around inside the SWF file, and I found these to be components built into the flash player. You can see one of them in this image.
If I search one of the dumps that I made using the -ma switch of ProcDump, then I do find the exploit code. It is hidden among many other hits. There are some funny looking strings at the top of this SWF file, but I will get to that in just a bit.

Collecting the Sample

My favorite tool for collecting web based samples is HTTrack. It has a ton of configuration options including the ability to use a custom User Agent string and to route through a proxy server, such as TOR. It is a little easier to use than wget or curl, and it easily gives me the full context of the pages surrounding the sample.
Since I found the malicious URL in the IE crash dump, I can use that to pull down the sample for further analysis.

Analyzing the Sample

Once I have the sample collected, I can now pull it apart to see what’s going on. I won’t claim to be anything other than a novice when it comes to SWF files, but there are similar behaviors used across all sorts of malicious code that can stick out.
I use a couple tools for breaking down SWF files: SWFTools and FFdec. I am not going into a ton of detail because there are many others that already cover this topic much better.
With this exploit, it is pretty obvious from just a glance that it is not a normal SWF. Take a look at this image and tell me this is normal.
A class named §1I1l11I1I1I1II1IlllIl1§, another named IIl11IIIlllIl1? Can you tell the difference between |, 1, I, l? That’s the idea!
Then there are the strings from the header we saw earlier. They look like this in the code.

Easy Analysis

Now I use the easy button to confirm what I already suspect. I send the hash to VirusTotal and find that it has a pretty strong indication.
I already know the end goal of this sample, so I don’t want to waste time on this. You can read about my new perspective in this earlier post.
Fact #3: SWF sample contains obfuscated code and has a VT score of 17/57

The Last Piece

I need one more piece to complete the findings. Knowing how these files typically get onto disk is our job. If it was seen on the screen, then it had to have been in memory at some point in some form. If it was in memory for some time, then there is a fair chance that it has been paged out into swap. If it came from the internet, there is a strong chance that it touched the temporary cache, even if just for a second. If it touched the cache and was cleared out, then there is a reasonable chance of pieces still sitting in unallocated clusters. Follow the logic?
I prepare a search by grabbing the first 8 bytes of the sample.
We saw earlier that SWF files can be either compressed or uncompressed, with the difference indicated by the first character. I don’t know if this SWF file will be compressed, so my keyword will account for that by not forcing a C, as the sample has. Otherwise, this has the SWF signature, the file version, and the file size. It is no SHA256, but it’s unique enough for some confirmation to be made.
? 57 53 0D 54 D4 00 00
Sure enough! The exploit SWF is located in pagefile.sys with this keyword. You can see those same broken strings and I|l1I names in there. Unfortunately, the nature of pagefile operation doesn’t always place file pieces together, nor does it even hold onto the entire copy of the file.
Fact #4: Exploit SWF code found in pagefile.sys
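The wildcard keyword above translates directly into a byte regular expression. Here is a rough sketch, assuming the file is read in chunks because pagefile.sys is usually too large to load at once; `find_keyword` and `read_chunks` are names I made up, and this is not how any particular forensic tool implements its search.

```python
import re
from functools import partial

# '?' in the keyword becomes '.', and re.DOTALL lets '.' match any byte value.
KEYWORD = re.compile(rb".\x57\x53\x0d\x54\xd4\x00\x00", re.DOTALL)
KEYWORD_LEN = 8


def find_keyword(chunks):
    """Return absolute offsets of keyword hits across an iterable of byte chunks."""
    hits = []
    tail = b""
    base = 0  # absolute offset of the first byte of the current window
    for chunk in chunks:
        window = tail + chunk
        hits.extend(base + match.start() for match in KEYWORD.finditer(window))
        # Carry the last 7 bytes forward so a hit split across two chunks is still found.
        keep = min(len(window), KEYWORD_LEN - 1)
        tail = window[len(window) - keep:]
        base += len(window) - keep
    return hits


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in fixed-size chunks."""
    with open(path, "rb") as handle:
        for chunk in iter(partial(handle.read, chunk_size), b""):
            yield chunk


if __name__ == "__main__":
    demo = b"\x00" * 10 + b"CWS\x0d\x54\xd4\x00\x00" + b"\xff" * 5
    print(find_keyword([demo]))
```

Against a real file, the call would look like `find_keyword(read_chunks("pagefile.sys"))`, with the path pointing at an exported copy of the file.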

Putting It All Together

I walked you through some of the basic steps I used to discover the infection point. I did this without unallocated clusters, and with no browser history or cache. I put this together to challenge you.
Use your knowledge and be creative! You have a unique understanding of how many components of operating systems and applications work. Don’t get discouraged if your forensic tools don’t automagically dime out the culprit.
Here is a summary of what we found here:
Fact #1: IE crashed during relevant time window.
Fact #2: Internet Explorer loaded a SWF file from a URL that has been detected as malicious.
Fact #3: SWF sample contains obfuscated code and has a VT score of 17/57
Fact #4: Exploit SWF code found in pagefile.sys

What do you think? Enough to satisfy?

Here is the sample SWF I used for this walk through.

Be creative in your cases!
James

I Got This!

I recently made the jump into the consulting side of incident response and forensics. I come from internal investigations and training. I am quick to adapt and learn, but the type of work is the same. What has changed for me is how much work. Probably not in the way you are thinking though.

In my past roles, I had a lot of flexibility in the amount of time that I could spend on cases and projects. I guess I have a bit of perfectionism in me, since I like to exhaust all of the possible options for artifact locations of a particular action before I call it quits on a case. I had reasonable expectations to meet in the time it took for me to deliver my findings, but no one was tracking the exact hours that I put into any one case. All they really cared about was the number of days until I had my report ready. I was on salary so it really didn’t matter. I was paid to get the job done, and it took what it took.

Time is Money

In the consulting world, you have people paying for your time by the hour. They want results that are good, fast, and cheap. You know the story on that one, right?
Here is a quick rundown for those that haven’t been in consulting yet. The client calls in to someone on the team that discusses the project to get an idea of the type and amount of work that will be involved. The project then gets scoped for the number of hours that it shouldn’t exceed. It gets handed off to the lead investigator who then takes over and sets the plan for work. Collection is the first step. Then into analysis and finally reporting. It seems so simple!

Better than Real Life

I want to create a scenario for you based on some fairy tales with a couple facts from made-for-TV movies. I am having a ton of fun in my new job. It has brought a new challenge to me, and I don’t back away from those!
Let’s say that you get a case with a single system and it gets scoped for 40 hours. It’s a standard scenario of an employee leaving and the client wanting to confirm or deny that data has been taken, and it starts you off feeling pretty confident. You think, “I have a full week to poke around on this disk? I got this!” It sounds like a lot of time to analyze a system for the typical artifacts of data exfil.

Walk the Walk

Did I mention that your client’s office is on the opposite side of the country? Go ahead and take off two 5 hour flights because your client is paying for you to come out there. Now we are down to 30 hours.
Of course, that system has a 1 TB drive in it because that was the standard option when they ordered it. There goes another 10 hours by the time you get your gear set up, disk acquired, E01 verified, and gear torn down. Down to 20 hours.
Now you lose a couple more hours because you have to either drive that evidence to the lab or ship it. The lab guys then add in hours because of the intake process, verification and backup. We can use a round number of 5 on this one, bringing the project in at 15 hours.
You need to do some processing and parsing. You don’t charge for machine time, but you do need to spend time setting up the case folders and getting all your tools ready to roll. You can’t unleash all of your tools at once, so over the next day you work on some admin stuff or reviewing results from another case with one eye on the tools. You just need to start the next tool when the last one completes. All together it probably takes 2 hours of your time. Now you have 13 hours left.
It feels a bit tight, but 13 hours leaves a fair bit of room for you to wander the disk. That is until you remember that you still have to write a report. Ever done one of those? Don’t forget that you also have peer reviews that need to get done before it can be finalized. I will be nice to you and say that you were able to blaze through this in 6 hours.
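For anyone who wants to see the tally in one place, here is the arithmetic from this made-up scenario as a tiny script. The labels are my own shorthand for the steps above, and the numbers are the same round figures used in the story.

```python
def remaining_hours(budget, deductions):
    """Return (label, hours left) after each deduction is applied in order."""
    left = budget
    trail = []
    for label, hours in deductions:
        left -= hours
        trail.append((label, left))
    return trail


# Round numbers taken from the scenario above.
DEDUCTIONS = [
    ("two 5 hour flights", 10),
    ("on-site acquisition", 10),
    ("transport and lab intake", 5),
    ("processing setup", 2),
    ("report and peer review", 6),
]

if __name__ == "__main__":
    for label, left in remaining_hours(40, DEDUCTIONS):
        print("%-26s %2d hours left" % (label, left))
```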

Putting the ‘Min’ in Examination

You went from a full week of examination time to just 7 hours. It happened without you even realizing it! You have gigabytes of data that have been parsed into output files in all sorts of different formats, though many of them are just text files.
Do you still have that pep in your step that you had when you first took on the case? Of course you get used to this after a while (I am hoping so, at least. :). You are now under pressure to perform a thorough exam in less than a day’s worth of work. You better get steppin’!

Deliverables

I hope you read that in a fun tone of voice. The same tone that I wrote it in. I wanted to share my experience so far with those who have not been in the consulting world. I knew things would be different. Some things were much easier than I anticipated, but others caught me a little off guard. The hours I described above are one of those on the tougher side.
I am learning a lot from all of my awesome team members. I have so many of them reaching out to me and offering help. The toughest part is figuring out which one to call!
I wanted to face a new challenge and work with a team I can really learn from. I have joined the right team for this, and I look forward to many years of continued learning.
Keep yourself challenged, and you will find a lot of happiness in your job!
James