Basis Technology released a nice tutorial on writing Python modules for Autopsy. I figured I would start with hash analysis, using the built-in Hash Lookup module with the National Software Reference Library (NSRL) and VirusShare.com hash sets, and then create a custom gold build hash set.
The current minimal set of the NSRL 249, containing over 42 million known hashes, has not yet been posted for download as an Autopsy hash index. Building your own involves downloading the reduced minimal hash set. A quick little Python script removes the header and writes only the MD5 hashes to a new file. Autopsy currently has an issue generating the index for large hash sets, so the NSRL has to be split into two chunks. The Unix command ‘wc -l NSRL.txt’ reports the number of lines in the file, so it can be split evenly with the ‘split -l 21030270 NSRL.txt’ command.
Figure: NSRL Python Code
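A minimal sketch of what such a script might look like, assuming the downloaded file is a comma-separated text file whose first row names the columns and includes one called MD5. The function names and file names here are illustrative, not taken from the original script. The second function is a Python alternative to the wc and split commands for cutting the result into two halves.

```python
import csv


def extract_md5(nsrl_path, out_path):
    """Write only the MD5 column of an NSRL CSV file, one hash per line.

    Assumes the first row is a header containing a column named "MD5".
    Returns the number of hashes written.
    """
    count = 0
    with open(nsrl_path, newline="", encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", encoding="ascii") as dst:
        reader = csv.reader(src)
        header = next(reader, None)  # the header row is consumed, never written
        if header is None:
            return 0
        md5_col = header.index("MD5")
        for row in reader:
            if len(row) > md5_col:
                dst.write(row[md5_col].strip() + "\n")
                count += 1
    return count


def split_in_two(hash_path, first_path, second_path):
    """Split a one-hash-per-line file into two roughly equal files.

    Returns a tuple with the number of lines written to each file.
    """
    with open(hash_path) as src:
        total = sum(1 for _ in src)
    half = (total + 1) // 2  # the first file gets the extra line if total is odd
    with open(hash_path) as src, \
            open(first_path, "w") as first, open(second_path, "w") as second:
        for i, line in enumerate(src):
            (first if i < half else second).write(line)
    return half, total - half
```

For example, extract_md5("NSRLFile.txt", "NSRL.txt") followed by split_in_two("NSRL.txt", "NSRL_a.txt", "NSRL_b.txt") would leave two files ready for import; all three file names are placeholders.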
VirusShare.com provides hash lists that can be publicly downloaded and used to identify over 20 million potentially malicious files. The Unix command ‘cat VirusShare_00* | grep -v '#' > VxShare.txt’ concatenates the lists, drops the comment lines, and creates a single hash file that can be imported into Autopsy.
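On a system without cat and grep, the same merge can be sketched in Python. This is a rough equivalent under the assumption that the downloaded lists sit in one folder and that every line containing a ‘#’ is a comment; the function name is made up for illustration. Unlike grep, it also drops blank lines.

```python
import glob


def merge_hash_lists(pattern, out_path):
    """Concatenate every file matching pattern into out_path.

    Lines containing '#' (comment headers) and blank lines are skipped.
    The output file must not match the pattern itself.
    Returns the number of hashes written.
    """
    written = 0
    with open(out_path, "w") as dst:
        for path in sorted(glob.glob(pattern)):
            with open(path) as src:
                for line in src:
                    text = line.strip()
                    if not text or "#" in text:
                        continue
                    dst.write(text + "\n")
                    written += 1
    return written
```

Calling merge_hash_lists("VirusShare_00*", "VxShare.txt") mirrors the shell one-liner.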
HashSets.com is another source of known hashes that are not in the NSRL. Still, it is definitely preferable to generate a gold build hash set that is specific to your environment. The Autopsy Python Gold Build Module requires that the hash values have already been calculated before it runs. Running the module creates a text file named GoldBuild.txt in the case folder that contains only unique hash values.
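The de-duplication step can be illustrated outside of Autopsy. The sketch below is not the Gold Build Module and uses no Autopsy API; it only assumes the hashes are already available as strings, and the function name is hypothetical. Lowercasing before comparison is my own assumption, made so that the same hash in different letter case counts once.

```python
def write_unique_hashes(hashes, out_path):
    """Write each distinct hash once, in first-seen order, one per line.

    Returns the number of unique hashes written.
    """
    seen = set()
    with open(out_path, "w") as dst:
        for value in hashes:
            value = value.strip().lower()  # normalize case (an assumption)
            if value and value not in seen:
                seen.add(value)
                dst.write(value + "\n")
    return len(seen)
```

Passing any iterable of MD5 strings and a path such as "GoldBuild.txt" produces a file in the same one-hash-per-line layout as the NSRL and VirusShare files.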
Now that the NSRL, VirusShare.com, and Gold Build output are all in the same format, they can be imported and used to generate hash indexes for comparisons in Autopsy. Only the indexes are required for hash analysis to complete. Additional hashes cannot be added to an existing index; they can only be added to the hash library through the GUI.
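Before importing, it can be worth confirming that each file really is in that common format. A small check, assuming the expected format is exactly one 32-character hexadecimal MD5 per line; the function name and the limit parameter are illustrative:

```python
import re

MD5_LINE = re.compile(r"[0-9a-fA-F]{32}")


def find_bad_lines(path, limit=10):
    """Return up to limit (line_number, text) pairs that are not a bare MD5."""
    bad = []
    with open(path) as src:
        for number, line in enumerate(src, 1):
            text = line.rstrip("\n")
            if not MD5_LINE.fullmatch(text.strip()):
                bad.append((number, text))
                if len(bad) >= limit:
                    break
    return bad
```

An empty list means every line looks like a bare MD5 hash; anything else points at leftover headers, comments, or blank lines.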