No Maintenance Artifact Library

No longer having to maintain forensic artifact libraries for every security vendor has given me time to tackle other repetitive investigation tasks with automation.

My next challenge is to eliminate the need for manual updates to the artifact library. As software patches are released, I want to collect the new artifacts and update the database.


AWS Systems Manager (SSM) allows you to automate operational tasks for Linux and Windows systems through an installed SSM agent. I set up a maintenance window to apply all available patches to EC2 instances running CentOS, Ubuntu, and Windows. The Windows system rebooted when required, but I needed shell scripts to check whether the Linux boxes required a restart after patching.
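Before the per-OS reboot checks below, here is a rough Boto3 sketch of how such a maintenance window and patch task can be registered. The window name, schedule, tag key, and concurrency values are assumptions for illustration, not my actual configuration.

import boto3

ssm = boto3.client('ssm')

# Weekly maintenance window; name and cron schedule are placeholder assumptions.
window = ssm.create_maintenance_window(
    Name='patch-window',
    Schedule='cron(0 3 ? * SUN *)',
    Duration=3,
    Cutoff=1,
    AllowUnassociatedTargets=False
)

# Target every instance tagged Patch=true (assumed tag key).
target = ssm.register_target_with_maintenance_window(
    WindowId=window['WindowId'],
    ResourceType='INSTANCE',
    Targets=[{'Key': 'tag:Patch', 'Values': ['true']}]
)

# Run the AWS-RunPatchBaseline document to install all available patches.
ssm.register_task_with_maintenance_window(
    WindowId=window['WindowId'],
    Targets=[{'Key': 'WindowTargetIds', 'Values': [target['WindowTargetId']]}],
    TaskArn='AWS-RunPatchBaseline',
    TaskType='RUN_COMMAND',
    MaxConcurrency='2',
    MaxErrors='1',
    TaskInvocationParameters={
        'RunCommand': {'Parameters': {'Operation': ['Install']}}
    }
)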

Ubuntu

#!/bin/bash
# Ubuntu creates this flag file when an installed package requires a reboot.
if [ -f /var/run/reboot-required ]
then
    # Reboot the system (runlevel 6).
    init 6
fi

CentOS

#!/usr/bin/bash
# needs-restarting -r (from yum-utils) exits 1 when a reboot is required, 0 otherwise.
needs-restarting -r
if [ $? -eq 1 ]
then
    # Reboot the system (runlevel 6).
    init 6
fi

I am capturing the SHA256 values of directory names, full file paths, file contents, and filenames for my primary use cases, and I decided to keep MD5 available for legacy vendor support for now. The data lands in an S3 bucket, gets inserted into DynamoDB, and is made accessible through an API Gateway.
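The hashing itself is standard-library work. Here is a minimal sketch of how each of those four values could be produced; the sample path and chunk size are assumptions, not part of the actual collection pipeline.

import hashlib
from pathlib import Path

def hash_text(value):
    # SHA256 of a UTF-8 string such as a filename, full path, or directory name.
    return hashlib.sha256(value.encode('utf-8')).hexdigest().upper()

def hash_file(path):
    # SHA256 of file contents, read in chunks to keep memory use flat.
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest().upper()

target = Path('C:/Windows/System32/notepad.exe')  # example path only
print(hash_text(target.name))         # filename
print(hash_text(str(target)))         # full path
print(hash_text(str(target.parent)))  # directory name
print(hash_file(target))              # file contents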

If you would like to try using the data set, I have provided code samples here: https://4n6ir.com/matchmeta/

Happy Coding!
John Lukach

AWS Pseudo Pipeline

I have been running my Forensic Artifact API on Ubuntu with an Nginx, Flask (Python), and MariaDB stack. I wanted to get out of the infrastructure administration business by moving to the AWS Cloud, so I decided to start with the migration of my SHA256 hash library. My goals were to improve availability, allow collaboration, and keep costs down. I wound up having an expensive learning experience while importing the data into DynamoDB!

Forensic Artifact API Diagram

I decided to use the Amazon Web Services (AWS) Boto3 SDK for Python to read from an S3 bucket on an EC2 instance and insert the records into a DynamoDB table. Reading the line-delimited text file of SHA256 hashes as a stream minimized the amount of memory required on the EC2 instance. DynamoDB batch writes accept at most twenty-five items per request, so I set the batch size with 'range' in the for loop; that value needs to match the table's minimum provisioned capacity at auto-scaling startup. When global tables replicate DynamoDB across regions, the replicas also need to match the 'range' value until the first auto-scaling event completes.

import boto3

def import_hash(hashlist, hashtype, hashsrc, hashdesc):

    # Stream the line-delimited hash list from S3 so the whole file
    # never has to fit in memory on the EC2 instance.
    s3 = boto3.client('s3')
    obj = s3.get_object(Bucket='bucketname', Key=hashlist)

    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('sha256')

    while True:
        # Write in groups of 25, the DynamoDB batch-write limit, which also
        # matches the table's minimum provisioned write capacity.
        with table.batch_writer() as batch:
            for i in range(25):
                # Read one line from the raw response stream, strip the trailing CRLF, and decode.
                item = obj['Body']._raw_stream.readline()[:-2].decode('utf-8')
                if not item:
                    break
                batch.put_item(Item={'sha256': item.upper(), 'type': hashtype, 'source': hashsrc, 'desc': hashdesc})
        if not item:
            break

import_hash('Folder/File.txt', 'Known', 'HashSets.com', 'Windows')
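For reference, the minimum write capacity that 'range' has to match is the one registered with Application Auto Scaling. A hedged sketch of that setup follows; the maximum capacity, policy name, and target utilization are assumptions.

import boto3

autoscaling = boto3.client('application-autoscaling')

# Register the table's write capacity as a scalable target.
# MinCapacity mirrors the batch size of 25 used above; MaxCapacity is an assumption.
autoscaling.register_scalable_target(
    ServiceNamespace='dynamodb',
    ResourceId='table/sha256',
    ScalableDimension='dynamodb:table:WriteCapacityUnits',
    MinCapacity=25,
    MaxCapacity=100
)

# Scale on write capacity utilization with a target of 70 percent.
autoscaling.put_scaling_policy(
    PolicyName='sha256-write-scaling',
    ServiceNamespace='dynamodb',
    ResourceId='table/sha256',
    ScalableDimension='dynamodb:table:WriteCapacityUnits',
    PolicyType='TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 70.0,
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'DynamoDBWriteCapacityUtilization'
        }
    }
)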

DynamoDB auto-scaling has an issue: if reads and writes drop to zero, it will not scale back down to the minimum provisioned capacity. I needed a time-based CloudWatch event to execute a Lambda function that generates regular database activity.

import boto3

dynamodb = boto3.resource('dynamodb')

def lambda_handler(event, context):

    # Perform one read and one write against placeholder items so the table
    # always shows activity and auto-scaling can settle at the minimum capacity.
    table = dynamodb.Table('sha256')
    table.get_item(Key={'sha256': '0000000000000000000000000000000000000000000000000000000000000000', 'type': 'TEST'})
    table.put_item(Item={'sha256': 'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF', 'type': 'TEST', 'source': 'JOHN', 'desc': 'PING'})

    return
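The scheduling side could be wired up with Boto3 along these lines; the rule name, five-minute rate, function name, and ARN are assumptions for illustration.

import boto3

events = boto3.client('events')
awslambda = boto3.client('lambda')

function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:dynamodb-ping'  # placeholder ARN

# Time-based rule that fires every five minutes.
rule = events.put_rule(
    Name='dynamodb-ping-schedule',
    ScheduleExpression='rate(5 minutes)'
)

# Allow CloudWatch Events to invoke the Lambda function.
awslambda.add_permission(
    FunctionName='dynamodb-ping',
    StatementId='allow-cloudwatch-events',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)

# Point the rule at the function.
events.put_targets(
    Rule='dynamodb-ping-schedule',
    Targets=[{'Id': 'dynamodb-ping', 'Arn': function_arn}]
)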

Happy Coding!
John Lukach