Open-source software is at the heart of Hunt & Hackett’s automated incident response strategy. Over the past few years, we have built an innovative cloud-based incident response lab, contributed to open-source projects, and showcased our methods at industry events such as the SANS DFIR summit and HackerHotel.
While the advantages of our open-source journey are numerous, they tell only part of the story: the road was not without challenges. Incident response is inherently complex, and relying on a self-built incident response lab that may contain bugs can make it even more challenging. Yet, these very challenges compel incident response teams to confront problems directly and transform them into opportunities.
This blog post will walk you through our journey of leveraging open-source software, shedding light on the obstacles we faced during incident response cases and illustrating how we transformed challenges into opportunities.
At Hunt & Hackett, we strongly believe in the power of open-source software as an essential part of our incident response service. Leveraging open-source software provides key advantages, such as transparency and the ability to collaborate with the incident response community. These advantages are highlighted in our previous blog posts:
In addition, we showcased these advantages at the SANS DFIR Summit 2024, demonstrating how incident response can be performed at scale within minutes. We achieve this by applying an automated incident response strategy that combines the investigative prowess of a digital detective with a DevOps mindset, enabling fast and scalable investigations.
As a result of this strategy, we built an innovative cloud-based incident response lab, as shown in Figure 1, that can initiate investigations within 15 minutes.
Figure 1 - Overview of the incident response lab
This strategy requires us to have a deep understanding of how the software used in our incident response lab works, as there is a significant distinction between:
Our open-source journey began in 2022, when we built our incident response lab from scratch, leveraging the open-source project Digital Forensics & Incident Response Lab[1] as its foundation. Shortly after, we made our first contributions to several open-source projects[2], marking the start of our active engagement in the incident response community.
Fast forward: over the past three years, we gained valuable insights by not only integrating open-source tools in our incident response lab but also by actively maintaining and improving upon them. As shown in Figure 2, our journey reflects an increasing level of capabilities for integrating, contributing, and ultimately maintaining open-source projects[3]. This strategic progression has not only enhanced our technical expertise but also strengthened our ability to tailor solutions to our specific incident response needs.
Figure 2 - From Ruins to Resilience: How Developing and Utilizing Open Source Solutions Enhances CSIRT Capabilities
The advantages of our open-source journey, as described in the previous chapter, highlight many benefits but do not fully reflect the challenges encountered along the way. Incident response itself is already complex, and relying on a self-built incident response lab composed of software that may contain bugs can make it even more challenging.
These difficulties may arise in scenarios such as:
Applying an automated incident response strategy requires incident response teams to confront these challenges head-on and flip them into opportunities. The next paragraphs cover three challenges we encountered and explain how each was turned into an opportunity:
For the acquisition of forensic artefacts and logs from Windows, Unix and MacOS systems, we use Velociraptor collectors. Velociraptor is an open-source digital forensics tool that can perform targeted collections of forensic artefacts and logs across multiple systems, as well as actively threat hunt and monitor systems for suspicious activity.
During investigations, our goal is to transform these forensic artefacts and logs into a timeline of events that describes which events took place on a system. To achieve this, we use the Dissect incident response framework, as it can effortlessly extract forensic artefacts and logs from any source, including Velociraptor collections.
On 3 January 2023, we contributed the Velociraptor loader[4] to the Dissect framework, enabling it to correctly transform Velociraptor collections into Dissect targets. Targets describe a certain state of a system and are used by Dissect to understand where in the collections the forensic artefact and logs are stored, which can then be transformed into a timeline of events.
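To illustrate how such a target is used in practice, the minimal sketch below opens a Velociraptor collection with Dissect's Python API; the collection file name and the selected plugin are examples only.

```python
from dissect.target import Target

# Open a Velociraptor offline collection (example file name); the Velociraptor
# loader recognises the collection layout and initialises a Dissect target.
target = Target.open("Collection-WORKSTATION01-2024-12-10.zip")

# Basic information about the state of the system described by the target.
print(target.hostname, target.os, target.version)

# Records from individual artefact parsers, such as the NTFS USN Journal,
# can then be combined into a timeline of events.
for record in target.usnjrnl():
    print(record)
```

The same plugin can also be invoked from the command line with Dissect's target-query tool, e.g. `target-query -f usnjrnl <collection>`.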
Fast forward to 10 December 2024: a total of twelve contributions have been made to improve the Velociraptor loader[5], not only by us but also by other members of the incident response community. During this period, we encountered various problems when using the loader in incident response cases. Table 1 describes these problems, their impact, and the corresponding solutions.
| Problem | Impact | Solution |
| --- | --- | --- |
| The loader did not support the initialization of Windows Volume Shadow Copies (VSS). | This could lead to missing crucial traces of an attacker that are stored in VSS volumes. | The loader was updated to support the initialization of Windows VSS[6], ensuring that file system entries stored in VSS volumes are properly retrieved. |
| Windows timelines did not include the USN Journal data source, which tracks changes in NTFS file systems. | This could have caused crucial traces of modified files to be missed. | The location of the USN Journal was added to the loader’s initialization for Windows systems[7], ensuring that Dissect could parse the data source. |
| The loader did not support the initialization of Velociraptor collections stored in ZIP files, requiring extraction before processing them with Dissect. | Depending on the collection, extracting files could alter the investigation material, compromising its integrity. | A ZIP loader[8] for Velociraptor collections was implemented, eliminating the need to extract ZIP files. |
| Velociraptor URL-encodes file system entry paths, e.g. ‘/home/user/.ssh’ becomes ‘/home/user/%2Essh’. | Dissect plugins like the SSH plugin expect unencoded path names and therefore fail to function correctly on Velociraptor collections. | This was corrected by URL-decoding the file system entries before loading them into Dissect targets[9]. |
Table 1 - Problems identified in the Velociraptor loader and the solution to solve these problems
While investigating a Linux system containing UTMP logs, we observed that the UTMP logs showed successful logons from IP addresses which were not used by the system administrator. This raised suspicions that the system could have been compromised by an attacker. While reviewing this finding, it became apparent that the originating locations of these IP addresses were highly unusual. Consequently, we checked whether the Dissect parser for UTMP logs was functioning correctly.
The IP address of a successful logon on a Linux system is stored in the ut_addr_v6 field of the UTMP structure[10], as shown below. This field is 16 bytes in size and can accommodate either the 16 bytes of an IPv6 address or the 4 bytes of an IPv4 address.
```c
struct utmp {
    [...]
    int32_t ut_addr_v6[4];  /* Internet address of remote host;
                               IPv4 address uses just ut_addr_v6[0] */
};
```
One might wonder what could go wrong. Here’s an example:
Figure 3 - UTMP logs of IPv4 addresses and IPv6 addresses
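The snippet below reproduces the ambiguity with Python’s ipaddress module; the address is made up for the example. Because an IPv4 address occupies only the first four bytes and the remainder is zero, the raw field alone does not reveal which address family was logged.

```python
import ipaddress

# Raw 16-byte ut_addr_v6 value of a logon from the IPv6 address 102:304::
# (only the first four bytes are non-zero, the rest are trailing zeroes).
raw = bytes([1, 2, 3, 4] + [0] * 12)

# A parser that assumes "only the first four bytes are used" means IPv4
# reports the logon as coming from 1.2.3.4 ...
print(ipaddress.IPv4Address(raw[:4]))  # 1.2.3.4

# ... yet the very same bytes are also a valid IPv6 address.
print(ipaddress.IPv6Address(raw))      # 102:304::
```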
Therefore, the UTMP log structure is fundamentally broken for IPv6 addresses that span 4 bytes or fewer (excluding trailing zeroes), as shown in Table 2.
| IP address | Result |
| --- | --- |
| IPv4 | The IPv4 address is correctly parsed. |
| IPv6 <= 4 bytes | The IPv6 address is incorrectly interpreted as an IPv4 address. |
| IPv6 > 4 bytes | The IPv6 address is correctly parsed. |
Table 2 - The broken IPv6 implementation (ut_addr_v6) of the UTMP structure
To address this, we contributed a solution[11] that combines the result of the host (ut_host) and IP address (ut_addr_v6) fields to determine if the IP address is an IPv4 or IPv6 address.
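A minimal sketch of that idea is shown below. It illustrates the heuristic only and is not the actual Dissect implementation; the function name is hypothetical.

```python
import ipaddress


def resolve_login_address(ut_addr_v6: bytes, ut_host: str) -> str:
    """Illustrative heuristic: combine ut_addr_v6 and ut_host to pick a family."""
    # More than the first four bytes in use: this can only be an IPv6 address.
    if any(ut_addr_v6[4:]):
        return str(ipaddress.IPv6Address(ut_addr_v6))

    # Only the first four bytes are set: either an IPv4 address or an IPv6
    # address whose significant bytes fit in four bytes. The textual ut_host
    # field is used to break the tie.
    try:
        ipaddress.IPv6Address(ut_host)
        return str(ipaddress.IPv6Address(ut_addr_v6))
    except ValueError:
        return str(ipaddress.IPv4Address(ut_addr_v6[:4]))
```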
During investigations, forensic artefacts and logs are transformed to a timeline of events. This way, analysts no longer view artefacts in isolation but rather within a timeline, providing a comprehensive view of events that occurred on a system.
To investigate these timelines, we use Timesketch, a collaborative timeline analysis tool developed by Google. As highlighted in the first chapter, our goal is to perform incident response at scale. However, during the first year of using Timesketch, we discovered that it is not sufficiently scalable for large investigations. It works well for investigations with around ten systems, but it cannot handle hundreds of systems. Hence, it cannot be scaled to meet the demands of an incident response case.
During incident response cases, we encountered three primary issues with how Timesketch (Google) stores data, as outlined in Table 3.
| Subject | Issue | Impact |
| --- | --- | --- |
| Transform data | Uploaded timelines are stored on disk by the Timesketch webserver, after which a Timesketch worker transforms the data in the timelines and uploads them to Elasticsearch. | This increases the processing time of timelines and adds unnecessary complexity to the transformation process. |
| Data storage | Timesketch supports a single Elasticsearch index and does not support an index alias or data streams. | Once an index is created, it cannot be further scaled by increasing the number of shards. |
| Data format | Timesketch uses the message field to summarize the fields and values of an event, as part of the user interface. | This results in Elasticsearch documents containing duplicate data, increasing the size of each document by 50-100%. |
Table 3 – Primary scalability issues with Timesketch
To address these issues, we forked Timesketch[12] on 26 January 2023 and implemented solutions to resolve the three identified issues, as outlined in Table 4.
| Subject | Timesketch (Hunt & Hackett) | Scalability |
| --- | --- | --- |
| Transform data | As part of the extract, transform and load (ETL) pipeline, timelines are transformed and loaded into Elasticsearch using Logstash. This method removes Timesketch as the intermediary. Further details are provided in our blog post ‘Scalable forensics timeline analysis using Dissect and Timesketch’. | The number of Elasticsearch nodes and their memory capacity can be increased, enabling more efficient scaling. |
| Data storage | Support for Elasticsearch index aliases in conjunction with index templates was added (illustrated in the sketch below Table 4). | Based on the index’s size, a new index can be used by the index alias, as configured in the index template, allowing up to 90 TB+ of storage. |
| Data format | The message field was removed, and values of events are dynamically shown in the user interface. | Reduces the size of Elasticsearch documents (the duplicated data accounted for 50-100% overhead), improving storage efficiency and overall performance. |
Table 4 - Overview of the scalability improvements in the Timesketch (Hunt & Hackett) fork
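To give an impression of the index alias and index template support referenced in Table 4, the sketch below registers a hypothetical index template via Elasticsearch’s REST API. The index pattern, alias name, and settings are made up for illustration and do not reflect the exact configuration of our fork.

```python
import requests

# Hypothetical index template: every index matching the pattern is created
# with its own shards and is automatically attached to a shared alias, so
# searches keep hitting one name while new backing indices are added as
# the amount of timeline data grows.
template = {
    "index_patterns": ["timesketch-timeline-*"],
    "template": {
        "settings": {"number_of_shards": 3, "number_of_replicas": 1},
        "aliases": {"timesketch-timelines": {}},
    },
}

response = requests.put(
    "http://localhost:9200/_index_template/timesketch-timelines",
    json=template,
    timeout=30,
)
response.raise_for_status()
```

A new backing index (for example timesketch-timeline-000002) can then be created once the current one grows too large, without changing how analysts query the alias.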
This blog post walked you through our journey of leveraging open-source software, shedding light on the obstacles we faced during incident response cases and illustrating how we transformed challenges into opportunities.
More importantly, we hope to encourage the incident response community to move beyond simply using tools and to actively contribute to the open-source projects they rely on. Our own experience has shown the immense value of deeply understanding how tools work, automating where possible, and actively participating in a community that drives innovation together.
Throughout our incident response work, we encountered the following three key challenges, which we were able to turn into opportunities:
By sharing our journey, we aim to strengthen the collective capabilities of the incident response community and drive the evolution towards more effective incident response solutions.