Parsing Atop log files with Dissect

At Hunt & Hackett we are huge fans of the Dissect[1] incident response framework, which can be used to process investigation material and perform forensic timeline analysis, as described in our previous blog post[2]. This framework is part of our cloud-based Incident Response Lab, making it a fast and scalable solution for processing investigation material. In addition, it allows the CERT team to feed the knowledge gathered from incident response cases into repeatable methods such as parsers, which can be re-used in new cases and by the rest of the incident response community.

During incident response cases, the team encounters log files and binary formats that are best parsed into a structured data format. This is where cstruct comes in handy. Cstruct[3] is a Dissect module that parses C-like structure definitions and uses them to convert binary data into structured data in Python. This blog will walk you through our process of rapid prototyping during the incident and our post-incident activities to convert the prototype into a more robust solution that can be used by the entire Dissect community.

Atop Logging

During an incident response case handled by Hunt & Hackett, it was necessary to process and investigate Atop log files found on Linux systems. The Atop tool[4] is a performance monitor for Linux that can report the activity of all processes, including the CPU, memory, disk and network usage of every process and thread. Although Atop only logs data intermittently, depending on how it was started, the available data was used to enrich timelines based on other logs and forensic artifacts available on a Linux system. At its most basic, it showed process executions with their command lines and the corresponding timestamps.

Atop writes its log files to the Linux logging directory, with the current date as the suffix of the file name, e.g. /var/log/atop/atop_20240201. The log files are stored in the Atop binary file format, so it was not possible to parse them into a structured data format with standard string manipulation tools. The Atop tool itself can also parse the log files, using the following command:

atop -PPRG -r <file>

However, we couldn’t control the output format to obtain what we required for incident response purposes.  

Atop Parser

As with most ideas that need to be implemented, the best place to start is with the code and the documentation. Since Atop is an open-source tool[5] it was possible to use the structure definitions in the source files and lift them right into cstruct to start parsing the log files. 

Reading the man[6] page proved useful, as it told us that Atop writes binary raw files and can parse them to display their content. This also gave a first indication that, when browsing through the code for structures, one should keep an eye out for references to ‘raw’ or ‘binary’. If you read through the list of files on GitHub you’ll notice two files that are related to parsing the Atop binary format:

  • https://github.com/Atoptool/atop/blob/fdf3526bd35c1a84dd11bb73110c1a1f4148e39d/rawlog.h 
  • https://github.com/Atoptool/atop/blob/fdf3526bd35c1a84dd11bb73110c1a1f4148e39d/rawlog.c 

In addition, the ‘cat’ file also caught our attention, mainly because its existence indicates that regular binary concatenation is not sufficient and that the developer implemented code specifically to concatenate files in the Atop binary file format.

  • https://github.com/Atoptool/atop/blob/fdf3526bd35c1a84dd11bb73110c1a1f4148e39d/atopcat.c 

Reading these three files gives a pretty clear picture of how the Atop binary file format is set up. Besides the code itself, with its write and read logic, it also helps that Gerlof Langeveld, the developer of the tool, provides a visual overview of the file format at the top of the rawlog.h header file:

/*
** structure describing the raw file contents
**
** layout raw file:    rawheader
**
**                     rawrecord                           \
**                     compressed system-level statistics   | sample 1
**                     compressed process-level statistics /
**
**                     rawrecord                           \
**                     compressed system-level statistics   | sample 2
**                     compressed process-level statistics /
**
** etcetera .....
*/

The references between the structures are as follows:

  • The rawheader specifies the size of the rawrecord;
  • The rawheader specifies the size of the uncompressed process-level (tstat) structure;
  • The rawrecord specifies the size of the compressed system-level (sstat) and compressed process-level (tstat) statistics.

The structures contain more fields that describe other sizes, but the above is the gist of what is needed to parse the file format. All that needs to be done is:

  1. Read the header and get the size of the raw record; 
  2. Read the raw record and get the sstat, tstat sizes; 
  3. Read the compressed sstat; 
  4. Read the compressed tstat; 
  5. Go back to step 2 until the end of the file is reached.
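The steps above can be sketched in a few lines of Python. This is an illustration only, not the code of the plugin: it uses the standard-library struct module as a stand-in for cstruct, and the HEADER and RECORD layouts and the MAGIC value are simplified placeholders rather than the real definitions from rawlog.h. It also assumes that the statistics are compressed with zlib.

```python
import io
import struct
import zlib

# Hypothetical, heavily simplified layouts. The real structures in rawlog.h
# contain many more fields in a different order.
HEADER = struct.Struct("<IHH")  # magic, rawreclen, tstatlen
RECORD = struct.Struct("<QII")  # curtime, scomplen, pcomplen
MAGIC = 0xFEEDBEEF              # placeholder magic value for this sketch


def iter_samples(fh):
    """Yield (curtime, sstat_bytes, tstat_bytes) for every sample in fh."""
    # Step 1: read the header and get the size of the raw record.
    head = fh.read(HEADER.size)
    if len(head) < HEADER.size:
        raise ValueError("file too short for a header")
    magic, rawreclen, _tstatlen = HEADER.unpack(head)
    if magic != MAGIC:
        raise ValueError("unexpected magic value")
    if rawreclen < RECORD.size:
        raise ValueError("raw record size too small")

    while True:
        # Step 2: read the raw record and get the compressed sizes.
        raw = fh.read(rawreclen)
        if len(raw) < rawreclen:
            break  # end of file, or a truncated trailing record
        curtime, scomplen, pcomplen = RECORD.unpack_from(raw)
        # Steps 3 and 4: read and decompress sstat and tstat.
        sstat = zlib.decompress(fh.read(scomplen))
        tstat = zlib.decompress(fh.read(pcomplen))
        yield curtime, sstat, tstat
        # Step 5: loop back to step 2.


# Usage with a synthetic two-sample log built in memory.
sys_blob, task_blob = zlib.compress(b"sys"), zlib.compress(b"tasktask")
record = RECORD.pack(1706781004, len(sys_blob), len(task_blob))
log = HEADER.pack(MAGIC, RECORD.size, 8) + (record + sys_blob + task_blob) * 2
samples = list(iter_samples(io.BytesIO(log)))
```

Reading exactly rawreclen bytes per record, instead of the size of our own structure definition, keeps the loop aligned even if the file's record is larger than the fields we know about.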

With these five steps in place, it is possible to decompress the relevant data that is stored in tstat. Using cstruct makes life a lot more pleasant when you have to deal with C structures.
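Because the header gives the size of a single uncompressed tstat structure, a decompressed process-level buffer can be walked one entry at a time. The sketch below assumes the entries are stored back to back. TASK_PREFIX is a hypothetical stand-in for the first few fields of an entry, not the real tstat layout, and the standard-library struct module is again used in place of cstruct.

```python
import struct

# Hypothetical prefix of one task entry: tgid, pid, ppid and a 16-byte name.
# The real tstat structure in the Atop sources is much larger.
TASK_PREFIX = struct.Struct("<iii16s")


def iter_tasks(tstat, tstatlen):
    """Yield a dict for every fixed-size entry in a decompressed tstat buffer."""
    if tstatlen < TASK_PREFIX.size:
        raise ValueError("tstatlen is smaller than the fields we want to read")
    # Step through the buffer entry by entry; a trailing partial entry is ignored.
    for offset in range(0, len(tstat) - tstatlen + 1, tstatlen):
        tgid, pid, ppid, name = TASK_PREFIX.unpack_from(tstat, offset)
        yield {
            "tgid": tgid,
            "pid": pid,
            "ppid": ppid,
            # C strings are NUL-terminated inside a fixed-size array.
            "name": name.split(b"\x00", 1)[0].decode("utf-8", errors="replace"),
        }


# Usage with two synthetic 40-byte entries (28 bytes of fields + 12 of padding).
entry = TASK_PREFIX.pack(1, 1, 0, b"systemd") + b"\x00" * 12
tasks = list(iter_tasks(entry * 2, tstatlen=40))
```

Using the size from the header as the stride, rather than the size of the local definition, is what lets a parser skip over fields it does not model.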

During the investigation, Atop logging was processed using a proof-of-concept of the Dissect Atop parser. After the case was finished, Hunt & Hackett improved the concept and contributed the parser to the Dissect project. The collaboration with the Dissect authors and our contribution can be followed in the pull request that we made: https://github.com/fox-it/dissect.target/pull/108

The main improvements in the plugin concern checking the version of the binary files and ensuring that only files matching a supported version are parsed. The readability and maintainability of the code were also improved. As mentioned before, when using Atop itself to parse the log files you have no control over the output. By converting the proof-of-concept code into a Dissect plugin, it is now possible to ensure that the output matches the flow records as used by Dissect, which makes it easier and more versatile to use in our Incident Response Lab processing pipeline. The Atop Dissect Target plugin[7] output can be seen below:

target-query -f atop -t / --limit 1 -q | rdump -L
[reading from stdin]
--[ RECORD 1 ]--
       hostname = dummy
         domain = None
             ts = 2024-02-01 09:50:04+00:00
        process = systemd
        cmdline = /lib/systemd/systemd auto noprompt splash --system --deserialize 43
           tgid = 1
            pid = 1
           ppid = 0
           ruid = 0
           euid = 0
           suid = 0
          fsuid = 0
           rgid = 0
           egid = 0
           sgid = 0
          fsgid = 0
           nthr = 1
         isproc = True
          state = S
         excode = -2147483648
          elaps = 0
       nthrslpi = 1
       nthrslpu = 0
        nthrrun = 0
           ctid = 0
           vpid = 0
    wasinactive = False
      container =
       filepath = /var/log/atop/atop_20240201
[...]

Conclusion

The Atop Dissect Target plugin supports versions 2.6 and 2.7 of Atop. After the plugin was created, versions 2.8, 2.9 and 2.10[8] were released with a new binary file format that is incompatible with previous versions.
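A version gate of this kind could look roughly like the sketch below. It is not the plugin's actual code. It assumes that the header stores the version as a 16-bit number with the major version in the high byte, the minor version in the low byte and the top bit used as a flag. Both that encoding and the name aversion should be checked against the Atop sources before relying on them.

```python
# Versions of the binary format that this sketch accepts (from the text above).
SUPPORTED_VERSIONS = {(2, 6), (2, 7)}


def decode_version(aversion):
    """Split a 16-bit version field into a (major, minor) tuple.

    Assumption: the top bit is a flag and must be masked off, the major
    version is in the high byte and the minor version in the low byte.
    """
    value = aversion & 0x7FFF
    return value >> 8, value & 0xFF


def is_supported(aversion):
    """Return True when the file was written by a supported Atop version."""
    return decode_version(aversion) in SUPPORTED_VERSIONS
```

Checking the version before reading any records means files from newer, incompatible versions are rejected up front instead of being parsed into garbage.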

Moving forward, we will continue to improve the Dissect framework during our Incident Response cases. By sharing our work with the wider cybersecurity community, we hope to improve our collective capabilities and contribute to the development of more effective incident response solutions.

 

References

  1. https://github.com/fox-it/dissect
  2. https://www.huntandhackett.com/blog/scalable-forensics-timeline-analysis-using-dissect-and-timesketch
  3. https://github.com/fox-it/dissect.cstruct
  4. https://www.atoptool.nl/
  5. https://github.com/Atoptool/atop
  6. https://linux.die.net/man/1/atop
  7. https://github.com/fox-it/dissect.target/blob/main/dissect/target/plugins/os/unix/log/atop.py
  8. https://github.com/Atoptool/atop/releases/tag/v2.10.0

 
