TL;DR: General-purpose compression algorithms are pretty good. Can we beat them at compressing ETL files with a low-effort optimization? Even if we could, it doesn’t mean we should.
Event Tracing for Windows (ETW) is outright amazing. If you are a developer on Windows (native, managed, or even web) and have never used it or (God forbid) never even heard of it, I can tell you that you are missing out on a very useful technology. I’m not an industry veteran nor an ETW specialist, but I have already solved many incredibly complex customer issues using it (and its GUI analyzer, called the “Windows Performance Analyzer”) throughout my career. If you are not familiar with ETW, I suggest you start by reading some of Bruce Dawson’s blog posts about investigations he made using it.
Anyways, this blog post will not be about ETW in general, but about its binary output files, which carry the “.etl” extension. If you’ve ever taken a trace with the Windows Performance Recorder (or xperf), you know that these files can end up being pretty huge. Run system-wide tracing with sampled profiling and context switch recording for a few minutes, and we are already talking about gigabytes. Let’s see if we can beat general-purpose compression algorithms using semantic knowledge about the content of ETL files.