
All my favorite tracing tools

355 points | trishume | 2 years ago | thume.ca

40 comments


crdavidson|2 years ago

I wrote Spall, one of the lightweight profilers mentioned in the post. I loved the author's blog post on implicit in-order forests. It was neat to see someone else's take on trees for big traces, and it pushed me to go way bigger than I was originally planning!

Thankfully, Eytzinger-ordered 4-ary trees work totally fine at 165+ fps, even at 3+ billion functions, but I like to read back through that post once in a while just in case I hit that perf wall someday.
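
For readers unfamiliar with the layout, here is a minimal sketch of a 4-ary tree stored in Eytzinger (breadth-first) order in a flat array. It is not Spall's actual implementation: the function name, the choice of a max aggregate, and the padding to a power of 4 are all assumptions made for illustration. The point is the index arithmetic, where the children of node `i` sit at `4*i + 1` through `4*i + 4`, so no pointers are needed.

```python
def build_4ary_max_tree(values):
    """Build an implicit 4-ary max tree in breadth-first (Eytzinger) order.

    Hypothetical helper for illustration only. Returns (tree, n_internal),
    where tree[0] is the root and the leaves start at index n_internal.
    """
    # Pad the leaf level to a power of 4 so every internal node has 4 children.
    n_leaves = 1
    while n_leaves < len(values):
        n_leaves *= 4

    # Levels above the leaves hold 1 + 4 + 16 + ... = (n_leaves - 1) / 3 nodes.
    n_internal = (n_leaves - 1) // 3

    tree = [float("-inf")] * (n_internal + n_leaves)
    tree[n_internal:n_internal + len(values)] = values

    # Fill internal nodes bottom-up: each node is the max of its 4 children.
    for i in range(n_internal - 1, -1, -1):
        first_child = 4 * i + 1
        tree[i] = max(tree[first_child:first_child + 4])
    return tree, n_internal


durations = [3, 1, 4, 1, 5, 9, 2, 6]
tree, n_internal = build_4ary_max_tree(durations)
```

A viewer could walk such a tree from the root and stop descending once a node covers less than a pixel, drawing the aggregate instead of every event.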

Working on timestamp delta-compression at the moment to pack events into much smaller spaces, and hopefully get to 10 billion in 128 GB RAM sometime soon (at least for native builds of Spall).
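
As a rough illustration of the general idea (not Spall's format, which the comment does not describe), timestamps that increase monotonically can be stored as differences from the previous value, with each difference packed into a variable-length integer. The function names and the LEB128-style byte encoding below are assumptions.

```python
def encode_timestamps(timestamps):
    """Delta-encode non-decreasing integer timestamps as LEB128-style varints."""
    out = bytearray()
    prev = 0
    for ts in timestamps:
        delta = ts - prev
        if delta < 0:
            raise ValueError("timestamps must be non-decreasing")
        prev = ts
        # Emit 7 bits per byte; the high bit marks "more bytes follow".
        while True:
            byte = delta & 0x7F
            delta >>= 7
            if delta:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                break
    return bytes(out)


def decode_timestamps(data):
    """Inverse of encode_timestamps."""
    timestamps = []
    prev = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += delta
            timestamps.append(prev)
            delta = 0
            shift = 0
    return timestamps


stamps = [1_000_000, 1_000_050, 1_000_200, 1_002_000]
packed = encode_timestamps(stamps)
```

Here four timestamps that would take 32 bytes as fixed 64-bit integers fit in 8 bytes, because events that are close together in time produce small deltas.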

Thanks for the kick to keep on pushing!

criddell|2 years ago

If you work on Windows applications, check out Event Tracing for Windows (ETW). The best place to start is Bruce Dawson’s blog:

https://randomascii.wordpress.com/2015/09/24/etw-central/

yakubin|2 years ago

In my opinion, the best way to interact with ETW is through DTrace. Microsoft’s GUIs like WPA-Xperf are so buggy and unreliable that using them feels utterly futile. DTrace on Windows, on the other hand, is very usable.

maccard|2 years ago

If you're working with ETW traces, Superluminal [0] (no affiliation, just a happy customer) is leaps and bounds ahead of the built-in ETW viewer.

[0] https://superluminal.eu/

RicoElectrico|2 years ago

Isn't ETW a total trainwreck from a developer usability standpoint? Or so my colleagues (and the interwebs) tell me.

Veserv|2 years ago

A pretty good overview of open source solutions in the space.

It misses one of the most useful areas for tracing, though: time travel debugging. There are a number of interesting solutions there that take advantage of hardware trace, instrumentation, and deterministic replay. It's even better when you get full visualization integration: you can zoom in from a multi-minute trace onto a suspicious 200 ns function, then double-click it to step back to that exact point in your program, with memory fully reconstructed as of that time, and debug from there.

trishume|2 years ago

Do you know of anyone who's built that kind of time travel debugging with a trace visualization in the open outside of JavaScript? I know about rr and Pernosco but don't know of trace visualization integration for either of them. That would indeed be very cool. I definitely dream of having systems like this.

hibbelig|2 years ago

Is there a time traveling debugging solution for Java?

jeffrallen|2 years ago

The author mentions DTrace in passing. If you're into "load bearing rants", check out bcantrill's recent rant on bpftrace silently losing events and why DTrace won't do that.

trishume|2 years ago

I haven't actually used bpftrace myself, only BCC. I can totally imagine it being more janky than DTrace; BCC is pretty janky even if I also think it's cool. In my eBPF tracing framework I had to add special handling counters to alert you if it ever lost any events, so it's plausible bpftrace didn't do that.
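
The pattern of counting lost events can be sketched in a few lines. This is a toy model in plain Python, not the eBPF framework mentioned above: the class name and methods are hypothetical, and a real tracer would keep the counter next to its kernel-side ring buffer. The idea is only that a full buffer increments a counter instead of failing silently, and the consumer reads that counter along with the events.

```python
class BoundedEventBuffer:
    """Toy fixed-capacity event buffer that counts dropped events (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = []
        self.dropped = 0

    def push(self, event):
        """Store the event, or count it as dropped when the buffer is full."""
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Return (events, dropped_count) and reset both."""
        events, self.events = self.events, []
        dropped, self.dropped = self.dropped, 0
        return events, dropped


buf = BoundedEventBuffer(capacity=2)
for name in ["a", "b", "c"]:
    buf.push(name)
events, dropped = buf.drain()
if dropped:
    print(f"warning: {dropped} event(s) lost")
```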

danobi|2 years ago

What kind of events were being lost, and under what conditions? I'd like to see if it can be fixed.

kristjansson|2 years ago

The "you can feel like lights flickering on" one?

Always_Anon|2 years ago

DTrace is a generation behind eBPF. There's a reason why the tracing community has moved on to eBPF and is no longer interested in DTrace.

kqr|2 years ago

> I wanted to correlate packets with userspace events from a Python program, so I used a fun trick: Find a syscall which has an early-exit error path and bindings in most languages, and then trace calls to that which have specific arguments which produce an error.

Wow. This is some great engineering. Obviously that's what you'd do, but I'd never think of it in a thousand years!
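
A minimal sketch of what the userspace side of that trick might look like, under assumptions: the quoted passage does not say which syscall or arguments were used, so `os.access`, the `MAGIC_MODE` value, and the `trace-marker/` path scheme here are all hypothetical. On Linux, an access mode with bits outside the read/write/execute flags should be rejected as invalid early in the syscall, and a tracer could filter on that mode value to pick out the markers.

```python
import os

# Hypothetical marker value: no R_OK/W_OK/X_OK bits set, only bits outside
# that range, so the call is expected to fail rather than check a real file.
MAGIC_MODE = 0x5A50


def emit_marker(event_id):
    """Make a cheap failing call that a syscall tracer could match on.

    The path carries the event id and deliberately does not exist.
    os.access reports failure as False instead of raising.
    """
    path = f"trace-marker/{event_id}"
    return os.access(path, MAGIC_MODE)


result = emit_marker(42)
```

The tracer side (for example, a probe on the syscall entry that checks the mode argument and records the path string) is not shown, since its details depend on the tracing tool.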

ElijahLynn|2 years ago

What a great way to recruit! The ending pitch to join Tristan at Anthropic, if I were competent enough in this area, is very alluring! Tristan does a great job covering the content about the types of things one would be working on.

p.s. I think the blog post could use more screengrabs of the traces. Great first pass at it though, and screengrabs can be added over time!

felixrieseberg|2 years ago

I wish the industry had a better answer for deterministically profiling the execution cost of JavaScript. Attempts were made in Chromium by hooking into Linux perf, but that change has since been removed.

If anyone has any tips on how to trace JavaScript (not just profile by time, but deterministically measure the cost of it in CI), I'd love to hear them!

zubairq|2 years ago

Some great tools in here, thanks!