brendangregg's comments

brendangregg | 1 year ago | on: No More Blue Fridays

Yes, we know eBPF must attach to events equivalent to those on Linux, but given there are already many event sources and consumers in Windows, the work is to make eBPF another consumer -- not to invent instrumentation frameworks from scratch.

Just to use an analogy: Imagine people do their banking on JavaScript websites with Google Chrome, but if they use Microsoft Edge it says "JavaScript isn't supported, please download and run this .EXE". I'm not sure we'd be asking "if" Microsoft would support JavaScript (or eBPF), but "when."

brendangregg | 1 year ago | on: No More Blue Fridays

Right, and we wanted to talk about all security solutions and not make this about one company. We also wanted to avoid shaming since they have been seriously working on eBPF adoption, so in that regard they are at the forefront of doing the right thing.

brendangregg | 1 year ago | on: Capturing Linux SSL/TLS plaintext without a CA certificate using eBPF

Right; disabling eBPF doesn't solve this. And the bigger point is that this kind of eBPF is still super-user only.

Apart from the more exotic facilities, the critical facilities that would be hard to disable include LD_PRELOAD for interposers/shims (as you mentioned), and gdb for just setting breakpoints on crypto functions. And if neither of those existed, then I may have to edit openssl code and recompile my own edited version. And if that wasn't allowed (signed libraries) then maybe I'd edit the application code or binaries.

brendangregg | 1 year ago | on: Linux Crisis Tools

My preference is tools that give a rolling output, as they let you capture the time-based pattern and share it with others, including in JIRA tickets and SRE chatrooms, whereas the tops generally clear the screen. atop by default also sets up logging and runs a couple of daemons in systemd, so it's more than just a handy tool when needed: it's now adding itself to the operating table. (I think I did at least one blog post about performance monitoring agents causing performance issues.) Just something to consider.
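As a rough illustration of the rolling-output style (not any particular tool's implementation), here is a minimal Python sketch that prints one timestamped line per interval and never clears the screen, so the output can be pasted into a ticket as-is. The function names `format_sample` and `rolling_loadavg` are made up for this example, and it assumes a Unix-like system where `os.getloadavg()` is available.

```python
import os
import time


def format_sample(timestamp, loadavg):
    """Return one log line: local wall-clock time plus 1/5/15-minute load averages."""
    clock = time.strftime("%H:%M:%S", time.localtime(timestamp))
    one, five, fifteen = loadavg
    return f"{clock} load1={one:.2f} load5={five:.2f} load15={fifteen:.2f}"


def rolling_loadavg(count=5, interval=1.0):
    """Print `count` samples, one per line, appending rather than redrawing."""
    for i in range(count):
        # os.getloadavg() is Unix-only; it raises OSError if unavailable.
        print(format_sample(time.time(), os.getloadavg()), flush=True)
        if i < count - 1:
            time.sleep(interval)
```

Calling `rolling_loadavg(count=10, interval=1.0)` would append ten lines to the terminal, leaving the time-based pattern visible in scrollback.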

I've recommended atop in the past for catching short-lived processes because it uses process accounting, although the newer bpf tools provide more detail.

brendangregg | 1 year ago | on: The return of the frame pointers

It's not ridiculous at all. Who are you?

You are backing away from your other positions, for example:

> I fail to understand the reasoning of it "being simple" or "microbenchmarkey". It's far from the truth I think.

Do you now agree that TPC-B is too simple and microbenchmarky? And if not, please tell me (as I'm working on the problem of industry benchmarking in general) what would it take to convince someone like you to stop elevating obsoleted benchmarks like TPC-B? Is there anything?

brendangregg | 1 year ago | on: The return of the frame pointers

I didn't pull this argument out of nowhere, please read the direct comment I was replying to. Your position is also completely untenable: this benchmark was obsoleted by its creators 29 years ago, who very clearly say it is obsolete, and you're arguing that it isn't because it "still runs."

I'm guessing that this discussion would be more productive if you would please say who you are and the company you work for. I'm Brendan Gregg, I work for Intel, and I'm well known in the performance space. Who are you?

brendangregg | 1 year ago | on: The return of the frame pointers

I think you missed the context of what I was responding to, which was about whether databases could even have micro-benchmarks.

You also missed the word "Obsolete" splattered all over the website you sent me, and the text that TPC-B was "Obsolete as of 6/6/95".

brendangregg | 1 year ago | on: The return of the frame pointers

For a busy 64-CPU production JVM, I tested Google's Java symbol logging agent that just logged timestamp, symbol, address, size. The c2 compiler was so busy, constantly, that the overhead of this was too high to be practical (beyond startup analysis). And all this was generating was a timestamp log to do symbol lookup. For DWARF to walk stacks there are a lot more steps, so while I could see it work for light workloads I doubt it's practical for the heavy production workloads I typically analyze. What do you think? Have you tested on a large production server where c2 is a measurable portion of CPU constantly, the code cache is >1Gbyte and under heavy load?

brendangregg | 1 year ago | on: The return of the frame pointers

If I call the same "get statistics" command over and over in a loop (with zero queries), or 100% the same invalid query (to test the error path performance), I believe we'd call that a micro-benchmark, despite involving a full database. It's a completely unrealistic artificial workload to test a particular type of operation.

The pgbench docs make it sound microbenchmark-y by describing making the same call over and over. If people find that this simulates actual production workloads, then yes, it can be considered a macro-benchmark.
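To make the error-path example concrete, here is a minimal sketch of that kind of micro-benchmark in Python. It uses an in-memory SQLite database purely as a stand-in (an assumption for illustration; it is not PostgreSQL or pgbench), and the function name `bench_error_path` and the table name `no_such_table` are hypothetical. It issues 100% the same invalid query in a loop and times it.

```python
import sqlite3
import time


def bench_error_path(iterations=10_000):
    """Run the same invalid query `iterations` times.

    Returns (errors_seen, elapsed_seconds).
    """
    conn = sqlite3.connect(":memory:")  # stand-in database, assumed for illustration
    errors = 0
    start = time.perf_counter()
    for _ in range(iterations):
        try:
            # The table does not exist, so every call takes the error path.
            conn.execute("SELECT * FROM no_such_table")
        except sqlite3.OperationalError:
            errors += 1
    elapsed = time.perf_counter() - start
    conn.close()
    return errors, elapsed
```

The number this produces says something about one narrow operation and nothing about how a production mix of queries would behave, which is what makes it a micro-benchmark despite involving a real database engine.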

brendangregg | 1 year ago | on: The return of the frame pointers

From the docs: "pgbench is a simple program for running benchmark tests on PostgreSQL. It runs the same sequence of SQL commands over and over"

While it might call itself a benchmark, it behaves very microbenchmark-y.

The other numbers I and others have shared have been from actual production workloads. Not a simple program that tests the same sequence of commands over and over.

brendangregg | 2 years ago | on: Ubuntu 24.04 LTS will enable frame pointers by default

I don't know where 1-2% comes from, but for many large-scale production workloads I studied it was so close to 0% that it was tough to measure beyond noise on the cloud. That's not to say that 1-2% is wrong, but that it's likely someone's workload and other people see less.

Helping people find ~30-3000% perf wins, helping debugging and automated bug reports, is huge. For some sites it may be like 300 steps forward, one step back. But it's also not the end of the road here. Might we go back to frame pointer omission one day by default if some other emerging stack walkers work well in the future for all use cases? It's a lot of ifs and many years away, and assumes a lot of engineering work continues to be invested for recovering a gain that's usually less than 1%, but anything's possible.
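A back-of-the-envelope sketch of that tradeoff, under assumptions that are mine and not measured: treat the frame pointer cost as a fractional increase in runtime, and treat a "perf win" of X% as a throughput improvement of X% (runtime divided by 1 + X). The function name `net_runtime` and the example figures are illustrative only.

```python
def net_runtime(baseline, fp_overhead, perf_win):
    """Runtime after paying the frame pointer overhead and applying a perf win.

    fp_overhead: fractional slowdown, e.g. 0.01 for 1% (assumed figure).
    perf_win:    fractional throughput gain, e.g. 0.30 for 30% (assumed meaning).
    """
    return baseline * (1.0 + fp_overhead) / (1.0 + perf_win)


# With a 1% overhead and the low end of the range (a 30% win), runtime drops
# to 1.01 / 1.30 of the baseline; with no win found, it rises by the overhead.
with_win = net_runtime(1.0, 0.01, 0.30)
without_win = net_runtime(1.0, 0.01, 0.0)
```

Under this model the overhead only dominates when no win is ever found; any win larger than the overhead itself already comes out ahead.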

There's a couple of problems with an apt reinstall. One is that people often don't work on performance until the system is melting down -- many times I've been handed an issue where apt is dreadfully slow due to the system's performance issue and just installing a single package can take several minutes -- imagine reinstalling everything, it could turn the outage into over an hour! The other is I'd worry that reinstalling everything introduces so many changes (updating library versions) that the problem could change and you'd have no idea which package update changed it. If there was such an apt reinstall command, I know of large sites (with experience with frame pointer overheads) that would run it and then build their BaseAMI so that it was the default. Which is what Ubuntu is doing anyway.

brendangregg | 3 years ago | on: I bought a CO2 monitor and it broke me

I've had a few CO2 meters, and learned not to trust any <$100 as they don't really work. I look for meters that use non-dispersive infrared diffusion sensors (NDIRs), like the Aranet4. I've found the TIM10 desktop model from co2meter.com to be accurate (AFAICT); it uses NDIRs and is only US$139.

I also have other air quality meters. (I collect measuring devices.) I wish there was a do-it-all air meter.

brendangregg | 3 years ago | on: iOS Ships Dvorak, Finally

Why not dvorak one-handed (both right and left)? I've tried it when I've injured one hand badly, and it's optimized for one-handed typing.

brendangregg | 3 years ago | on: Is bin-opening in cockatoos leading to an innovation arms race with humans?

Random cockatoo/FAANG story: I had an all-hands meeting earlier this year at Netflix, with me remote in Sydney, and a cockatoo knocked on my glass office door wanting to be let in -- so I'm saying "stop it! stop it!" when I realized my mic was on, and I'd said that to the whole org as my new manager was speaking. I quickly explained to everyone that I wasn't criticizing the meeting, oh no, I was actually talking to a bird that was knocking on the door.

I then felt it best to post a video of the bird knocking so people didn't think I was crazy. They knock with their beak: tap tap tap. Gets annoying when they do it at 6am to wake you up.

Recently I have noticed cockatoos raiding the bins in Sydney, it's definitely a thing.
