javierhonduco's comments

javierhonduco | 3 months ago | on: Anthropic acquires Bun

Wondering to what degree this was done to support Anthropic’s web crawler. Would assume that having a whole JS runtime rather than just an HTTP client could be rather useful. Just hypothesising here, no clue what they use for their crawler.

javierhonduco | 4 months ago | on: ARM Memory Tagging: how it improves C/C++ memory safety (2018) [pdf]

I am incredibly happy that Apple has added MTE support to the latest iPhones and perhaps the M5 chips as well (?). If that’s the case I don’t think any other personal computers have anything close to Apple machines in terms of memory safety and related topics (Secure Enclave etc).

Hope other vendors will ship MTE in their laptop and desktop chips soon enough. While I’m very positive about x86_64 adding support for this (ChkTag), it’ll definitely take a while…

In my opinion this is a worthwhile enough reason to upgrade, but it feels like a waste given that my current devices work great.

javierhonduco | 1 year ago | on: Navtive FlameGraphViewer

This is pretty cool work.

Something that’s been on my mind recently is that there’s a need for a high-performance flame graph library for the web. Unfortunately, the most popular flame graph libraries / components, basically the React and d3 ones, work fine, but their authors don’t actively maintain them anymore and their performance with large profiles is quite poor.

Most people that care about performance either hard-fork the Firefox profiler / speedscope flame graph component or create their own.

Would be nice to have a reusable, high performance flame graph for web platforms.

javierhonduco | 1 year ago | on: Okay, I Like WezTerm

Haven’t had the chance to play with WezTerm just yet but wanted to share that the author is incredibly smart, friendly, and humble.

Had the opportunity to work on a project together at work some years back and I can only aspire to be 1/10th as good of an engineer as him. A true hacker.

javierhonduco | 1 year ago | on: No More Blue Fridays

It is not: programs that are accepted are proved to terminate. Larger and more complex programs are accepted by BPF as of now, which might give the impression that it's now Turing complete, but that is definitely not the case.

javierhonduco | 1 year ago | on: I have no constructor, and I must initialize

It’s easy to miss this in large codebases. Having to check every single struct initialisation whenever a field is added is not practical. Some folks have mentioned that linters exist to catch implicit initialisation, but I would argue this shouldn’t require a third-party project which is completely opt-in to install and run.

javierhonduco | 1 year ago | on: I have no constructor, and I must initialize

Personally I’m not a fan of Go’s default zero-initialisation. I’ve seen many bugs caused by adding a new field and forgetting to update constructors to initialise it to a “non-zero” value. I prefer Rust’s approach where one has to be explicit.

That being said, it’s way less complex than C++’s rules and that’s welcome.
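
A minimal sketch of the failure mode described above. The `Config` struct, its fields, and both constructor functions are hypothetical names made up for illustration: a `Retries` field is added later, one construction site is not updated, and the code still compiles with the field silently set to zero.

```go
package main

import "fmt"

// Config is a hypothetical struct; Retries was added after the
// original code was written.
type Config struct {
	Timeout int
	Retries int // new field, intended default is 3
}

// NewConfig is the constructor that was updated for the new field.
func NewConfig() Config {
	return Config{Timeout: 30, Retries: 3}
}

// legacyConfig is a construction site that was missed: it still
// compiles, and Retries is silently zero-initialised.
func legacyConfig() Config {
	return Config{Timeout: 30}
}

func main() {
	fmt.Println(NewConfig().Retries)    // 3
	fmt.Println(legacyConfig().Retries) // 0
}
```

Nothing in the compiler flags `legacyConfig`, which is why this tends to be found at runtime rather than at build time.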

javierhonduco | 1 year ago | on: Fault tolerance and resilience patterns for Go

This looks incredibly comprehensive, thanks for sharing!

Should have added that I read this book in 2016, and the first edition is even older, so there’s naturally been lots of new (and exciting) developments in this area!

javierhonduco | 2 years ago | on: The return of the frame pointers

Overall, I am for frame pointers, but after some years working in this space, I thought I would share some thoughts:

* Many frame pointer unwinders don't account for a problem that DWARF unwind info doesn't have: the frame set-up is not atomic. It's done in two instructions, `push %rbp` and `mov %rsp, %rbp`, and if a snapshot is taken while we are at the `push`, we'll miss the parent frame. I think this might be fixable by inspecting the code, but it might only be as good as a heuristic, as there could be other `push %rbp` instructions unrelated to the stack frame. I would love to hear if there's a better approach!

* I developed the solution Brendan mentions which allows faster, in-kernel unwinding without frame pointers using BPF [0]. This doesn't use DWARF CFI (the unwind info) as-is but converts it into a random-access format that we can use in BPF. He mentions not supporting JVM languages, and while it's true that right now it only supports JIT sections that have frame pointers, I planned to implement a full JVM interpreter unwinder. I have since left Polar Signals and my priorities have shifted, but it's feasible to get a JVM unwinder to work in lockstep with the native unwinder.

* In an ideal world, enabling frame pointers should be done on a case-by-case basis. Benchmarking is key, and the tradeoffs that you make might change a lot depending on the industry you are in and what your software is doing. In the past I have seen large projects enabling/disabling frame pointers without doing an in-depth assessment of the losses/gains in performance and observability, and how they connect to business metrics. The Fedora folks have done a superb and rigorous job here.

* Related to the previous point, having a build system that enables you to change this system-wide, including in the libraries your software depends on, can be awesome to not only test these changes but also put them in production.

* Lastly, I am quite excited about SFrame that Indu is working on. It's going to solve a lot of the problems we are facing right now while letting users decide whether they use frame pointers. I can't wait for it, but I am afraid it might take several years until all the infrastructure is in place and everybody upgrades to it.
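
The non-atomic set-up from the first bullet can be reproduced with a toy model. This is only a sketch: the stack is a word-indexed slice, the addresses are made up, and the `snapshot` type and `unwind` function are hypothetical names rather than any real unwinder's API. It models `_start -> main -> parent -> child` and samples once on child's first instruction and once after its prologue.

```go
package main

import "fmt"

// snapshot is a simplified view of the registers and stack at sample time.
// Memory is word-indexed, so "addresses" are just slice indices.
type snapshot struct {
	rip, rbp uint64
	stack    []uint64
}

// unwind follows the frame pointer chain: [rbp] holds the caller's rbp and
// the word above it holds the return address. A saved rbp of 0 ends the walk.
func unwind(s snapshot) []uint64 {
	pcs := []uint64{s.rip}
	for fp := s.rbp; fp != 0 && int(fp)+1 < len(s.stack); fp = s.stack[fp] {
		pcs = append(pcs, s.stack[fp+1])
	}
	return pcs
}

// sampleStack models _start -> main -> parent -> child with made-up addresses.
func sampleStack() []uint64 {
	st := make([]uint64, 16)
	st[15] = 0x050 // return address into _start
	st[14] = 0     // main's frame: saved rbp of _start (end of chain)
	st[13] = 0x110 // return address into main
	st[12] = 14    // parent's frame: saved rbp of main
	st[11] = 0x210 // return address into parent
	st[10] = 12    // child's frame: saved rbp of parent (written by the push)
	return st
}

func main() {
	// Sampled on child's first instruction: rbp still points at parent's frame.
	mid := snapshot{rip: 0x300, rbp: 12, stack: sampleStack()}
	// Sampled after child's prologue has finished.
	done := snapshot{rip: 0x308, rbp: 10, stack: sampleStack()}

	fmt.Printf("%#x\n", unwind(mid))  // child, main, _start: parent is missing
	fmt.Printf("%#x\n", unwind(done)) // child, parent, main, _start
}
```

In the first sample the return address into `parent` sits on the stack just above the stack pointer, but the walk never reads it because `rbp` has not been updated yet.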

- [0]: https://web.archive.org/web/20231222054207/https://www.polar...
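
As a rough idea of what a random-access format like the one in the second bullet buys you, here is a sketch of a table sorted by program counter and searched with binary search. The `row` layout, the `lookup` function, and the addresses are assumptions for illustration only, not the actual format described in [0].

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical, simplified unwind table entry: from pc onwards
// (until the next row), the CFA is rsp + cfaOffset.
type row struct {
	pc        uint64
	cfaOffset int64
}

// lookup returns the row covering pc, assuming table is sorted by pc.
func lookup(table []row, pc uint64) (row, bool) {
	// Index of the first row whose pc is greater than the target.
	i := sort.Search(len(table), func(i int) bool { return table[i].pc > pc })
	if i == 0 {
		return row{}, false // pc is before the first row
	}
	return table[i-1], true
}

func main() {
	table := []row{
		{pc: 0x1000, cfaOffset: 8},  // function entry: only the return address is pushed
		{pc: 0x1001, cfaOffset: 16}, // after push %rbp
		{pc: 0x1020, cfaOffset: 8},  // after pop %rbp, before ret
	}
	r, ok := lookup(table, 0x1010)
	fmt.Println(r.cfaOffset, ok) // 16 true
}
```

Unlike evaluating DWARF CFI for each address, a lookup like this takes a bounded number of steps, which is the kind of property that matters when the code has to pass the BPF verifier.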

javierhonduco | 2 years ago | on: The return of the frame pointers

There's always room for improvement, for example, Samply [0] is a wonderful profiler that uses the same APIs that `perf` uses, but unwinds the stacks as they come rather than dumping them all to disk and then having to process them in bulk.

Samply unwinds significantly faster than `perf` because it caches unwind information.

That being said, this approach still has some limitations, such as that very deep stacks won't be unwound, as the size of the process stack the kernel sends is quite limited.

- [0]: https://github.com/mstange/samply
