emfree's comments

emfree | 7 years ago | on: Why you can have millions of goroutines but only thousands of Java threads

Goroutine stacks are in fact allocated on the heap. All the details are in here: https://github.com/golang/go/blob/master/src/runtime/stack.g...

emfree | 8 years ago | on: Profiling Go Applications with Flamegraphs

A nice writeup, thanks. There are a few variations on this workflow that I've found useful in practice; perhaps they'll be helpful to some folks:

- Linux perf can profile unmodified Go programs. This is handy when your application doesn't expose the /debug/pprof endpoint. (http://brendangregg.com/FlameGraphs/cpuflamegraphs.html#perf has detailed instructions)

- Recent versions of https://github.com/google/pprof include a flamegraph viewer in the web UI. This is handy when you want a line-level flamegraph instead of a function-level flamegraph.

emfree | 8 years ago | on: PostgreSQL HA cluster failure: a post-mortem

Use cases for ProxySQL: many.

- failover

- query routing (e.g., for sharded deployments)

- caching

- workload stats/metrics

- query rewriting

etc.

emfree | 8 years ago | on: Treating performance as a product: The technical story of Asana’s rewrite

Curious: What's your strategy for measuring application performance? Would love to hear more details on how you're tracking the effect of your efforts.

emfree | 8 years ago | on: Go 1.9 is released

You piqued my curiosity :) A comment in the source for the release notes (https://github.com/golang/go/blob/master/doc/go1.9.html#L922) points to the relevant change: https://go-review.googlesource.com/c/go/+/34310, which in turn links to this issue: https://github.com/golang/go/issues/13086

emfree | 8 years ago | on: Sysbench for MySQL 5.0, 5.1, 5.5, 5.6, 5.7 and 8

I think the author is specifically evaluating low-concurrency in-memory workloads here. The previous post describes why regressions for those workloads might be "not a surprise": https://smalldatum.blogspot.com/2017/05/the-history-of-low-c...

emfree | 9 years ago | on: Ruby `reject!`

One of the older posts eloquently discusses why "accidentally quadratic" behavior is both so recurring and so insidious: http://accidentallyquadratic.tumblr.com/post/113840433022/wh...

emfree | 9 years ago | on: Scuba: Diving into Data at Facebook [pdf]

I'll second the post above -- if you miss Scuba, honeycomb.io is for you. https://honeycomb.io/blog/2016/11/honeycomb-faq-in-140-chars...

emfree | 9 years ago | on: Learning from a Year of Security Breaches

These are such great comments, thanks for sharing your insights. For folks looking for other options, I'd also mention https://honeycomb.io, perhaps the most promising newcomer in this space. It's essentially Facebook's Scuba for the rest of us.

emfree | 9 years ago | on: Signals of API Health and Performance in Cloud-Native Applications

Looks cool. Instrumenting at the network layer is certainly a promising approach. Are you recording latency distributions, and not just averages? The screenshots only show mean and median latency, which isn't enough to spot many anomalies.

emfree | 9 years ago | on: Join-Idle-Queue: Load Balancing Algorithm for Scalable Web Services (2011)

> But in web services you often care more about the tail-end latency, the p90, p99 etc.

For sure. I think Theorem 2 in the paper implicitly addresses the latency distribution in this scheme. They're saying that in the limit of a large system, the queue length distribution at a single backend server depends only on the service time distribution (how long it takes to actually process each job) and the service discipline. So if for example job sizes are exponentially distributed and handled in FIFO order, then the wait time distribution is also exponential.

It would certainly be nice to see a more explicit discussion of the tail latency, especially in the simulations the authors did.

emfree | 9 years ago | on: How does gdb work?

Great question. I wondered the same thing a while ago, and tried to build one using SystemTap (https://github.com/emfree/pystap). Couple reasons why this isn't too easy:

* "Python" in general might mean you're on Linux/Windows/whatever, and it might mean CPython, PyPy, or some other runtime. But any out-of-process instrumentation is gonna have to be pretty platform/runtime specific.

* Even if we restrict ourselves to, say, CPython on Linux, the interpreter's internals aren't super friendly to this sort of inspection from the outside. You have to rely on and also work around implementation details.

Example: to get a Python call stack, you want to look at `PyThreadState_Current` (basically the same idea as `ruby_current_thread` in that excellent linked post of Julia's, I think). But this happens to be null whenever the GIL is released, e.g. when doing network I/O, and then you're kind of out of luck. So you'll already have trouble usefully profiling a single-threaded I/O-intensive program.

* Oh and you pretty much need debug symbols in your CPython binary (I think? Tell me if this isn't true!). Most production CPython builds don't have them. So you have to get the right binary, and rebuild any application dependencies with C extensions. Not hard but annoying.

There is potential though! With some work, we definitely could have a better story for out-of-process Python profiling a la Linux perf.

emfree | 10 years ago | on: Supersingular elliptic curve isogeny Diffie-Hellman 101

Yep! In this case, I think you end up constructing, slightly more specifically, the isogeny whose kernel is exactly the cyclic subgroup generated by the point R (i.e., phi(S) is 0 iff S is a power of R). There are explicit formulas ("Vélu's formulas") that let you compute an isogeny from its kernel. Looks like the paper goes into some depth about how to do that computation efficiently, and how to ensure that you choose a cryptographically suitable point R.

emfree | 10 years ago | on: Heroku Kafka

Thanks for the insightful comment!

> The alternative if you are at a company with the resources to do so (mine is), is to build something that fits your use case better than Kafka

I'd love to hear more about this :) What did you end up doing differently from Kafka? How's it working out for you?

emfree | 10 years ago | on: Random Walks: the mathematics in 1 dimension

Here's a reference I found for one way to do it: http://www.math.nus.edu.sg/~matsr/ProbII/Lec6.pdf (Theorem 2.1). You define the Green's function G(x, y) = \sum_n Pr_x(S_n=y), where x and y are 3-vectors and Pr_x(S_n=y) is the probability that an n-step random walk starting at x ends up at y. If you have an infinite random walk starting at 0, then G(0, 0) is the expected number of times that the walk returns to 0. That's what the mathworld link calls u(3). You can use Fourier inversion to compute G(0, 0) -- the link gives the gnarly details. It's pretty cool.

emfree | 10 years ago | on: Profiling Python in Production

Author of the post here. That's a good question. I don't know if this approach is objectively better, but it has a few nice features.

* We generally favor free/open source solutions where practical.

* It is quite a bit cheaper in dollar terms.

* The actual code to make this work is very lightweight. By doing it yourself, you have total control, and can extend or tweak to get exactly the data you want. Being able to easily add bespoke instrumentation is really powerful. To give an example from one of our use cases (IMAP sync), let's say you wanted to cohort your data by mail provider. I.e., you suspect that the workload profile when syncing against server A is significantly different than syncing against server B, and you want to know for sure. It's pretty easy to take your codebase and your instrumentation, and add that by inspecting some thread-local context at runtime. Might be hard to do with an off-the-shelf commercial tool.

emfree | 10 years ago | on: How We Deploy Python Code

Ansible works well for us, although we use it in a somewhat different way than most folks. We previously wrote about our approach here, if you're curious: https://nylas.com/blog/graduating-past-playbooks

emfree | 11 years ago | on: Inbox — The next-generation email platform

Hi, Inbox engineer here. Beyond the contextIO feature set, we support creating drafts, sending mail, and client sync, so you can use the API to really build full-fledged mail clients. The Inbox sync engine indexes all the data, so the API's performance isn't limited by that of the mail provider.