emfree's comments
emfree | 7 years ago | on: Why you can have millions of goroutines but only thousands of Java threads
emfree | 8 years ago | on: Profiling Go Applications with Flamegraphs
- Linux perf can profile unmodified Go programs. This is handy when your application doesn't expose the /debug/pprof endpoint. (http://brendangregg.com/FlameGraphs/cpuflamegraphs.html#perf has detailed instructions)
- Recent versions of https://github.com/google/pprof include a flamegraph viewer in the web UI. This is handy when you want a line-level flamegraph instead of a function-level flamegraph.
emfree | 8 years ago | on: PostgreSQL HA cluster failure: a post-mortem
- failover
- query routing (e.g., for sharded deployments)
- caching
- workload stats/metrics
- query rewriting
etc.
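Of the roles above, query routing is perhaps the simplest to sketch: a proxy layer can hash a shard key to a backend. Here's a minimal hypothetical sketch (the backend DSNs and the `route` helper are invented for illustration, not from any real deployment):

```python
import hashlib

# Hypothetical shard map: in a real proxy this would come from
# configuration or a metadata service.
BACKENDS = [
    "postgres://db0.internal:5432/app",
    "postgres://db1.internal:5432/app",
    "postgres://db2.internal:5432/app",
]

def route(shard_key: str) -> str:
    """Pick a backend for a query by hashing its shard key.

    A stable hash (not Python's randomized built-in hash()) keeps
    routing consistent across proxy restarts.
    """
    digest = hashlib.sha256(shard_key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(BACKENDS)
    return BACKENDS[index]

print(route("user:42") == route("user:42"))  # deterministic routing
```

Failover support would then amount to swapping entries in the backend map when health checks fail, without clients noticing.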
emfree | 8 years ago | on: Treating performance as a product: The technical story of Asana’s rewrite
emfree | 8 years ago | on: Go 1.9 is released
emfree | 8 years ago | on: Sysbench for MySQL 5.0, 5.1, 5.5, 5.6, 5.7 and 8
emfree | 9 years ago | on: Ruby `reject!`
emfree | 9 years ago | on: Scuba: Diving into Data at Facebook [pdf]
emfree | 9 years ago | on: Learning from a Year of Security Breaches
emfree | 9 years ago | on: Signals of API Health and Performance in Cloud-Native Applications
emfree | 9 years ago | on: Join-Idle-Queue: Load Balancing Algorithm for Scalable Web Services (2011)
For sure. I think Theorem 2 in the paper implicitly addresses the latency distribution in this scheme. They're saying that in the limit of a large system, the queue length distribution at a single backend server depends only on the service time distribution (how long it takes to actually process each job) and the service discipline. So if, for example, job sizes are exponentially distributed and served in FIFO order, then the total time a job spends in the system is also exponentially distributed.
It would certainly be nice to see a more explicit discussion of the tail latency, especially in the simulations the authors did.
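That exponential-latency claim is easy to check numerically for a single server. Below is a small simulation (my own sketch, not from the paper) of an M/M/1 FIFO queue using the Lindley recursion; the parameter values are arbitrary:

```python
import random

def mm1_sojourn_times(lam, mu, n, seed=1):
    """Simulate an M/M/1 FIFO queue via the Lindley recursion.

    lam: Poisson arrival rate, mu: exponential service rate (lam < mu).
    Returns per-job sojourn times (waiting time + service time).
    """
    rng = random.Random(seed)
    w = 0.0  # waiting time seen by the current job
    out = []
    for _ in range(n):
        s = rng.expovariate(mu)    # this job's service time
        out.append(w + s)          # sojourn = wait + service
        a = rng.expovariate(lam)   # gap until the next arrival
        w = max(0.0, w + s - a)    # Lindley recursion for the next job
    return out

times = mm1_sojourn_times(lam=0.5, mu=1.0, n=200_000)
mean = sum(times) / len(times)
frac_above_mean = sum(1 for t in times if t > 2.0) / len(times)
# Theory: sojourn time is exponential with rate mu - lam, so the mean
# should be about 1 / (1.0 - 0.5) = 2.0, and if the distribution really
# is exponential, about e^-1 ~ 37% of jobs should exceed the mean.
print(round(mean, 2), round(frac_above_mean, 3))
```

Running the same recursion with non-exponential service times (or a non-FIFO discipline) is exactly the kind of tail-latency experiment I'd have liked to see in the paper's simulations.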
emfree | 9 years ago | on: How does gdb work?
* "Python" in general might mean you're on Linux/Windows/whatever, and it might mean CPython, PyPy, or some other runtime. But any out-of-process instrumentation is going to have to be pretty platform- and runtime-specific.
* Even if we restrict ourselves to, say, CPython on Linux, the interpreter's internals aren't super friendly to this sort of inspection from the outside. You have to rely on and also work around implementation details.
Example: to get a Python call stack, you want to look at `PyThreadState_Current` (basically the same idea as `ruby_current_thread` in that excellent linked post of Julia's, I think). But this happens to be null whenever the GIL is released, e.g. when doing network I/O, and then you're kind of out of luck. So you'll already have trouble usefully profiling a single-threaded I/O-intensive program.
* Oh and you pretty much need debug symbols in your CPython binary (I think? Tell me if this isn't true!). Most production CPython builds don't have them. So you have to get the right binary, and rebuild any application dependencies with C extensions. Not hard but annoying.
There is potential though! With some work, we definitely could have a better story for out-of-process Python profiling a la Linux perf.
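To make the `PyThreadState_Current` idea above concrete, here's roughly what the gdb incantation looks like. This is a hypothetical, untested sketch: the symbol is spelled `_PyThreadState_Current` in many CPython builds, the struct layout varies across versions, and the `PyStringObject` cast assumes Python 2:

```
# Attach to a running CPython built with debug symbols and walk the
# current thread's frame list, printing each frame's function name.
gdb -p $PID -batch \
  -ex 'set $ts = _PyThreadState_Current' \
  -ex 'set $f = $ts ? $ts->frame : 0' \
  -ex 'while $f' \
  -ex 'print ((PyStringObject *) $f->f_code->co_name)->ob_sval' \
  -ex 'set $f = $f->f_back' \
  -ex 'end'
```

Note this hits exactly the failure mode described above: if the GIL is released when you attach, `_PyThreadState_Current` is null and you get nothing.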
emfree | 10 years ago | on: Supersingular elliptic curve isogeny Diffie-Hellman 101
emfree | 10 years ago | on: Heroku Kafka
> The alternative if you are at a company with the resources to do so (mine is), is to build something that fits your use case better than Kafka
I'd love to hear more about this :) What did you end up doing differently from Kafka? How's it working out for you?
emfree | 10 years ago | on: Random Walks: the mathematics in 1 dimension
emfree | 10 years ago | on: Profiling Python in Production
* We generally favor free/open source solutions where practical.
* It is quite a bit cheaper in dollar terms.
* The actual code to make this work is very lightweight. By doing it yourself, you have total control, and can extend or tweak it to get exactly the data you want. Being able to easily add bespoke instrumentation is really powerful. To give an example from one of our use cases (IMAP sync): say you want to cohort your data by mail provider. I.e., you suspect that the workload profile when syncing against server A is significantly different from the profile when syncing against server B, and you want to know for sure. Since you own both the codebase and the instrumentation, it's pretty easy to add that by inspecting some thread-local context at runtime. That might be hard to do with an off-the-shelf commercial tool.
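To illustrate how lightweight this can be, here's a minimal sketch of a signal-based sampler that tags each stack sample with a thread-local cohort label. This is my own toy version, not Nylas's actual code; the `provider` field stands in for whatever cohort key (e.g. mail provider) the application stashes:

```python
import collections
import signal
import threading
import time
import traceback

context = threading.local()      # application code sets cohort labels here
samples = collections.Counter()  # (cohort, stack) -> sample count

def sample(signum, frame):
    # Summarize the interrupted stack as "outer;inner;..." function names
    # (the format flamegraph tooling expects), tagged with whatever
    # cohort label this thread set.
    stack = ";".join(fs.name for fs in traceback.extract_stack(frame))
    cohort = getattr(context, "provider", "unknown")
    samples[(cohort, stack)] += 1

signal.signal(signal.SIGPROF, sample)
signal.setitimer(signal.ITIMER_PROF, 0.01, 0.01)  # every 10ms of CPU time

def busy_work(deadline):
    while time.process_time() < deadline:
        sum(i * i for i in range(1000))

context.provider = "providerA"   # pretend this thread syncs provider A
busy_work(time.process_time() + 0.3)
signal.setitimer(signal.ITIMER_PROF, 0)  # stop sampling

for (cohort, stack), n in samples.most_common(3):
    print(cohort, n, stack.split(";")[-1])
```

Cohorting by provider then just means grouping the counter by its first key before folding the stacks into a flamegraph.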
emfree | 10 years ago | on: How We Deploy Python Code
emfree | 11 years ago | on: Inbox — The next-generation email platform