ebroder | 9 years ago | on: Service discovery at Stripe
ebroder's comments
ebroder | 10 years ago | on: Now Open – AWS Asia Pacific (Seoul) Region
ebroder | 10 years ago | on: Stripe: Open Source
The original headline feature of unilog was that it wouldn't block writes if the disk filled up. multilog does block - if it can't write a line to disk, it stops ingesting data off of stdin, which eventually causes the application to hang writing to stdout.
unilog sends you an email and starts dropping log lines, which we decided better matched the tradeoffs we wanted to make - losing logs sucks, but not as much as blocking your application until you figure out how to free up disk.
ebroder | 10 years ago | on: Stripe: Open Source
This actually took a fair amount of digging! We've been using some version of unilog for over 4 years now (longer than I've been at Stripe), and we'd mostly forgotten why we switched. What follows is more the result of historical exploration and guesswork than an authoritative statement of original truth.
I'm fairly confident that the impetus for unilog was timestamp prefixes for our log lines. We wanted timestamps (so that we weren't dependent on all applications adding them). multilog is capable of writing out timestamps, but it formats them with TAI64N. We wanted something more human-parseable.
Once we had it, we started adding other features. These days, I'd say the most useful thing unilog does for us is buffer log lines in memory. We would occasionally see disk writes on EC2 hang for long enough that the in-kernel (64k) pipe buffer would fill up and cause applications to stall.
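The drop-rather-than-block behavior described in these comments can be sketched with a bounded in-memory queue. This is an illustration with made-up names, not unilog's actual implementation:

```python
import collections

class DroppingBuffer:
    """Bounded in-memory line buffer: accepts writes even when the
    consumer (the disk-writing side) stalls, dropping lines instead
    of blocking the producer."""

    def __init__(self, max_lines=10000):
        self.max_lines = max_lines
        self.lines = collections.deque()
        self.dropped = 0  # count of lines lost to back-pressure

    def push(self, line):
        # Never block the producer: drop when the buffer is full.
        if len(self.lines) >= self.max_lines:
            self.dropped += 1
            return False
        self.lines.append(line)
        return True

    def pop(self):
        # Called by the (possibly slow) disk-writing side.
        return self.lines.popleft() if self.lines else None
```

The tradeoff is exactly the one described above: a full buffer loses log lines, but the application writing to stdout never hangs.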
ebroder | 11 years ago | on: Unit tests fail when run in Australia
ebroder | 11 years ago | on: HAProxy 1.5
ebroder | 12 years ago | on: How to safely invoke Webhooks
If your client library supports SNI, you're likely going to send the wrong SNI hostname (we had to disable Ruby's SNI feature because sending an IP address in the SNI field broke a handful of sites), and cert verification generally becomes quite tricky.
We're in the process now of switching to a simple go-based HTTP proxy, which tries to pass through connections as faithfully as possible, but rejects connections to internal IP addresses. Here's the current implementation (though I'm still playing with it): https://gist.github.com/ebroder/ae9299e0078094211bde
This turns out to be way simpler - all HTTP client libraries have good support for proxies, and it doesn't interfere with anything about the SSL handshake (since HTTPS connections go through a CONNECT-based proxy).
We also looked at using Squid or Apache, but concluded that they were much heavier weight than we needed and it was difficult to make them behave as transparently as we wanted (e.g. they always want to add headers and such).
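The core check such a proxy has to make - resolve the hostname, then refuse private or loopback destinations - can be sketched like this. The real implementation is the Go code in the gist above; this Python version is illustrative only:

```python
import ipaddress
import socket

def is_internal(host):
    """Return True if any address the host resolves to is private,
    loopback, or link-local -- i.e. the proxy should reject it
    before opening an outbound connection."""
    for *_, sockaddr in socket.getaddrinfo(host, None):
        addr = ipaddress.ip_address(sockaddr[0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return True
    return False
```

Doing the resolution and the check in the proxy itself matters: checking before handing the hostname to an HTTP client leaves a window where a hostile DNS server can return a different (internal) address on the second lookup.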
ebroder | 12 years ago | on: Humble Indie Bundle 9 released
ebroder | 12 years ago | on: Introducing Stripe UK
We can do that too! (We can't specifically do SEK yet, although we're working on it - should have it soon)
We've tried to make all of this as easy as possible - you can associate bank accounts in different currencies with your Stripe account. If you make a charge in a currency you have a bank account for, we'll transfer it directly; otherwise we'll convert to your account's default currency and transfer it to that currency's bank account.
ebroder | 12 years ago | on: Introducing Stripe UK
ebroder | 12 years ago | on: Introducing Stripe UK
ebroder | 13 years ago | on: Simplify your life with an SSH config file
This fixes the UI wart where your first ssh connection to a server has to stay open for the duration of all your others (or your others all get forcibly disconnected).
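For context, the ControlMaster setup being described looks roughly like this in ~/.ssh/config (the socket directory matches the cleanup script below; the ControlPersist timeout is a matter of taste):

```
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h:%p
    ControlPersist 10m
```

ControlMaster auto makes the first connection to a host become the master, ControlPath names the multiplexing socket, and ControlPersist keeps the master alive in the background after that first session exits.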
It's not perfect. If the name of a server changes but you already have a control socket, it'll use the socket and connect to the old server. And it also takes it a while to pick up networking changes that break your connectivity, though I've hacked around that with a script I keep running in the background (Linux only, at the moment; requires gir1.2-networkmanager-1.0):
    #!/usr/bin/python
    import os
    from gi.repository import GLib, NMClient

    def active_connections_changed(*args):
        for sock in os.listdir(os.path.expanduser('~/.ssh/sockets')):
            os.unlink(os.path.join(os.path.expanduser('~/.ssh/sockets'), sock))

    c = NMClient.Client.new()
    c.connect('notify::active-connections', active_connections_changed)
    GLib.MainLoop().run()
There's some contention with my coworkers about whether ControlPersist is actually desirable given the tradeoffs, but I personally think it's a huge improvement.
ebroder | 13 years ago | on: Facebook CTO Bret Taylor Departs (For Start-Ups Unknown)
ebroder | 13 years ago | on: Tell HN: Heroku is Down (update: recovering as of 10PM PST)
ebroder | 13 years ago | on: Exploring and Dynamically Patching Django/Python Using GDB
Normally, the "finish" command will interrupt any script you're in the middle of executing, so we have to do a bit of an ugly hack to make sure our script keeps running:
    b PyEval_EvalFrameEx if strcmp(PyString_AsString(f->f_code->co_name), "handle_uncaught_exception") == 0
    commands
      disable
      frame 3
      python
    gdb.execute('finish')
    gdb.execute('shell git stash pop')
    gdb.execute('call PyImport_ReloadModule(PyImport_AddModule("monospace.views"))')
    gdb.execute('call PyImport_ReloadModule(PyImport_AddModule("monospace.urls"))')
    gdb.execute('set $self = PyDict_GetItemString(f->f_locals, "self")')
    gdb.execute('set $request = PyDict_GetItemString(f->f_locals, "request")')
    gdb.execute('set $get_response = PyObject_GetAttrString($self, "get_response")')
    gdb.execute('set $args = Py_BuildValue("(O)", $request)')
    gdb.execute('set $rax = PyObject_Call($get_response, $args, 0)')
    gdb.execute('enable')
    gdb.execute('c')
      end
      c
    end
(Bah. Edited because I couldn't get monospace text working)
You could put those commands into a file and run "gdb -p <my django process> -x <my commands file>"
Of course, instead of shelling out to git stash pop, you'd probably want to pause so you could update the code. And you may need to reload more modules than just monospace.views and monospace.urls, depending on the change.
ebroder | 13 years ago | on: Exploring and Dynamically Patching Django/Python Using GDB
Using GDB, you don't have to change the app or remember to run it in a particular way.
ebroder | 15 years ago | on: How to write a filesystem in 50 lines of code
My target application was things like automounters, or the low-latency database querying sort of thing I mention in the actual blog post. Since I wanted to be able to have the filesystem structure change as it was accessed, I decided to make any sort of caching entirely an application-layer problem, not a RouteFS-layer problem.
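As a sketch of what "application-layer caching" could look like, a RouteFS user can memoize their own expensive lookup so repeated accesses don't hit the backend each time. The function name here is hypothetical, standing in for whatever produces file contents in a real RouteFS subclass:

```python
import functools

@functools.lru_cache(maxsize=1024)
def lookup_record(key):
    # Hypothetical expensive backend query (database, network, etc.);
    # in a real RouteFS subclass this would be whatever computes the
    # contents for a given path. lru_cache makes repeat lookups free.
    return "value-for-%s" % key
```

The point of keeping this at the application layer is that the application knows when its data can go stale; RouteFS itself has no way to know.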
I think it would be possible to extend RouteFS to handle this sort of case more gracefully. One option in particular might be to take advantage of python-fuse's stateful I/O feature (which lets you associate a Python object with open file descriptors in your filesystem [1]) so that reads from the same file don't result in the same lookup over and over again, although this certainly doesn't help for directories.
But in any case, I'd certainly love to see ideas for extending RouteFS to make it more performant. Submissions in the form of patches are always excellent, but even suggestions for API changes would be welcome - feel free to open an issue on GitHub either way (http://github.com/ebroder/python-routefs/issues).
[1] See "Filehandles can be objects if you want" in http://fuse.cvs.sourceforge.net/viewvc/fuse/python/README.ne... for more information
We did consider Synapse (and Nerve[1], which they used to call SmartStack when used together) when we were building out our service discovery, and went with an alternative strategy for a couple of reasons.
Even though it's not directly in the line of fire, we weren't super excited about having to run a reliable ZooKeeper deployment, and we weren't excited about using ZooKeeper clients from Ruby, since it seems like the Java clients are far and away the most battle-hardened. (IIRC Synapse only supported ZooKeeper-based discovery when it was initially released)
We wanted a more dynamic service registration system than Synapse seems to be designed for. Changing the list of services with Synapse seems to require pushing out a config file change to all of your instances, which we wanted to avoid.
Our configuration also looked less like SmartStack when we first rolled it out - we were primarily focused on direct service-to-service communication without going through a load balancer, and we expected to be querying the Consul API directly. The shape of Consul felt like it better fit what we were trying to do. We've ended up diverging from that over time as we learned more about how Consul and our software interacted in production, and what we have now ended up looking more similar to SmartStack than what we started with.
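Querying the Consul API directly, as described above, looks roughly like this. The /v1/health/service endpoint is Consul's real health API; the service name and agent address are made up for illustration:

```python
import json
import urllib.request

def parse_health_response(body):
    """Extract (address, port) pairs from a Consul
    /v1/health/service/<name>?passing response body."""
    instances = []
    for entry in json.loads(body):
        svc = entry["Service"]
        # Service.Address may be empty, in which case the
        # node's own address applies.
        addr = svc.get("Address") or entry["Node"]["Address"]
        instances.append((addr, svc["Port"]))
    return instances

def healthy_instances(name, consul="http://127.0.0.1:8500"):
    # ?passing filters to instances whose health checks are green.
    url = "%s/v1/health/service/%s?passing" % (consul, name)
    with urllib.request.urlopen(url) as resp:
        return parse_health_response(resp.read())
```

Each caller asking the local agent for passing instances, rather than routing through a central load balancer, is the direct service-to-service shape described above.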
There aren't a _ton_ of different ways to think about service discovery (and more broadly, service-to-service communication). One of my coworkers wrote[2] about this some in the context of Envoy[3], which also looks a lot like SmartStack. It's not terribly surprising to me that a lot of them converge over time - e.g. the insight of trading consistency for availability is key.
[1] https://github.com/airbnb/nerve [2] http://lethain.com/envoy-design/ [3] https://lyft.github.io/envoy/