ebroder's comments

ebroder | 9 years ago | on: Service discovery at Stripe

(I work at Stripe, and helped with a lot of our early Consul rollout)

We did consider Synapse (and Nerve[1] - together they're what's called SmartStack) when we were building out our service discovery, and went with an alternative strategy for a couple of reasons.

Even though it's not directly in the line of fire, we weren't super excited about having to run a reliable ZooKeeper deployment; and we weren't excited about using ZooKeeper clients from Ruby, since it seems like the Java clients are far and away the most battle-hardened. (IIRC, Synapse only supported ZooKeeper-based discovery when it was initially released.)

We wanted a more dynamic service registration system than Synapse seems to be designed for. Changing the list of services with Synapse seems to require pushing out a config file change to all of your instances, which we wanted to avoid.

Our configuration also looked less like SmartStack when we first rolled it out - we were primarily focused on direct service-to-service communication without going through a load balancer, and we expected to be querying the Consul API directly. The shape of Consul felt like it better fit what we were trying to do. We've ended up diverging from that over time as we learned more about how Consul and our software interacted in production, and what we have now ended up looking more similar to SmartStack than what we started with.
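For illustration, querying Consul directly means hitting its standard HTTP health API (GET /v1/health/service/&lt;name&gt;?passing) and turning the response into host:port pairs. A minimal sketch of that parsing step - the response shape is Consul's documented one, but none of this is Stripe's actual client code:

```python
import json

def passing_instances(health_json):
    """Turn a Consul /v1/health/service/<name>?passing response body into
    (address, port) pairs. Consul leaves Service.Address empty when the
    service didn't register its own address, in which case the node's
    address is the one to use."""
    instances = []
    for entry in json.loads(health_json):
        svc = entry["Service"]
        addr = svc.get("Address") or entry["Node"]["Address"]
        instances.append((addr, svc["Port"]))
    return instances

# A response shaped like Consul's health endpoint output:
sample = json.dumps([
    {"Node": {"Address": "10.0.0.5"},
     "Service": {"Address": "", "Port": 8080}},
    {"Node": {"Address": "10.0.0.6"},
     "Service": {"Address": "10.0.1.6", "Port": 8080}},
])
print(passing_instances(sample))  # [('10.0.0.5', 8080), ('10.0.1.6', 8080)]
```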

There aren't a _ton_ of different ways to think about service discovery (and more broadly, service-to-service communication). One of my coworkers wrote[2] about this some in the context of Envoy[3], which also looks a lot like SmartStack. It's not terribly surprising to me that a lot of them converge over time - e.g. the insight of trading consistency for availability is key.

[1] https://github.com/airbnb/nerve [2] http://lethain.com/envoy-design/ [3] https://lyft.github.io/envoy/

ebroder | 10 years ago | on: Now Open – AWS Asia Pacific (Seoul) Region

I'd argue that m3 and c3 instances shouldn't be classified as "previous generation", since the m4 and c4 instances don't have any local ephemeral storage. Given the recent EBS incident in GovCloud, I think it's still pretty reasonable to be skeptical of EBS.

ebroder | 10 years ago | on: Stripe: Open Source

And an update! I talked with the engineer that wrote unilog originally.

The original headline feature of unilog was that it wouldn't block writes if the disk filled up. multilog does block - if it can't write a line to disk, it stops ingesting data off of stdin, which eventually causes the application to hang writing to stdout.

unilog sends you an email and starts dropping log lines, which we decided better matched the tradeoffs we wanted to make - losing logs sucks, but not as much as blocking your application until you figure out how to free up disk.
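That tradeoff can be sketched as a hypothetical writer (not unilog's actual code) that drops and alerts instead of blocking:

```python
alerts = []

def notify_operator():
    # Stand-in for unilog's email alert (hypothetical).
    alerts.append("disk full; dropping log lines")

def write_line(out, line, state):
    """Try to write a log line; if the write fails (e.g. ENOSPC), drop the
    line rather than block the producer, and alert once per incident."""
    try:
        out.write(line)
        out.flush()
        state["dropped_since_alert"] = 0
        return True
    except OSError:
        if state["dropped_since_alert"] == 0:
            notify_operator()
        state["dropped_since_alert"] += 1
        return False

class FullDisk:
    """Fake sink simulating a full disk."""
    def write(self, line):
        raise OSError(28, "No space left on device")
    def flush(self):
        pass
```

Feeding lines into a FullDisk sink drops them all but only sends one alert, which is exactly the behavior described above.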

ebroder | 10 years ago | on: Stripe: Open Source

(I'm an infrastructure engineer at Stripe)

This actually took a fair amount of digging! We've been using some version of unilog for over 4 years now (longer than I've been at Stripe), and we'd mostly forgotten why we switched. What follows is more the result of historical exploration and guesswork than an authoritative statement of original truth.

I'm fairly confident that the impetus for unilog was timestamp prefixes for our log lines. We wanted timestamps (so that we weren't dependent on all applications adding them). multilog is capable of writing out timestamps, but it formats them with TAI64N. We wanted something more human-parseable.
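For reference, a TAI64N label is "@" followed by 16 hex digits of seconds (offset by 2^62) and 8 hex digits of nanoseconds. A rough sketch of decoding one - note this deliberately ignores the TAI-UTC leap-second offset, so the result is off by a few tens of seconds:

```python
from datetime import datetime, timezone

def tai64n_to_utc(label):
    """Decode a TAI64N timestamp label like multilog writes. Ignores the
    TAI-UTC leap-second offset, so this is approximate, not exact."""
    hexdigits = label.lstrip("@")
    secs = int(hexdigits[:16], 16) - 2**62   # TAI64 labels are offset by 2^62
    nanos = int(hexdigits[16:24], 16)
    return datetime.fromtimestamp(secs, tz=timezone.utc).replace(
        microsecond=nanos // 1000)

print(tai64n_to_utc("@400000000000000a0007a120"))
# → 1970-01-01 00:00:10.000500+00:00
```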

Once we had it, we started adding other features. These days, I'd say the most useful thing unilog does for us is buffer log lines in memory. We would occasionally see disk writes on EC2 hang for long enough that the in-kernel (64k) pipe buffer would fill up and cause applications to stall.
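A hypothetical sketch of that buffering idea (not unilog's actual implementation): a large in-memory queue sits between the producer and the disk, so a stalled write doesn't back up into the small kernel pipe buffer.

```python
import queue
import threading

class BufferedWriter:
    """Absorb write stalls in memory: the producer enqueues lines into a
    large buffer and a background thread drains it to the real sink, so a
    temporarily slow disk doesn't stall the writing application."""
    def __init__(self, sink, max_lines=100000):
        self.q = queue.Queue(maxsize=max_lines)
        self.sink = sink
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, line):
        self.q.put(line)  # only blocks if the big buffer itself fills up

    def flush(self):
        self.q.join()     # wait for the drain thread to catch up

    def _drain(self):
        while True:
            line = self.q.get()
            self.sink(line)
            self.q.task_done()
```

Usage: `bw = BufferedWriter(some_file.write)` and then `bw.write(line)` from the hot path; the producer only ever touches the in-memory queue.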

ebroder | 11 years ago | on: Unit tests fail when run in Australia

Take a tcpdump and open it in Wireshark - you can't see the content of the requests, but the TLS ClientHello will probably include an SNI extension that tells you what hostname you're trying to connect to.

ebroder | 12 years ago | on: How to safely invoke Webhooks

For Stripe's webhooks, we currently use the mechanism described in this post (resolve the IP address, munge the URL we're connecting to, and manually add a Host header), but we've run into a lot of problems.

If your client library supports SNI, you're likely going to send the wrong SNI hostname (we had to disable Ruby's SNI feature because sending an IP address in the SNI field broke a handful of sites), and cert verification generally becomes quite tricky.
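The munging step looks roughly like this (an illustrative helper, not our actual code) - and you can see where the trouble comes from, since the client library now only ever sees the IP:

```python
from urllib.parse import urlsplit, urlunsplit

def munge_url(url, resolved_ip):
    """Rewrite a URL to point at a pre-resolved IP, returning the new URL
    plus the Host header to send. For HTTPS this is where SNI and cert
    verification break: the client will present the IP, not the hostname."""
    parts = urlsplit(url)
    host_header = (parts.hostname if parts.port is None
                   else "%s:%d" % (parts.hostname, parts.port))
    netloc = (resolved_ip if parts.port is None
              else "%s:%d" % (resolved_ip, parts.port))
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query,
                       parts.fragment)), {"Host": host_header}

print(munge_url("https://example.com/webhook?x=1", "93.184.216.34"))
# → ('https://93.184.216.34/webhook?x=1', {'Host': 'example.com'})
```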

We're in the process now of switching to a simple Go-based HTTP proxy, which tries to pass through connections as faithfully as possible but rejects connections to internal IP addresses. Here's the current implementation (though I'm still playing with it): https://gist.github.com/ebroder/ae9299e0078094211bde

This turns out to be way simpler - all HTTP client libraries have good support for proxies, and it doesn't interfere with anything about the SSL handshake (since HTTPS connections go through a CONNECT-based proxy).
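The key safety check in such a proxy is deciding which destination addresses count as internal. A sketch of that check using Python's ipaddress module - the exact set of blocked ranges is a policy choice, not necessarily what our Go proxy does:

```python
import ipaddress

def is_internal(ip):
    """Return True for addresses a webhook proxy should refuse to connect
    to: private ranges, loopback, link-local, multicast, and reserved or
    unspecified addresses."""
    addr = ipaddress.ip_address(ip)
    return (addr.is_private or addr.is_loopback or addr.is_link_local
            or addr.is_reserved or addr.is_multicast or addr.is_unspecified)

print(is_internal("10.0.0.1"), is_internal("93.184.216.34"))  # True False
```

Remember the check has to run on the IP the proxy actually dials (after DNS resolution), or a hostname that resolves to an internal address slips through.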

We also looked at using Squid or Apache, but concluded that they were much heavier weight than we needed and it was difficult to make them behave as transparently as we wanted (e.g. they always want to add headers and such).

ebroder | 12 years ago | on: Humble Indie Bundle 9 released

Sorry about that! Has to do with some anti-fraud tools we've built into Stripe Checkout. Should be fixed now, so you shouldn't see that again, Ghostery or no. Feel free to let me know ([email protected]) if you (or anyone) still sees it.

ebroder | 12 years ago | on: Introducing Stripe UK

(Reply to https://news.ycombinator.com/item?id=6219355 - can't reply directly)

We can do that too! (We can't specifically do SEK yet, although we're working on it - should have it soon)

We've tried to make all of this as easy as possible - you can associate bank accounts in different currencies with your Stripe account. If you make a charge in a currency you have a bank account for, we'll transfer it directly; otherwise we'll convert to your account's default currency and transfer it to that currency's bank account.
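The routing rule is simple enough to sketch (hypothetical names, just to make the rule above concrete):

```python
def route_transfer(charge_currency, bank_account_currencies, default_currency):
    """Pick the payout currency per the rule described above: pay out
    directly if there's a bank account in the charge's currency, otherwise
    convert to the account's default currency first. Returns the payout
    currency and whether a conversion happens."""
    if charge_currency in bank_account_currencies:
        return charge_currency, False   # direct transfer, no conversion
    return default_currency, True       # convert, then transfer

print(route_transfer("eur", {"gbp", "eur"}, "gbp"))  # ('eur', False)
print(route_transfer("usd", {"gbp"}, "gbp"))         # ('gbp', True)
```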

ebroder | 12 years ago | on: Introducing Stripe UK

If you're getting cheaper offers from other providers, you should get in touch with us ([email protected]). We can do volume discounts, but we can also just help you compare prices - many other providers have cheaper sticker prices, but because they frequently have other fees, we'll often come out cheaper than you think.

ebroder | 12 years ago | on: Introducing Stripe UK

That's actually no longer true - as of today for UK users, we can convert USD to GBP or EUR before transferring it to you. Just create the charge with a currency of "usd", and as long as you haven't manually set up a USD bank account (which you probably haven't), everything should just work.

ebroder | 13 years ago | on: Simplify your life with an SSH config file

If you have a new enough ssh client, I'd personally recommend setting ControlPersist yes.

This fixes the UI wart where your first ssh connection to a server has to stay open for the duration of all your others (or your others all get forcibly disconnected).
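For concreteness, a minimal ssh_config stanza that enables this (ControlMaster, ControlPath, and ControlPersist are standard OpenSSH options; the ~/.ssh/sockets directory has to exist already):

```
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h:%p
    ControlPersist yes
```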

It's not perfect. If the name of a server changes but you already have a control socket, ssh will use the socket and connect to the old server. It also takes a while to pick up networking changes that break your connectivity, though I've hacked around that with a script I keep running in the background (Linux only, at the moment; requires gir1.2-networkmanager-1.0):

  #!/usr/bin/python
  import os
  from gi.repository import GLib, NMClient

  SOCKET_DIR = os.path.expanduser('~/.ssh/sockets')

  def active_connections_changed(*args):
      # The network changed; remove every control socket so the next ssh
      # makes a fresh connection instead of reusing a dead master.
      for sock in os.listdir(SOCKET_DIR):
          os.unlink(os.path.join(SOCKET_DIR, sock))

  c = NMClient.Client.new()
  c.connect('notify::active-connections', active_connections_changed)
  GLib.MainLoop().run()
There's some contention with my coworkers about whether ControlPersist is actually desirable given the tradeoffs, but I personally think it's a huge improvement.

ebroder | 13 years ago | on: Exploring and Dynamically Patching Django/Python Using GDB

Hmm, it'd probably be doable. In particular, you can use the "commands" command to script what happens when you hit a breakpoint (a friend talked about this in a Ksplice GDB blog post: https://blogs.oracle.com/ksplice/entry/8_gdb_tricks_you_shou...)

Normally, the "finish" command will interrupt any script you're in the middle of executing, so we have to do a bit of an ugly hack to make sure our script keeps running:

  b PyEval_EvalFrameEx if strcmp(PyString_AsString(f->f_code->co_name), "handle_uncaught_exception") == 0
  commands
  disable
  frame 3
  python
  gdb.execute('finish')
  gdb.execute('shell git stash pop')
  gdb.execute('call PyImport_ReloadModule(PyImport_AddModule("monospace.views"))')
  gdb.execute('call PyImport_ReloadModule(PyImport_AddModule("monospace.urls"))')
  gdb.execute('set $self = PyDict_GetItemString(f->f_locals, "self")')
  gdb.execute('set $request = PyDict_GetItemString(f->f_locals, "request")')
  gdb.execute('set $get_response = PyObject_GetAttrString($self, "get_response")')
  gdb.execute('set $args = Py_BuildValue("(O)", $request)')
  gdb.execute('set $rax = PyObject_Call($get_response, $args, 0)')
  gdb.execute('enable')
  gdb.execute('c')
  end
  c
  end


You could put those commands into a file and run "gdb -p <my django process> -x <my commands file>"

Of course, instead of shelling out to git stash pop, you'd probably want to pause so you could update the code. And you may need to reload more modules than just monospace.views and monospace.urls, depending on the change.

ebroder | 13 years ago | on: Exploring and Dynamically Patching Django/Python Using GDB

You're right! You could do all of this with pdb, but only if you have enough foresight to run the app under pdb to begin with (which I definitely never do).

Using GDB, you don't have to change the app or remember to run it in a particular way.

ebroder | 15 years ago | on: How to write a filesystem in 50 lines of code

Yeah, that definitely can be a weakness of RouteFS's style.

My target application was things like automounters, or the low-latency database querying sort of thing I mention in the actual blog post. Since I wanted to be able to have the filesystem structure change as it was accessed, I decided to make any sort of caching entirely an application-layer problem, not a RouteFS-layer problem.

I think it would be possible to extend RouteFS to handle this sort of case more gracefully. One option in particular might be to take advantage of python-fuse's stateful I/O feature (which lets you associate a Python object with open file descriptors in your filesystem [1]) so that reads from the same file don't result in the same lookup over and over again, although this certainly doesn't help for directories.
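Stripped of the FUSE plumbing, that stateful-I/O idea looks something like this (illustrative only - not python-fuse's actual API): do the expensive lookup once when the file is opened, and serve every subsequent read from the cached bytes.

```python
class CachedHandle:
    """Sketch of a stateful file handle: one backend lookup at open time,
    then all reads are served out of the cached data."""
    def __init__(self, fetch):
        self.data = fetch()          # one backend lookup per open()
    def read(self, size, offset):
        return self.data[offset:offset + size]

calls = []
def fetch():
    calls.append(1)                  # stand-in for an expensive database query
    return b"hello world\n"

h = CachedHandle(fetch)
print(h.read(5, 0), h.read(5, 6), len(calls))  # b'hello' b'world' 1
```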

But in any case, I'd certainly love to see ideas for extending RouteFS to make it more performant. Submissions in the form of patches are always excellent, but even suggestions for API changes would be welcome - feel free to open an issue on GitHub either way (http://github.com/ebroder/python-routefs/issues).

[1] See "Filehandles can be objects if you want" in http://fuse.cvs.sourceforge.net/viewvc/fuse/python/README.ne... for more information
