Interested to know what Rust was missing. I built an ad exchange last year and it has been great. I have been using nightly builds, mostly for access to async/await, and it has been very fast and stable.
I have had to submit a few pull requests to various projects along the way, but didn't find the ecosystem prohibitively lacking.
TechEmpower's Plaintext scenario is currently capped around 7M RPS by network limits, even though it uses a 10Gb NIC. Knowing that the Plaintext scenario is a very simple HTTP request (standard headers) that returns "Hello World!", how close to network saturation are you at 5M RPS in this case with only "2 Gigabit Ethernet cards"?
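As a hedged back-of-envelope check on that question (the ~140-byte response size is an assumption, not a figure from the article): even a minimal "Hello World!" HTTP response implies several gigabits per second at 5M RPS:

```python
# Back-of-envelope: bandwidth needed to serve 5M minimal HTTP responses/sec.
# The 140-byte response size is an assumed figure for a status line,
# standard headers, and the "Hello World!" body; the real payload may differ.
RESPONSE_BYTES = 140          # assumption, not from the article
RPS = 5_000_000               # requests per second discussed in the thread

bits_per_second = RESPONSE_BYTES * RPS * 8
gbps = bits_per_second / 1e9
print(f"~{gbps:.1f} Gb/s of response traffic alone")  # ~5.6 Gb/s
```

If "2 Gigabit Ethernet cards" means roughly 2 Gb/s of combined capacity, that is well below what 5M full HTTP responses per second would need, which is exactly the tension the question is probing.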
Did you consider Vert.x? It's built on Netty and has its own Linux epoll driver, async support, and fiber support. It's impossible to know if it would be faster, but it would likely be comparable and way less work than rolling your own.
In the TechEmpower benchmarks it exceeds 2 million HTTP requests/second, and it's a full REST framework.
And if you use the fiber support through Quasar you can pretend most things are normal blocking code.
I spent a few cycles in media buying and later in sell-side ad tech. Say what you will about advertising and its effects on the web, but I will say this: it is a world of fascinating tech. As a buyer I ran into janky pacing all the time across various platforms, because this is a HARD problem. We had to adjust campaigns manually on a daily basis to keep pacing on track. It was common to stop a campaign and still overspend by hundreds of dollars while all of the caching spun down.
I'm fascinated to see they are running all that on a single node. It's a massive amount of state aggregated from billions of events that needs to be served at extremely low latency, but couldn't it be partitioned somehow??? Google Fi/Spanner and BigTable have certainly been developed to support these issues. I've been trying to dig up what infrastructure powers Google AdX, but I haven't found anything. AdWords seems to be tied to Spanner, but AdX is/was an entirely different beast. In any case I'm quite certain that it isn't running pacing on a single, gigantic node.
As an anecdotal data point, I once configured a test campaign on DoubleClick Bid Manager (now Google DV360) about two years ago that I needed some quick exposure on. So I set a budget cap of $100 just for safety and didn't do any targeting, so I was effectively bidding on half the world's ad inventory. What I didn't check or notice was that pacing wasn't set to Even, but to Flight ASAP.
Suffice it to say, I spent $730 within _seconds_, so fast that Google's systems couldn't even switch off quickly enough to prevent a 7.3x overspend, and the only thing that saved stupid me from a five-digit spend was probably choosing an unusual ad size.
> It's a massive amount of state aggregated from billions of events that needs to be served at extremely low latency, but couldn't it be partitioned somehow???
The bidder/pacer state is not necessarily massive, and certainly it does not consist of all the gazillions of past events. Depending on the strategy/bidding model, it can range from a few MB to several GBs, something that can fit in a beefy node.
> Google Fi/Spanner and BigTable have certainly been developed to support these issues.
I doubt any external store can be used under such low latency constraints (2-10 ms) and high throughput (millions of RPS). Perhaps Aerospike, but even that is a stretch to put in the hot path. At this scale you're pretty much limited to keeping the state in memory and updating it asynchronously every couple of minutes/hours.
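The pattern described here (hot-path reads from local memory, with the external store consulted only by a periodic asynchronous refresh) can be sketched roughly as follows; the class, names, and refresh interval are illustrative, not from the article:

```python
import threading
import time

class PacingState:
    """In-memory bidder/pacer state: hot-path reads never touch an external store."""

    def __init__(self, refresh_seconds=120):
        self._state = {}                 # e.g. campaign_id -> remaining budget
        self._lock = threading.Lock()
        self._refresh_seconds = refresh_seconds

    def budget(self, campaign_id):
        # Hot path: a lock and a dict lookup -- microseconds, not a network RTT.
        with self._lock:
            return self._state.get(campaign_id, 0.0)

    def refresh(self, snapshot):
        # Cold path: swap in a fresh snapshot built elsewhere (e.g. by an
        # aggregation pipeline). Runs off the request path.
        with self._lock:
            self._state = dict(snapshot)

    def start_background_refresh(self, load_snapshot):
        # load_snapshot is a hypothetical callable that fetches the latest
        # aggregated state (from a warehouse, Aerospike, etc.).
        def loop():
            while True:
                self.refresh(load_snapshot())
                time.sleep(self._refresh_seconds)
        threading.Thread(target=loop, daemon=True).start()
```

The trade-off is exactly the one the comment names: requests see state that may be minutes stale, in exchange for keeping any external store entirely out of the hot path.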
Why, they wasted months on evaluations, with an obsession for statically typed languages that have so far not produced anything more quickly, or anything markedly better, than what others are producing with less pedantic languages.
I work with Wael. Development is still ongoing. One implementation uses Golang, the other uses F# with a library that wraps libuv for faster network performance. Pony was used to write the stress-testing client for both implementations.
> "I didn't want to rewrite everything from scratch, and definitely, I didn't want to handle all edge cases for epoll. My choice was to use libuv. The architecture I opt for: use 16 cores out of 40 for networking, having 16 'uv_loop' each running on its own thread. Callbacks will be passed from F# to each 'uv_loop' instance. The event loop will call them after parsing the bid request in C11."
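The one-event-loop-per-core layout the quote describes can be sketched in miniature; here Python's asyncio stands in for libuv (the real implementation drives libuv from C11/F#), and the loop count is scaled down from 16 to 4:

```python
import asyncio
import threading

# One event loop per thread, mirroring "16 uv_loop instances, each on its
# own thread" from the quote, scaled down to 4. asyncio is a stand-in for
# libuv here; names below are illustrative.
N_LOOPS = 4
loops = []

def run_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()

for _ in range(N_LOOPS):
    loop = asyncio.new_event_loop()
    threading.Thread(target=run_loop, args=(loop,), daemon=True).start()
    loops.append(loop)

# Dispatch a callback onto a specific loop, the way the F# side hands
# callbacks to each uv_loop instance. Here each callback just records
# which loop index ran it.
results = []
done = threading.Event()

def on_bid_request(loop_index):
    results.append(loop_index)
    if len(results) == N_LOOPS:
        done.set()

for i, loop in enumerate(loops):
    loop.call_soon_threadsafe(on_bid_request, i)

done.wait(timeout=5)
print(sorted(results))  # [0, 1, 2, 3]
```

The point of the design is that each loop owns its connections and runs on a dedicated thread, so there is no cross-core contention on the request path; work enters a loop only through its thread-safe dispatch call.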
Looks like libuv directly in C11? (not F# as before edit).
It’s kind of sad that all this engineering effort was spent to essentially make the internet a worse place for everyone and waste users’ time and attention.
Imagine if a crime syndicate would brag about their efforts to make their worldwide criminal activities more efficient.
I can totally see where you're coming from. But major engineering achievements require the efforts of many skilled people, who often like to be paid really well for their work. And the way the world works today, a lot of the big money is in fields of questionable value to society: advertising, finance, military, etc. And even in fields that seem socially valuable at first glance, like health care, most of the money comes not from healing people but from playing the game of "rip off public or private coverage providers".
Therefore I think the best we can hope for is that engineering breakthroughs achieved in profit-driven fields will gradually leak into other fields where they can actually be used to improve people's lives.
... So when it's Google or FB blogging about technology originally developed to serve ads, it's hype and cool... But when the authors are more honest about the motivation behind developing a certain piece of tech, it's "kind of sad"?
Do you legitimately think the world would be a better place if gmail, youtube, flickr, reddit, EVERY search engine, and basically every web content site disappeared?
Because that's what happens if you don't have web advertising. Free things disappear without revenue.
Or maybe you'd prefer to go back to the days of randomly-targeted or "PUNCH THE MONKEY" ads. Because THAT'S what happens without ad auctions and targeting.
The reality is: advertisers and ad-supported sites WANT to show you a relevant ad that you're likely to click (modulo obvious bad actors). That's how they get paid. Anything else is, by definition, "[wasting] users' time and attention."
I'd love to read what a crime syndicate does to improve their activities. Doesn't mean I agree with them... but no doubt it's really interesting and I might learn something from it.
mej10|6 years ago
coolsunglasses|6 years ago
sebastienros|6 years ago
nullwasamistake|6 years ago
Have you tried it, or is this a case of NIH?
reilly3000|6 years ago
endymi0n|6 years ago
Fascinating stuff indeed :)
reinhardt|6 years ago
Source: I also work in ad tech.
pas|6 years ago
For anyone else confused it's probably Google F1 and Spanner.
ggregoire|6 years ago
Guthur|6 years ago
w3clan|6 years ago
Is it Golang or Pony or F#? The CoreFX mention at the end confused me even more.
rkallos|6 years ago
tracker1|6 years ago
insulanian|6 years ago
philliphaydon|6 years ago
csdreamer7|6 years ago
I am learning Clojure, so I would like to know if anyone knows of the most performant applications written in it.
eliasson|6 years ago
This makes me curious - was it the language or the runtime characteristics?
Nextgrid|6 years ago
_cs2017_|6 years ago
mochomocha|6 years ago
packetslave|6 years ago
legohead|6 years ago
teej|6 years ago
llamataboot|6 years ago
stingraycharles|6 years ago
tgtweak|6 years ago
Found this article great; there aren't many places to see 5M req/s, let alone on a single node.
I'm really interested in hearing more about those databases!
bob809|6 years ago
[deleted]