Why did OKCupid write their own web server?

154 points| mhp | 15 years ago |answers.onstartups.com | reply

104 comments

[+] tseabrooks|15 years ago|reply
My only real complaint about the nature of HN is the (as perceived by me) strong anti-C++ bent, like the one in these comments, from what seem like people who don't have lots of experience using it in the workplace. Am I imagining this anti-C++ bias? Am I wrong and all the people bashing C++ have tons of C++ experience?

I'll be the first to point out the numerous flaws C++ has but it just feels like folly to make fun of the ugly chick at the party without realizing everyone at the party is covered in warts... (This is a metaphor for all programming languages having problems)

[+] tptacek|15 years ago|reply
(I'm a recovering C++ dev).

C++ has problems unique to C++: a deceptive illusion of abstraction (what Spolsky would call "leaky" abstractions) makes a bunch of idioms in the language dangerous, including virtually all "smart" pointers, iterators, exceptions, allocators, and arrays. It is uniquely difficult to write reliable code in C++.

Add to that the superficial but more practical and common complaints against C++: the ghastly compile times, the header file dependency hell that forces every class into a kabuki dance of "pImpls" and nested classes, the error messages rendered in ancient Sumerian... you know I could go on, but you get the idea.

C++ is just not a very good language. C is a fine language. If you are building a system for which C's abstractions are inadequate --- and I'll stipulate that such systems exist --- you're better off with C and a very good glue language (Lisp and Lua are two good, popular choices) than you are with a uniform C++ implementation.
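
To make the iterator hazard mentioned above concrete, here is a minimal sketch. It is not from the comment itself; `drop_evens` is a hypothetical name chosen for illustration. Erasing from a `std::vector` invalidates the erased iterator and every iterator after it, so the loop has to continue from the iterator that `erase()` returns rather than incrementing the stale one.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes every even value from v in place.
// vector::erase invalidates the erased iterator and everything after it,
// so the loop continues from the iterator erase() returns instead of
// incrementing the old one.
void drop_evens(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end(); /* advanced in the body */) {
        if (*it % 2 == 0) {
            it = v.erase(it);  // erase() hands back the next valid iterator
        } else {
            ++it;
        }
    }
    // Postcondition (checked in debug builds only): no even value is left.
    assert(std::none_of(v.begin(), v.end(), [](int x) { return x % 2 == 0; }));
}
```

Writing `v.erase(it); ++it;` instead compiles without complaint and is undefined behavior, which is roughly the kind of trap the comment is describing.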

[+] nandemo|15 years ago|reply
Besides tptacek's reply, see also Yossi Kreinin's C++ FQA (Frequently Questioned Answers):

http://yosefk.com/c++fqa/faq.html

Subjectively I don't find HN particularly anti-C++. It's more that people are younger and started programming in languages like Python and Ruby, so they don't have a pro-C++ bias. If anything I feel that many people on HN are anti-Java. Personally I'm thankful for Java's existence because it managed to replace C++ in a lot of places.

[+] rwmj|15 years ago|reply
With a tiny team of very smart programmers, starting from scratch, C++ can certainly be made to work.

I wrote a web server in C from first principles back in 2001* for many of the same reasons that OKCupid seem to have done - and with a team of one good programmer that worked out really well.

* http://web.archive.org/web/20070705191513/http://www.annexia...

[+] wglb|15 years ago|reply
My strong anti-C++ bent comes from having used it in demanding production situations for 12 years. It is a beast. Even Scott Meyers, of http://www.amazon.com/Effective-Specific-Addison-Wesley-Prof... fame, reads a book on C++ template programming and is surprised, no, astonished! at some of the things done there, and attempts to write auto_ptr and fails at least twice.

It takes way too long to learn (probably 2 years for a developer working with it 8 hours a day), and the grown-ups don't like it either: http://www.amazon.com/Coders-Work-Reflections-Craft-Programm...

This isn't even an ugly chick.

[+] ww520|15 years ago|reply
There is an anti-C++ bias on HN. More people here are working on Web-related projects, where C++ is not popular.
[+] starpilot|15 years ago|reply
Just saying that this thread has been fascinating and illuminating for a novice C++/C programmer like myself. It reads like a decent article comparing the two languages, though I'm no expert and can't judge for sure.

This is why I read HN.

[+] rhizome|15 years ago|reply
C++'s reputation was negatively affected by Microsoft's studied dominance of university CS curricula. Back in the VB days C++ was the step up and I still see it in job ads on par with Java (Sun's side of the curriculum equation), though by now I merely read it as a veiled call for university graduates who learned programming only in school, coderbots.
[+] unknown|15 years ago|reply

[deleted]

[+] ErrantX|15 years ago|reply
The question is phrased very much in the now, e.g. "Isn't the technology stack basically a commodity at this point?"

And the answer highlights why you should never retrospectively "judge" design choices several years after the fact.

[+] cheez|15 years ago|reply
I've had an opportunity to chat with some people tangentially related to OKC and I think that they have definitely done some very cool stuff. There are some things that the OKWS architecture does very well. As I understand it, there is a bit of "Rails envy" but they seem to copy good ideas very quickly. That being said, I think that if they were to start again, they would try and use commodity technology.

But, these guys are really fucking brilliant and productive. Immensely... I feel like a chump in comparison.

[+] cschep|15 years ago|reply
Yes. How often do I say, "I wish this thing ____" or "oh, that's just not possible." They just decided to write a bunch of C++ to make it possible. Awesome!
[+] rubashov|15 years ago|reply
Well these days writing an http server in C++ is about 30 lines of code using POCO or boost.asio. If your business model was "serve a shit-ton of dynamic requests super cheap" it might make just as much sense as ever to build the whole thing in C++ embedded directly in a C++ webserver.
[+] gdulli|15 years ago|reply
OKCupid is the only web site I can think of that has a regular, not occasional, pattern of being very slow and simply not responding to an unusual number of requests. Hitting F5 to reload a page just to get it to show up instead of the Firefox server unavailable message is a regular part of my usage of the site.

Even though all sites have bugs, broken links, what have you, I don't know any other site that's given me such an expectation that it will be unresponsive for a significant number of page views for any given session over a long term period. Even the sites that started development circa 2003.

[+] rosser|15 years ago|reply
I can only speak for myself, but I've pretty much never seen OkC perform like you're describing, and I've maintained an intermittently active account since '04.
[+] maxtaco|15 years ago|reply
We rarely get complaints such as this one. Would you mind helping me debug it?
[+] absconditus|15 years ago|reply
Have you never used reddit?
[+] starpilot|15 years ago|reply
I've used okc for years and have never noticed this.
[+] soulclap|15 years ago|reply
Sounds like you have never experienced Facebook like I do. I only spend about five minutes on it daily and it is all reloads and multiple clicks.
[+] dustingetz|15 years ago|reply
"general rpc servers for solving specific problems using in memory data structures (e.g., who qualifies for a match search given dozens of constraints and millions of users; what your match score is with 10,000 qualifying people, given you've all answered hundreds of different questions each on average) ... Great tech is available now, serving is cheaper, and you probably don't have the computational workload OkCupid does."

I know people who have used OKC before. OKC users in my social class (male, white, educated) ignore the match percentages, because the SNR is really low. They just plow through all the search results of people to find good pictures and interesting profiles.

So, I'd speculate that match-percentages are a marketing thing, and that they know they made a weak business decision which required lots of computation and now they're stuck with it.

I'm probably wrong. Maybe the long-tail users pay attention to match percentage.

[+] neild|15 years ago|reply
I signed up for OKC, lurked for a while, and then sent a message to the person at the top of my match list.

We're still together, five years later.

I'm probably an outlier, but hey--match percentage works some of the time!

[+] gfunk911|15 years ago|reply
I don't ignore the match percentages at all. Obviously they aren't perfectly correlated, but they're pretty solid. People with very low match numbers are basically 100% duds.
[+] DrStalker|15 years ago|reply
I think of the match percentage as the equivalent of having HR look at tech resumes before forwarding them on to me; it's not perfect, but it saves me from reading through completely irrelevant profiles.
[+] starpilot|15 years ago|reply
You probably are wrong, and you imply you've never used okc. Match percentages matter a lot to me, since they reliably indicate users I'd get along with, based on my interactions with other users. Their data also show that message response rates correlate strongly with match percentages.
[+] tobias3|15 years ago|reply
If you have a small team and everybody has a lot of C++ experience, you can pull this off. Otherwise one person who doesn't have the discipline to do the manual memory management right can crash the whole server. Don't try it at home ;) Use a VM language instead, which can recover from such errors.

(I wrote some C++ webapps myself)

[+] maxtaco|15 years ago|reply
Manual memory management is for the birds. Use something like Boost's shared pointers and you never need to worry about it.
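
A minimal sketch of the reference-counted ownership suggested above. It uses `std::shared_ptr`, the standardized descendant of Boost's `shared_ptr`, as a stand-in so that it builds without Boost; that substitution is an assumption, as are the hypothetical names `Session`, `live_sessions`, and `share_then_release`.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical per-request object. The counter exists only to make
// construction and destruction observable.
static int live_sessions = 0;

struct Session {
    Session() { ++live_sessions; }
    ~Session() { --live_sessions; }
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
};

// Shares one Session among `owners` handles, releases them all, and
// returns how many Session objects are still alive afterwards.
int share_then_release(int owners) {
    assert(owners > 0);
    auto session = std::make_shared<Session>();  // reference count: 1
    // Copying the handle bumps the count; no Session is copied.
    std::vector<std::shared_ptr<Session>> handles(owners - 1, session);
    assert(session.use_count() == owners);
    handles.clear();   // count drops back to 1, the Session is still alive
    session.reset();   // last owner gone: ~Session runs here, no delete needed
    return live_sessions;
}
```

Calling `share_then_release(3)` leaves no live `Session` behind. One caveat to "never need to worry about it": two objects holding `shared_ptr`s to each other form a cycle that is never freed unless one side uses a `weak_ptr`.
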
[+] Johngibb|15 years ago|reply
I'd think they'd begin transitioning to a higher level language now that there are many options available. It's gotta be a burden at this point to be (1) maintaining their own web server and (2) developing new features in C++. I'd way rather use ruby/python (or even .Net) and fall to C++ for the really performance intensive stuff.

(Disclaimer: I interviewed @ OKCupid in 2007)

[+] jacques_chester|15 years ago|reply
I remember this coming up at reddit a few months ago[1]. At the time I downloaded and read the paper on the design[2]. It all made sense to me because my own thinking had been heading in the same direction.

OKWS is less a web server than it is an architecture of servers. It's the difference between sendmail and qmail/postfix.

It has nice security and performance properties because each service is run as a separate user, with a separate process. Logging is handled by an independent daemon. Request demultiplexing is handled by a simple daemon that binds to port 80. Actual HTTP parsing is handled by a shared library that services link to.
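
As a rough illustration of the demultiplexing step described above, here is a sketch of a routing table that maps the first URL path segment to a service. This is not OKWS code: the class `Demux`, its methods, and the service names are hypothetical, and a plain string stands in for what would be a separate process running as a separate user.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical routing table: first URL path segment -> service name.
// In the architecture described above each service is its own process;
// here a name stands in for that process.
class Demux {
public:
    void add_service(const std::string& prefix, const std::string& service) {
        assert(!prefix.empty());
        routes_[prefix] = service;
    }

    // Returns the service registered for a path such as "/search?q=x",
    // or an empty string if no service matches.
    std::string route(const std::string& path) const {
        if (path.empty() || path[0] != '/') return "";
        // The first segment ends at the next '/' or at the query string.
        const std::string::size_type end = path.find_first_of("/?", 1);
        const std::string prefix = path.substr(
            1, end == std::string::npos ? std::string::npos : end - 1);
        const auto it = routes_.find(prefix);
        return it == routes_.end() ? "" : it->second;
    }

private:
    std::map<std::string, std::string> routes_;
};
```

With `"search"` registered to a search service, `route("/search?q=x")` returns that service's name and `route("/unknown")` returns an empty string. The point of keeping this piece small is that it is the only part that has to bind to port 80.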

[1] http://www.reddit.com/r/programming/comments/exkk3/ok_webser...

[2] http://pdos.csail.mit.edu/~max/docs/okws.pdf

[+] itsnotvalid|15 years ago|reply
Any language could possibly be made to work. They did make it work with things like SFSLite[1], which looks like what coroutines or fibers (or Stratified JavaScript, though that is not common) would do to solve async callbacks. However, those act as extensions to the core language.

One of the biggest problems C++ has is that the core language has too much stuff yet still lacks things that people really want to use. It's certainly workable, and the results are fast since it compiles very well. However, 'workable' does not mean 'a pleasure to work with'.

[1]: http://www.okws.org/doku.php?id=sfslite:tame2:tutorial , http://www.okws.org/doku.php?id=sfslite

[+] guelo|15 years ago|reply
I imagine one big downside of this custom stack is that they will have a hell of a time doing any sort of integration with Match.com.

Hiring and training is also probably more difficult, though that has got to be a huge boon to OKCupid engineers since Match cannot afford to lose them.