
HTTPS on Stack Overflow: The End of a Long Road

574 points | Nick-Craver | 8 years ago | nickcraver.com

174 comments


mrunkel|8 years ago

At $previous_job we once turned on HTTPS for our entire customer website and online store, only to have our customer support team be bombarded by phone calls claiming that our "website was down."

After much teeth gnashing and research, we determined that a large segment of our user base was still using WinXP and the encryption protocols we offered weren't available to them.

We didn't think this would be a problem because the current version of the software wasn't compatible with WinXP any longer.

There was some debate internally about whether the better fix was to include the legacy encryption protocols or to just leave the HTTP version of the site running and use Strict-Transport-Security to move capable browsers to HTTPS.

In the end we had to include the legacy protocols so those customers could use our online store.
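The second option discussed above can be sketched in a few lines. This is a minimal illustration (not the commenter's actual setup, and the max-age value is an assumption): plain HTTP stays available for legacy clients, while browsers that complete one HTTPS visit see the Strict-Transport-Security header and pin themselves to HTTPS from then on.

```python
def response_headers(scheme: str) -> dict:
    """Return extra response headers for a request served over `scheme`."""
    headers = {}
    if scheme == "https":
        # Browsers that see this header rewrite future http:// navigations
        # to https:// for the next 180 days. XP-era clients that can never
        # complete the TLS handshake simply never see it and stay on HTTP.
        headers["Strict-Transport-Security"] = "max-age=15552000"
    return headers

print(response_headers("https"))  # HSTS header present
print(response_headers("http"))   # empty: legacy clients keep working
```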

madmax108|8 years ago

At $current_job we're currently in the middle of the same thing, but took the precaution of checking logs to see which customers use older encryption protocols (we're B2B), and have given them X months to upgrade their systems before we make the switch on our side.

The logic that was communicated to them was that, as a service provider, security is a prime concern for us (as it should be for them as well), so we can't keep lagging on this forever. Currently, we have $single_digit merchants we're still waiting on to make the switch.

It's made the whole switch process much easier and made customers actually appreciate our pro-activeness in this! :)
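The audit step described above — checking logs for customers on older protocols before flipping the switch — can be sketched like this. It assumes access-log lines that end with the negotiated protocol (e.g. nginx's `$ssl_protocol` variable added to the log format); the sample lines are made up.

```python
from collections import Counter

def tls_version_counts(log_lines):
    """Count negotiated TLS versions, assuming each access-log line
    carries the negotiated protocol as its last whitespace-separated
    field (e.g. nginx's $ssl_protocol)."""
    counts = Counter()
    for line in log_lines:
        counts[line.rsplit(None, 1)[-1]] += 1
    return counts

sample = [
    '10.0.0.1 - - "GET /checkout HTTP/1.1" 200 TLSv1.2',
    '10.0.0.2 - - "GET /checkout HTTP/1.1" 200 TLSv1',
    '10.0.0.3 - - "GET / HTTP/1.1" 200 TLSv1.2',
]
print(tls_version_counts(sample))  # Counter({'TLSv1.2': 2, 'TLSv1': 1})
```

Grouping the legacy-protocol hits by client or API key (rather than just counting them, as here) is what lets you contact the affected merchants directly.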

mattmanser|8 years ago

I know it's hindsight and all that, but why didn't you check your website analytics first? Seems a fairly massive assumption that should have taken 10 seconds to check.

gatlinnewhouse|8 years ago

> We didn't think this would be a problem because the current version of the software wasn't compatible with WinXP any longer.

> There was some debate internally about whether the better fix was to include the legacy encryption protocols or to just leave the HTTP version of the site running and use Strict-Transport-Security to move capable browsers to HTTPS.

Where can I read about this? Is there any way to display a special "Your browser is outdated" page for the users on WinXP?

Sorry if these seem like basic questions. I am just curious and would like to hear some expert advice.

adventured|8 years ago

Out of curiosity - roughly what year was that and what percentage of the customer base would you say was still on Windows XP at the time?

sofaofthedamned|8 years ago

Had a similar one at my last role. It was an HTML5 remote desktop thing with websockets, TLS 1.2, etc. Got a bug report from a user that it didn't work in Safari. We didn't have a Mac in the office to test with, so we asked the user for more details.

"Oh no, this isn't a Mac, it's Windows"

This is a user of a highly secure system, containing user PII, who expected to use it on a 5 year old browser with XP.

~bangs head~

technion|8 years ago

The frustrating point about a similar experience was...

You can support HTTP and the occasional knowledgeable person will suggest you should upgrade. Or you can force TLS with SSLv3 enabled, and suddenly you'll hit a flood of people letting you know you're about to be hacked, based on online scanners. Often complete with requests for a bug bounty.

jfroma|8 years ago

The other problem with Windows XP and HTTPS is SNI. Without it, you can't serve more than one domain with different SSL certificates from the same IP address; you either use SANs or different IP addresses. This doesn't only affect IE on XP but every browser there.
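The SNI problem above comes down to certificate selection at handshake time. The sketch below is pure logic, not a real TLS stack (real servers do this inside the handshake, e.g. via Python's `ssl.SSLContext.sni_callback`), and the hostnames are invented: a client that sends no SNI hostname can only ever receive the default certificate, which is why one IP serving many domains needs SNI, a SAN certificate, or extra IPs.

```python
# Hypothetical certs for two domains sharing one IP address.
CERTS = {
    "example.com": "cert-for-example",
    "shop.example.net": "cert-for-shop",
}
DEFAULT_CERT = "cert-for-example"

def pick_cert(sni_hostname):
    """Choose a certificate for one IP address serving many domains.

    Clients without SNI (e.g. IE on Windows XP) send no hostname in the
    ClientHello, so the server can only return its default cert -- every
    other domain on that IP then fails certificate validation."""
    if sni_hostname is None:          # legacy client: no SNI extension
        return DEFAULT_CERT
    return CERTS.get(sni_hostname, DEFAULT_CERT)

print(pick_cert("shop.example.net"))  # cert-for-shop
print(pick_cert(None))                # cert-for-example: the only possible answer
```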

quintin|8 years ago

Did you use any alerting mechanism that informed you about what percentage of users were affected by this?

lmm|8 years ago

> The password to our data center is pickles. I didn’t think anyone would read this far and it seemed like a good place to store it.

You ought to have more confidence in your writing. BRB stealing all your servers.

DamonHD|8 years ago

Only if you get there first. And it may in fact be pickles2.

phsource|8 years ago

This is incredibly detailed; in short, CDNs, cookies/authentication, tons of subdomains, and 3rd-party/user-generated content make it a pain to move onto HTTPS.

I was chatting with a non-engineer friend about why it's hard to estimate how long tasks often take, and this seems like a prime illustration: the dependencies are endless.

I also love the Easter egg:

"The password to our data center is pickles. I didn’t think anyone would read this far and it seemed like a good place to store it."

dzdt|8 years ago

Stack Exchange is no longer available from my workplace due to this change. We have a strict no-posting-code-fragments policy, and SE was viewed as too risky to allow without some restriction in place to make it read only. Before HTTPS, the IT department had worked out such a read-only restriction by blocking the SE login with firewall rules. But with HTTPS that kludge is no longer possible, so the site is blocked.

boyter|8 years ago

Same thing happened to me at a workplace once. They blocked StackOverflow, GitHub, Bitbucket, Sourceforge, CodePlex and Google Code.

I told them all estimates go up by 2 years since we would need to reimplement everything. It ended up being unblocked a week later.

teraflop|8 years ago

Leaving aside all the reasons why this policy is super dumb (which I'm sure others will cover quite adequately), I guess your IT department can't figure out how to create their own CA certificate and do SSL interception?

kentt|8 years ago

Banning SE really doesn't go far enough then does it? Perhaps any site with a text box should be forbidden.

Shoop|8 years ago

What sort of company do you work at? Why can't everyone just be told not to post code?

maddyboo|8 years ago

Wow, that sounds ridiculous. What's the reasoning behind that policy?

knodi123|8 years ago

Why don't they just recompile chromium without support for the textarea element, make that the only officially permitted browser, and call it a day? :-)

kbart|8 years ago

Sorry, but such a policy is just stupid. There are many, many ways one could get a snapshot of code without posting it online. I respect SE for their decision to make things right, not kneel down to customers and the faulty "security" practices one so often sees.

danielbarla|8 years ago

I honestly wonder how exactly places like this want to enforce policies like this. Do they allow you to take a phone into your workplace? Aren't they scared you will take a photo and upload the code fragment?

BinaryIdiot|8 years ago

Damn, that's even more strict than when I worked in the IC as a government contractor. I don't know how you'd get anything done, realistically.

glandium|8 years ago

Do they realize their employees can use 4G to access SE?

shaunrussell|8 years ago

If the architecture and code quality is good you should be able to open source your code and not have any security vulnerabilities.

You need to find a new job.

tomschlick|8 years ago

Just a reminder, HTTPS isn't enough. Be sure to turn the other security knobs with headers...

https://securityheaders.io/?q=https%3A%2F%2Fstackoverflow.co...

Nick-Craver|8 years ago

Yep - we're aware. I thought about putting in our Content-Security-Policy-Report-Only findings about what all would break, but the post was already a tad long. It's quite a long list of crazy things people do.

As the headers go, here's my current thoughts on each:

- Content-Security-Policy: we're considering it, Report-Only is live on superuser.com today.

- Public-Key-Pins: we are very unlikely to deploy this. Whenever we have to change our certificates it makes life extremely dangerous for little benefit.

- X-XSS-Protection: considering it, but a lot of cross-network many-domain considerations here that most other people don't have or have as many of.

- X-Content-Type-Options: we'll likely deploy this later, there was a quirk with SVG which has passed now.

- Referrer-Policy: probably will not deploy this. We're an open book.
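The rollout pattern described in the list above — ship CSP in Report-Only mode first, then flip to enforcing once the reports stop surfacing breakage — can be sketched as a simple header builder. All values here are illustrative defaults, not Stack Overflow's actual policy, and the report endpoint is made up.

```python
def security_headers(csp_report_only=True):
    """Build the security response headers discussed above.

    With csp_report_only=True the policy is observed but not enforced:
    violations are POSTed to the report-uri instead of being blocked.
    """
    csp_name = ("Content-Security-Policy-Report-Only" if csp_report_only
                else "Content-Security-Policy")
    return {
        csp_name: "default-src https:; report-uri /csp-reports",
        "Strict-Transport-Security": "max-age=31536000",
        "X-Content-Type-Options": "nosniff",
        "X-XSS-Protection": "1; mode=block",
    }

for name, value in security_headers().items():
    print(f"{name}: {value}")
```

Switching `csp_report_only` to `False` is the final, enforcing step once the report volume shows nothing legitimate would break.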

vbernat|8 years ago

Many headers presented here are questionable. X-Frame-Options should be replaced by CSP's frame-ancestors. X-XSS-Protection: 1 has long been the default for browsers that support it, and Chrome has blocked by default for the last two releases. Referrer-Policy is a matter of choice: it's useful information for the target site as long as the referrer doesn't contain sensitive information. IMO, most sites shouldn't set this header.

skc|8 years ago

Every site I've put in there gets a failing grade. From Google to Apple to Slashdot etc.

Wonder what the point is then.

mixmastamyk|8 years ago

Helpful site, but all these headers will slow down a site that doesn't need them. Too bad they aren't defaults. Hopefully http2 mitigates that enough.

kalleboo|8 years ago

Note to self: Use subdirectories, not subdomains in the future

quicklyfrozen|8 years ago

The other issue with subdomains is that some customers will insist on typing "www." in front of every domain. Since the wildcard cert won't match, those customers will see an error.

TorKlingberg|8 years ago

I feel like TLS certificates are fundamentally misdesigned there. It should be possible to have a wildcard certificate that matches all subdomains under a domain, no matter how many layers deep.
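The limitation the two comments above run into is the single-label wildcard rule: in certificate hostname matching (RFC 6125), `*` stands in for exactly one leftmost DNS label, never several. A minimal sketch of that rule:

```python
def wildcard_matches(pattern, hostname):
    """Single-label wildcard matching as TLS certificates use it:
    '*' may replace exactly one leftmost label, never more."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # '*' cannot span multiple labels
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

# A cert for *.stackexchange.com covers one level of subdomain...
print(wildcard_matches("*.stackexchange.com", "meta.stackexchange.com"))      # True
# ...but not a user who types an extra "www." in front:
print(wildcard_matches("*.stackexchange.com", "www.meta.stackexchange.com"))  # False
```

This is why the extra "www." produces a certificate error, and why a "match any depth" wildcard would require changing the certificate-validation rules themselves, not just what CAs are willing to issue.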

z3t4|8 years ago

Browsers use domains for everything from connection limits to data storage. If you use folders, everything will be shared.

baby|8 years ago

TLS kills these kinds of "cool" features, which is kind of sad :( Unless you can afford wildcard certs.

What's the argument behind LetsEncrypt not doing that? Extended Validation stuff?

tomschlick|8 years ago

Side question: any plans for IPv6?

alienth|8 years ago

No immediate plans. A decent amount of development is necessary there. There are so many places in our various systems that work with IP addresses, and many of them don't support v6 addresses.

astrodust|8 years ago

Given the scale of Stack Overflow, you'd think they could set up AAAA records that point to a proper TLS 1.3+ server and leave the peasants on IPv4 going to one that's more...accommodating.

fareesh|8 years ago

Despite the "Google gives a boost to https" reasoning, which comes from Google itself, in practice I've read several first-hand accounts of how traffic (non XP) dropped significantly right after the switch.

gub09|8 years ago

It would be better if scripts like jquery were not encrypted. This forces users to use e.g. a google service instead of caching/hosting the scripts themselves or getting them from another CDN. I do not understand why so many people do not consider the privacy implications of every single webpage requiring calls to google services. There are ways to avoid this, but it gets a lot more complicated when that requires MITM methods for SSL. Please: use a non-tracking CDN, host it yourself, or at least leave it HTTP.

janwillemb|8 years ago

Wow, I didn't expect this ("switching" to HTTPS) to be so hard.

Ajedi32|8 years ago

It very much depends on the complexity and scale of your site. StackOverflow is a bit of an extreme case.

For example, if instead of having hundreds of domains serving millions of users with tons of user-generated content you're just serving static content from a single server on a small site, the entire process for you might actually be as simple as just running `certbot-auto` on the production server.

I suspect the difficulty of switching for most sites will fall somewhere between these two extremes.

irrational|8 years ago

Yeah, we've been working on this for about a year (not continually, but as we have time to try to work through the problems). We do use subdomains though, so that is part of the problem. We keep feeling like we are getting close, but then we run into another issue. It's like a rabbit hole that has no bottom.

jontro|8 years ago

Regarding the section "Mistakes: APIs and .internal"

Why wouldn't they use split horizon DNS for this? Seems like the perfect use case

Nick-Craver|8 years ago

Split horizon would point you at the same data center, rather than the writeable one. So that's more of a .local than a .internal. We discussed this, but ultimately, on the AD version we're on (pre-2016 Geo-DNS), it's not actually supported the way you'd need, and it's a nightmare to debug.

We'd consider it for a .local, when the support is properly there in 2016. Even subnet prioritization is busted internally, so that's a bit of an issue. Evidently no one tried to use a wildcard with dual records on 2 subnets before (we prioritize the /16, which is a data center) and it's totally busted. Microsoft has simply said this isn't supported and won't be fixed. A records work, unless they're a wildcard. So specifically, the *.stackexchange.com record, which we mirror internally as *.stackexchange.com.internal for that IP set, is particularly problematic.

TL;DR: Microsoft AD DNS is busted and they have no intention of fixing it. It's not worth it to try and work around it.

quintin|8 years ago

Has anyone tried running Fastly behind Cloudflare? Are the tradeoffs worth it?

pbarnes_1|8 years ago

Why would you want to double your CDN costs for negligible benefit?

tyingq|8 years ago

Is there some reason other than cost to do that? Curious.

jcadam|8 years ago

I work at a government facility. Stack Overflow and github are now both blocked (in addition to all social media and webmail). But Hacker News is apparently ok.

Zekio|8 years ago

Your blog posts are always an interesting read

souenzzo|8 years ago

How many questions on stackoverflow were needed for this migration?

merb|8 years ago

Sadly, HAProxy (which Stack Overflow uses) doesn't support HTTP/2 directly; you need to terminate it via nginx or something else.

bullen|8 years ago

I said it before and I'll say it again: HTTPS is a waste of electricity.