The HTTP crash course nobody asked for

902 points | g0xA52A2A | 3 years ago | fasterthanli.me

141 comments

Joker_vD|3 years ago

> HTTP/1.1 is a delightfully simple protocol, if you ignore most of it.

As someone who had to write a couple of proxy servers, I can't express how so sadly accurate it is.

chrismorgan|3 years ago

And this is why I expect HTTP/2 and HTTP/3 to be much more robust in the long term: the implementations are harder to write, and you won’t get anywhere without reading at least some of the spec. HTTP/1, by contrast, is deceptively simple, so there are a lot of badly incorrect implementations, often with corresponding security problems.

SamuelAdams|3 years ago

I feel like this applies to many technologies. Made me think of the bootstrapping, “I-can-build-that-in-a-weekend” crowd.

The initial problem is usually easy to solve; it’s all the edge cases and other details that make something complex.

cookiengineer|3 years ago

> As someone who had to write a couple of proxy servers, I can't express how so sadly accurate it is.

Chunked transfer/content encoding problems still give me nightmares...

Donckele|3 years ago

“By contrast, I think about Bluetooth a lot. I wish I didn't.”

LOL, yes, same here. Can’t wait for Bluetooth's b̶a̶l̶l̶s̶ baggage to be chopped.

danuker|3 years ago

How is WiFi so much more reliable than Bluetooth?

I installed a web server on my phone and send files this way much faster (and Android -> Apple works):

https://f-droid.org/en/packages/net.basov.lws.fdroid/

I wish there were a standard for streaming (headphones could connect to your network via WPS, and stream some canonical URL with no configuration needed).

leinadho|3 years ago

The humorous style is very refreshing, if only my networking lecturers had been more witty I might remember more of this

X-Istence|3 years ago

> This is not the same as HTTP pipelining, which I will not discuss, out of spite.

That's because HTTP pipelining was and is a mistake, and it is responsible for a ton of HTTP request smuggling vulnerabilities, because the HTTP/1.1 protocol has no framing.

No browser supports it anymore, thankfully.

mgaunard|3 years ago

Isn't "HTTP pipelining" just normal usage of HTTP/1.1?

Anyone that doesn't support this is broken. My own code definitely does not wait for responses before sending more requests, that's just basic usage of TCP.
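
To make the pipelining and framing point above concrete, here is a minimal sketch in Python. It assumes every response declares its body size with Content-Length (no chunked encoding, no close-delimited bodies); the names pipeline and split_responses are made up for illustration. The client writes several requests back to back without waiting, and the only way to tell where one response ends and the next begins is to trust the length in each head, which is where parsers can disagree.

```python
from typing import List, Tuple


def pipeline(requests: List[bytes]) -> bytes:
    """Pipelining on the wire: requests are simply written back to back."""
    return b"".join(requests)


def split_responses(stream: bytes) -> List[Tuple[bytes, bytes]]:
    """Split back-to-back responses into (head, body) pairs.

    Assumption: bodies are framed by Content-Length only.
    """
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete response head")
        length = 0
        # Skip the status line, then look for Content-Length.
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(rest) < length:
            raise ValueError("truncated body")
        responses.append((head, rest[:length]))
        # Whatever follows the body is the start of the next response.
        stream = rest[length:]
    return responses
```

If two parties on the path compute a different length for the same message, they will disagree about where the next message starts, which is the general shape of a smuggling bug.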

yfiapo|3 years ago

> We're not done with our request payload yet! We sent:

> Host: neverssl.com

> This is actually a requirement for HTTP/1.1, and was one of its big selling points compared to, uh...

> AhAH! Drew yourself into a corner didn't you.

> ...Gopher? I guess?

I feel like the author must know this... HTTP/1.0 supported but didn't require the Host header; HTTP/1.1 made it mandatory, which allowed consistent name-based virtual hosting on web servers.
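
As a rough illustration of that difference, here is a small Python sketch that builds a bare GET request for either version. The function name build_get is hypothetical, and the sketch only covers the request line and the Host header, nothing else a real client would send.

```python
from typing import Optional


def build_get(path: str, version: str, host: Optional[str] = None) -> bytes:
    """Build a minimal GET request; every line ends with CRLF."""
    # HTTP/1.1 requires Host; HTTP/1.0 merely tolerates it.
    if version == "1.1" and host is None:
        raise ValueError("HTTP/1.1 requests must carry a Host header")
    lines = [f"GET {path} HTTP/{version}"]
    if host is not None:
        lines.append(f"Host: {host}")
    # An empty line terminates the request head.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

Calling build_get("/", "1.1", "neverssl.com") yields the same bytes you would type into nc by hand, while build_get("/", "1.0") omits the header entirely.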

I did appreciate the simple natures of the early protocols, although it is hard to argue against the many improvements in newer protocols. It was so easy to use nc to test SMTP and HTTP in particular.

I did enjoy the article's notes on the protocols; however, the huge sections of code snippets lost my attention midway.

proto_lambda|3 years ago

> I feel like the author must know this

The author does know this, it's a reference to a couple paragraphs above:

> [...] and the HTTP protocol version, which is a fixed string which is always set to HTTP/1.1 and nothing else.

> (cool bear) But what ab-

> IT'S SET TO HTTP/1.1 AND NOTHING ELSE.

I_complete_me|3 years ago

That was an excellent, well-written, well-thought-out, well-presented, interesting, humorous, enjoyable read. Coincidentally I recently did a Rust crash course so it all made perfect sense - I am not an IT pro. Anyhows, thanks.

pohuing|3 years ago

I highly recommend taking a look at the other writeups on fasterthanli.me; they're almost all excellent.

mihneawalker|3 years ago

I'd like to ask which Rust crash course you took, as there are quite a few out there, and a recommendation for a specific one would help.

becquerel|3 years ago

After the string of positive adjectives, I was expecting the second half of your comment to take a sharp turn into cynicism. Thank you for subverting my expectations by not subverting my expectations!

q-base|3 years ago

I will piggyback on your comment as I totally agree. I am amazed at the amount of work that must go into not just writing the article itself but all the implementations along the way. Really amazing job!

Andys|3 years ago

I learned HTTP1 pretty well but not much of 2.

Since playing with QUIC, I've lost all interest in learning HTTP/2, it feels like something already outdated that we're collectively going to skip over soon.

fasterthanlime|3 years ago

I tend to agree with you there, however the thing I'm replacing does HTTP/2, and HTTP/3 is yet another can of worms as far as "production multitenant deployment" goes, so, that's what my life is right now.

As far as learning goes, I do think HTTP/2 is interesting as a step towards understanding HTTP/3 better, because a lot of the concepts are refined: HPACK evolves into QPACK, and flow control still exists but is neatly separated into QUIC. I've only taken a cursory look at H3 so far, but it seems like a logical progression that I'm excited to dig into deeper, after I've gotten a lot more sleep.

masklinn|3 years ago

FWIW HTTP/3 very much builds upon / reframes HTTP/2’s semantics, so it might be useful to get a handle on /2, as I’m not sure all the /3 documentation will frame it in /1.1 terms.

pcthrowaway|3 years ago

HTTP1 is definitely outdated (it was expeditiously replaced by HTTP 1.1), but I'd argue ignoring HTTP/2 might be more like ignoring IPv4 because we have IPv6 now

Joker_vD|3 years ago

It's pretty much a transport-level protocol, just like QUIC.

Icathian|3 years ago

Amos' writing style is just so incredibly good. I don't know anyone else doing these very long-form, conversational style articles.

Plus, you know, just an awesome dev who knows his stuff. Huge fan.

mcspiff|3 years ago

https://xeiaso.net/ is equally great content in a similar style, in my opinion. The topics are a bit different, but I enjoy both very much.

juped|3 years ago

If you're using OpenBSD nc already, just use nc -c for TLS.

stevewatson301|3 years ago

Depending on your version of nc, -c is for sending CRLFs or executing sent data as commands. You might be looking for ncat instead.

rpigab|3 years ago

This is awesome, didn't read all of it yet, but I will for sure, I use HTTP way too much and too often to ignore some of these underlying concepts, and when I try to look it up, there's always way too much abstraction and the claims aren't proven to me with a simple example, and this article is full of simple examples. Thanks Amos!

est|3 years ago

I hope there's a h2 or TLS crash course.

fasterthanlime|3 years ago

Against my better judgement, the article /does/ go over H2 (although H3 is all the rage right now).

For TLS, I recommend The Illustrated TLS 1.3 Connection (Every byte explained and reproduced): https://tls13.xargs.org/

antonvs|3 years ago

> Where every line ends with \r\n, also known as CRLF, for Carriage Return + Line Feed, that's right, HTTP is based on teletypes, which are just remote typewriters

Does it need to be pointed out that this is complete bullshit?

a1369209993|3 years ago

Well, I've definitely seen a lot of people claim (generally not word-for-word) that using a pointlessly-overlong encoding of newline that exists to cater to the design deficiencies of hardware from the nineteen-sixties is not bullshit, so... maybe? But only for rather mushy values of "need".

kortex|3 years ago

It's not totally right, but it's not totally wrong, either, kind of like the way the dimensions of the space shuttle booster are directly affected by the size of a pair of Roman war horses' asses.

CRLF was used very heavily and thus got baked into a lot of different places. Namely, it conveniently sidesteps the ambiguity of "some systems use CR, others use LF" by just putting both in, and since they are whitespace, there's not much downside other than the extra byte.

Beyond that, there are many other clear and obvious connections between Hypertext Transfer Protocol and teletype machines. Many early web browsers were expected to be teletype machines [0]. So while it might be a bit of a stretch, I'd say this is far from "complete bullshit".

[0] - http://info.cern.ch/hypertext/WWW/Proposal.html#:~:text=it%2...

tripa|3 years ago

Kind of.

Which part of it do you think is wrong?

sireat|3 years ago

Is HTTP always the same protocol as HTTPS - given the same version - and ignoring the encryption from TLS?

Theoretically yes, but in practice?

I've done my share of nc testing even simpler protocols than HTTP/1.1

For some reason the migration to HTTPS scared me despite the security assurances. I could not see anything useful in wireshark anymore. I now had to trust one more layer of abstraction.

st_goliath|3 years ago

> Is HTTP always the same protocol as HTTPS - given the same version - and ignoring the encryption from TLS?

> Theoretically yes, but in practice?

Yes, that's the whole point of encapsulation. The protocol is blissfully unaware of encryption and doesn't even have to be. It has no STARTTLS mechanism either.

Your HTTPS traffic consists of a TCP handshake that establishes a TCP connection, a TLS handshake across that TCP connection to exchange keys and establish a TLS session, and the exact same HTTP request/response traffic, inside the encrypted/authenticated TLS session.

The wonderful magic of solving a problem by layering/encapsulating.
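
A minimal Python sketch of that layering, using only the standard library socket and ssl modules. The helper names (build_plain_request, open_transport, fetch) are made up for illustration, and the sketch assumes the server closes the connection after responding, which is why it sends Connection: close. The point is that the request bytes are identical in both cases; only the transport they are written to changes.

```python
import socket
import ssl


def build_plain_request(host: str, path: str = "/") -> bytes:
    """The HTTP layer: these bytes do not know whether TLS is underneath."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def open_transport(host: str, port: int, use_tls: bool) -> socket.socket:
    """TCP handshake first, then optionally a TLS handshake on top of it."""
    sock = socket.create_connection((host, port), timeout=10)
    if use_tls:
        context = ssl.create_default_context()
        # server_hostname is what ends up in the SNI extension.
        sock = context.wrap_socket(sock, server_hostname=host)
    return sock


def fetch(host: str, use_tls: bool) -> bytes:
    """Send the same request over plain TCP or over TLS and read to EOF."""
    port = 443 if use_tls else 80
    with open_transport(host, port, use_tls) as sock:
        sock.sendall(build_plain_request(host))
        chunks = []
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks)
```

Something like fetch("example.com", use_tls=True) and fetch("example.com", use_tls=False) would exercise both paths, network and server permitting.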

> I could not see anything useful in wireshark anymore

Wireshark supports importing private keys for that, see: https://wiki.wireshark.org/TLS

dochtman|3 years ago

For 1.1 and 2, the byte stream is the same for TCP vs TLS over TCP. For 3, it uses one stream per request over a QUIC connection which is always encrypted.

Too|3 years ago

The protocol is the same, but semantics in the applications can differ. Secure cookies only working on https to give one example.
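
A toy Python sketch of that cookie example, assuming a drastically simplified cookie model (just a name, a value, and a Secure flag; real cookie handling also involves domain, path, expiry, and more). The Cookie class and cookie_header function are hypothetical names. The wire format of the Cookie header is the same either way; what differs is which cookies the client is willing to put in it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False  # set from the Secure attribute of Set-Cookie


def cookie_header(cookies: List[Cookie], scheme: str) -> str:
    """Return the Cookie header value a client would send for this scheme."""
    # Secure cookies are only attached to https requests.
    allowed = [c for c in cookies if scheme == "https" or not c.secure]
    return "; ".join(f"{c.name}={c.value}" for c in allowed)
```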

mannyv|3 years ago

As far as I can tell, the Host header is pointless, because if it's SSL/TLS you won't be able to read it and route on it. That's what SNI is for. If you aren't using TLS then you don't need it, unless you hit the server by IP. But then why would you do that?

LukeShu|3 years ago

It's for one server/IP serving multiple hostnames. For instance, the same physical server at 45.76.26.79 serves both www.lukeshu.com and git.lukeshu.com with the same instance of Nginx. Once Nginx decrypts the request, it needs to know which `server { … }` block to use to generate the reply.

With TLS+SNI, this is redundant to the name from SNI. But we had TLS long before we had SNI, and we had HTTP long before we had TLS, and both of those scenarios need the `Host` header.
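
A minimal Python sketch of that selection step, assuming the request has already been decrypted and that the site labels are placeholders. The names SITES, host_from_request, and select_site are hypothetical, and this is not how Nginx is implemented. It ignores IPv6 literals in the Host header for brevity.

```python
from typing import Optional

# Hypothetical mapping from hostname to whatever serves it.
SITES = {
    "www.lukeshu.com": "website",
    "git.lukeshu.com": "git frontend",
}


def host_from_request(raw: bytes) -> Optional[str]:
    """Extract the hostname from the Host header of a raw request head."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii").lower()
            # Drop an optional ":port" suffix (IPv6 literals not handled).
            return host.rsplit(":", 1)[0] if ":" in host else host
    return None


def select_site(raw: bytes, default: str = "default site") -> str:
    """Pick which configured site should answer this request."""
    host = host_from_request(raw)
    return SITES.get(host, default) if host is not None else default
```

Both hostnames resolve to the same IP, so without the header (or SNI) the server would have nothing to choose on.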

Too|3 years ago

Proxies doing TLS termination, with multiple servers behind.

mahdi7d1|3 years ago

I didn't ask but I needed it.

mannyv|3 years ago

Also, never trust the content length. It's been that way since before http was finalized. Use it as guidance, but don't treat it as canonical.
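
One way to act on that advice, sketched in Python: treat the declared length as a hint, cap it, and notice when the peer delivers less than it promised. The function name read_body and the one-megabyte default cap are arbitrary choices for illustration, not a recommendation from any spec.

```python
import io


def read_body(stream: io.BufferedIOBase, declared_length: int,
              max_length: int = 1_000_000) -> bytes:
    """Read a body of declared_length bytes without trusting it blindly."""
    # Refuse absurd or negative values before allocating anything.
    if declared_length < 0 or declared_length > max_length:
        raise ValueError(f"refusing Content-Length of {declared_length}")
    body = bytearray()
    while len(body) < declared_length:
        chunk = stream.read(declared_length - len(body))
        if not chunk:
            # The peer closed early: the header lied or the transfer broke.
            raise EOFError(
                f"body ended after {len(body)} of {declared_length} bytes"
            )
        body += chunk
    return bytes(body)
```

Note that this never reads past the declared length, so any extra bytes stay in the stream for the caller to inspect or reject.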

mannyv|3 years ago

When doing http by hand, it's better to do http/1.0 because that tells the server you (and it) can't do anything exciting.

mustak_im|3 years ago

Yay! this is going to be a great read for the weekend!

danesparza|3 years ago

More articles should be written in the style of this article. Thank you for this.

stefs|3 years ago

most of his articles are written in this style. they're great!

tinglymintyfrsh|3 years ago

    GET / HTTP/1.0\r\n\r\n 
Still works with many websites.

mlindner|3 years ago

Is there a way to get this guide without the annoying side-commentary?

tomcam|3 years ago

Funny and very helpful. Thank you.

cph123|3 years ago

For a crash course would the code examples have been better in something like Python rather than Rust?

fasterthanlime|3 years ago

My whole thing is that I'm teaching Rust /while/ solving interesting, real-world problems (instead of looking at artificial code samples), so, if someone wants to write the equivalent article with Python, they should! I won't.

rk06|3 years ago

Nope, that’s the author’s favourite language. A regular reader would expect Rust to be used, like in previous articles.