Why is it not called HTTP/1.2? Or, how will clients (or servers) tell the difference between a peer implementing RFC 2616 (HTTP/1.1 old version) and RFC 723x (HTTP/1.1 new version)?
Could it be that there is so much software hardcoded to look for "HTTP/1.1" that a "HTTP/1.2" string would break them all?
Because it does not really change the protocol. It clarifies details and fixes the spec (e.g. it brings the spec in line with actual real-world use).
The argument is that the two should be able to transparently interop together, and specifically that RFC 723[0-5] simply codifies the way HTTP/1.1 already works in the real world.
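The "transparent interop" point is visible on the wire: RFC 7230 keeps the exact HTTP-version token from RFC 2616, so a parser cannot tell the two apart. A minimal sketch (the regex only approximates the RFC 7230 ABNF):

```python
import re

# request-line = method SP request-target SP HTTP-version CRLF
# HTTP-version = "HTTP/" DIGIT "." DIGIT   (identical in RFC 2616 and RFC 7230)
REQUEST_LINE = re.compile(r"^([!#$%&'*+.^_`|~0-9A-Za-z-]+) (\S+) HTTP/(\d)\.(\d)$")

def parse_request_line(line):
    m = REQUEST_LINE.match(line)
    if not m:
        raise ValueError("malformed request-line")
    method, target, major, minor = m.groups()
    return method, target, (int(major), int(minor))

print(parse_request_line("GET /index.html HTTP/1.1"))
# ('GET', '/index.html', (1, 1))
# A peer implementing RFC 723x still sends "HTTP/1.1", so this parser
# cannot (and need not) distinguish old-spec from new-spec peers.
```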
It's now suggested to use the about:blank URI in the Referer header when no referrer exists, to distinguish between "there was no referrer" and "I don't want to send a referrer".
For the sake of privacy, would it not be better if there were no such distinction? Basically, now any privacy-conscious client needs to send 'about:blank' as the referrer when users do not want their behaviour categorised and fingerprinted.
If a user doesn't want to send the referrer when there is no referrer, no referrer should be sent. This then allows sites to distinguish between direct traffic from users that don't block referrers and traffic with blocked referrers. I wouldn't expect this to be a significant concern, because the volume of actual direct traffic is not very large.
"If the target URI was obtained from a source that does not have its own URI (e.g., input from the user keyboard, or an entry within the user's bookmarks/favorites), the user agent MUST either exclude the Referer field or send it with a value of 'about:blank'."
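As a sketch of what the quoted rule means for a user agent (the `referer_header` helper and the `SEND_BLANK` toggle are made up for illustration; the spec allows either behaviour):

```python
# `referring_uri` is None when the target came from a source with no URI
# of its own (keyboard input, a bookmark). SEND_BLANK is a hypothetical
# user-agent setting choosing between the two permitted behaviours.

SEND_BLANK = True

def referer_header(referring_uri):
    """Return (name, value) for the Referer field, or None to omit it."""
    if referring_uri is None:
        # MUST either exclude the field or send "about:blank"
        return ("Referer", "about:blank") if SEND_BLANK else None
    return ("Referer", referring_uri)

print(referer_header(None))                    # ('Referer', 'about:blank')
print(referer_header("https://example.com/"))  # ('Referer', 'https://example.com/')
```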
These specs clarify 2616. 'Major' may have been a poor choice of words, but if you were 2616 compliant, you should be largely compliant with these specs as well.
I don't think splitting it up like that is such a good idea; now, instead of searching through one file, I have to remember that there are several and look through them all, just for one conceptual protocol. (TCP has a similar issue, although most of it is still in 793.)
As for the extra verbosity, I'm not sure what to think; while some things may be specified more precisely, standards should also attempt to be concise and to-the-point. Some of the sentences in the new RFCs seem almost parenthetical (e.g. look at the description of GET.)
OTOH, that means I don't have to dive through the minutiae of the response message format when I'm just looking for the basic header stuff. All the important concerns (core, caching, conditional requests, auth and forwarding) get their own RFC and are thus easier to skim and search through, although 308 and Range (and Prefer) also getting their own RFCs is a bit weird. Likewise, syntax and routing get RFC 7230, so if you're implementing a client or server the reading experience should be much tighter.
The clarifications are very welcome but I wish it included embedded unit-less progress information on chunk encoding without having to rely on a side channel [0] (shameless plug, but any progress — ha ha — on this front would be fine)
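For what it's worth, RFC 7230's chunk-extension syntax is the obvious in-band slot for this kind of information. A sketch follows; the `progress` extension name is invented here for illustration, not taken from the linked proposal:

```python
def chunk_with_progress(data, sent, total):
    """Encode one chunk, carrying progress as a chunk extension.

    Chunk extensions (";name=value" after the chunk size, RFC 7230
    section 4.1) are standard syntax; the "progress" name is made up.
    """
    ext = ";progress=%d/%d" % (sent, total)
    return b"%x%s\r\n%s\r\n" % (len(data), ext.encode("ascii"), data)

print(chunk_with_progress(b"hello", 5, 100))
# b'5;progress=5/100\r\nhello\r\n'
```

The catch, and presumably why a side channel or a standard is needed, is that RFC 7230 tells recipients to ignore unrecognised chunk extensions, so intermediaries may drop them.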
The thing I'm most surprised by is the change in the default cacheability of 404 responses from uncacheable to cacheable.
Though I guess defaulting to cacheable doesn't mean that responses must be cached, so you can still be compliant with RFC 7231 by never caching 404s.
"A response received with [a status code other than 200, 203, 206, 300, 301 or 410] MUST NOT be returned in a reply to a subsequent request unless there are cache-control directives or another header(s) that explicitly allow it."
As someone new to HTTP, what would be the most pragmatic way to read through these RFCs in the context of building web applications or HTTP APIs, without going to the level of wanting to implement an HTTP server or client? For example: order of reading, what can be skipped, what is not widely used or implemented, the basics of the protocol, etc.
You can certainly skip most of "Message Syntax and Routing". That's the stuff that concerns server and client implementers who just have TCP sockets to work with.
I would absolutely read "Semantics and Content". It's a really good idea to be aware of "Conditional Requests", and you only really have to read "Caching", "Range requests" and "Authentication" if you need to know about those features.
A crawler like that should typically only do GET requests. The 308 is really mainly useful when an HTTP client does, for example, a PUT or POST request on some URL, and the server wants the client to repeat that exact request against, say, a different server.
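A sketch of the method-rewriting rules that make 308 worth having (`redirected_request` is a hypothetical helper, not a real API):

```python
def redirected_request(method, status):
    """Which method should a client use when retrying after a redirect?

    307/308 preserve the method and body; 303 always switches to GET;
    for 301/302, clients have historically rewritten POST to GET,
    which is exactly the ambiguity 308 was added to avoid.
    """
    if status in (307, 308):
        return method  # repeat the exact request, body included
    if status == 303:
        return "GET"
    if status in (301, 302):
        return "GET" if method == "POST" else method  # common legacy behaviour
    raise ValueError("not a redirect that implies a retry")

print(redirected_request("PUT", 308))   # PUT  - the exact request is repeated
print(redirected_request("POST", 301))  # GET  - why 308 exists
```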
Search engines largely handle all redirects the same, because they know nobody uses them correctly. If you aren't seeing the behavior you want, you can use Webmaster Tools or metadata to fix it.
Lukasa | 11 years ago
We'll see how well that goes.
TheLoneWolfling | 11 years ago
Yet another way to fingerprint.
gamegoblin | 11 years ago
I'm glad this new spec apparently resolves a lot of ambiguities. I hated reading 2616 and some of the specs it depended on (email, URI, etc.).
purephase | 11 years ago
It will likely be a while before widespread adoption, but seeing a standard move forward in such a seemingly small but considerable way is great.
Kudos to those involved. I can't imagine it was an easy feat.
lloeki | 11 years ago
[0]: https://github.com/lloeki/http-chunked-progress/blob/master/...
mjs | 11 years ago
RFC 2616:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13...
"A response received with [a status code other than 200, 203, 206, 300, 301 or 410] MUST NOT be returned in a reply to a subsequent request unless there are cache-control directives or another header(s) that explicitly allow it."
RFC 7231:
http://tools.ietf.org/html/rfc7231#section-6.5.4
"A 404 response is cacheable by default"