TCP Sucks

137 points | reacocard | 14 years ago | bramcohen.com

56 comments

jandrewrogers|14 years ago

Designing networking protocols that are robust in a mathematical sense is unbelievably difficult. In fact, we humans have only found optimal solutions in a few cases, if you dig through the mathematics literature. Many real-world networking protocol design scenarios do not have a known non-pathological implementation. Furthermore, there are a large number of decentralized protocol designs that we can prove to have many poor qualities. To bludgeon the equine, people who can significantly advance our understanding of such things tend to win Nobel prizes and similar. It is that difficult.

That said, TCP is not the best we can design given everything we know about designing network protocols. It was good enough for the people that designed it at the time, and possibly (my chronology is fuzzy) was approximately as good as the mathematics would have reasonably allowed when it was developed. We can make it work well enough in many cases -- the economics of inertia. Other narrow use cases are better solved differently but are not general solutions.

It is one of those problems that sounds like it should be easy to solve on the surface but turns into a bloody epic challenge once you start to dig into it. I am not offering a solution, just noting that very few people can.

paulsutter|14 years ago

Hold on a minute

> possibly (my chronology is fuzzy) was approximately as good as the mathematics would have reasonably allowed

Just because Van Jacobson's papers spew forth great volumes of mathematics doesn't mean there is any robust mathematics behind TCP. Read to the end of his paper "Congestion Avoidance and Control". Read past all of the impressive plots, graphs, and equations. Read to the conclusions.

"The 1-packet increase has less justification than the 0.5 decrease. In fact, it’s almost certainly too large."

This statement shows how little formal consideration went into the entire algorithm. The 1-packet increase is not simply too big or too small, it just doesn't make any sense. For starters, how big is the packet? Oh, it isn't defined anywhere. Even if we just go with the de facto internet packet size of 1500 bytes (you know, the old limit for 10Mbit Ethernet)...

Could that one-packet increase per round trip make equal sense for a 10Gbit path to India with a 400ms round trip time as it does for a 56Kb link between Berkeley and MIT (his test case)? Of course it doesn't make sense. And it gives the lie to any notion that there is a formal underpinning for TCP. They tweaked it until it worked, and then put on a nice mathematical show to feel better about it.

Quoth Van Jacobson: "We have simulated the above algorithm and it appears to perform well". Oh, now I feel better.

Second point, which Bram is covering: TCP makes the assumption that router queue lengths are reasonable. TCP says, fill up the router queues until they drop packets. But router queues have been getting longer and longer as memory gets cheaper. These queues can create additional seconds of delay to layer on top of the 10ms-400ms speed of light delays we see on the internet itself.

EDIT: In that 10Gb to India example, it takes TCP literally DAYS to fill up the pipe because of that "1 packet per roundtrip" window increase. Days, by the way, of no incidental packet loss, because it all gets reset on a loss.

EDIT: I spent 5 years of my life working on the fact that latency was never really factored into the design of most network protocols.
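A rough back-of-the-envelope check of the "days" figure for the 10Gbit, 400ms example can be sketched as follows. This assumes 1500-byte packets, a window that starts at one packet, strict growth of one packet per round trip, and no loss at all; the function name and parameters are illustrative only and do not come from any TCP implementation.

```python
def additive_increase_ramp(bandwidth_bps, rtt_s, packet_bytes=1500, start_window=1):
    """Return (window needed in packets, seconds to reach it) when the
    congestion window grows by one packet per round trip with no loss."""
    # Bandwidth-delay product: bytes that must be in flight to keep the pipe full.
    bdp_bytes = bandwidth_bps / 8 * rtt_s
    window_needed = bdp_bytes / packet_bytes
    # One extra packet per round trip means one round trip per packet of growth.
    round_trips = max(0.0, window_needed - start_window)
    return window_needed, round_trips * rtt_s


# The 10Gbit path with a 400ms round trip time from the comment above.
window, seconds = additive_increase_ramp(10e9, 0.4)
print(f"window needed: {window:,.0f} packets")
print(f"ramp time: {seconds / 86400:.2f} days")
```

Under these assumptions the window has to reach roughly a third of a million packets, and the ramp takes about a day and a half. Recovering from a single loss that halves a full window would take about half that long.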

bramcohen|14 years ago

Did you even read the article?

its_so_on|14 years ago

For people who don't know what the parent is talking about:

Take a simple example.

http://en.wikipedia.org/wiki/Two_Generals_Problem

My summary:

The two generals problem proves that, if there exists any nonzero probability of packet loss, two people cannot coordinate with certainty to both be in state 1 at sunrise tomorrow (attack!!!) or else both be in state 0. The requirement is a 100% mathematical guarantee that they both believe the attack has been coordinated, since an uncoordinated attack will be a catastrophic loss for them.

(In other words, the guarantee must make the outcomes 0-1 and 1-0 at sunrise tomorrow a mathematical impossibility: one general attacking because he thinks with certitude that the attack has been coordinated, while the other does not attack because he thinks it has not been confirmed with certitude.)

Take a simple approach. The following packets are all encrypted, but any or all may be lost.

1) first general sends: "Let's both be in state 1 tomorrow (coordinated attack). Since an uncoordinated attack is so catastrophic to us, I will only enter state 1 if I receive your reply. Please include the random number 25984357892 in your reply. As soon as I get this the attack is ON. If I don't get such a packet within the hour I will assume this post was intercepted (lost), and I will send another. I will remain in state 0 until I receive that packet."

2) second general sends: "Got your packet with 25984357892. This is my acknowledgment! I will attack as well. In case you don't get this, I know you won't attack thinking I didn't get your message, so I am sending this message continuously."

Great. But what if all messages from the second to the first are intercepted? Now the first thinks all of HIS were intercepted (has received no acks) and doesn't attack, but the second one does. Failure.

So, we have to emend 2) to:

2) second general sends: "Got your packet with 25984357892. This is my acknowledgment! I will attack as well. In case you don't get this, I know you won't attack thinking I didn't get your message, so I am sending this message continuously. In case you don't get any of THESE messages, however, I will not attack. Therefore acknowledge ANY of them with random number 458972984323..."

Oops. What if all the first general's acks of the acks are intercepted or lost? (Perhaps the first general is able to send messages until receiving (2), but just as the first general gets (2), conditions change and the general no longer has any of his messages delivered.)

Now the first general thinks he has acknowledged the ack, but the second general doesn't even know if his ack-(cum-request-for-an-ack-back) message was even delivered...

and so it goes...

Of course, in practice you can simply say: "Let's do this for a certain number of acks of acks of acks, 3 let's say, and then just keep sending the same ack to each other, assuming that if the connection was reliable enough to get to three deep, then it will be reliable enough for one of the final acks to make it through." That's a false assumption (mathematically - what guarantee do you have that if 3 of your encrypted messages made it across, at least one of the next 217 that you send by sunrise all with the same message will), but a reasonable one.

So it is not a practical problem. This is a mathematical problem. Although you cannot even do something as simple as "let's agree to both be in state 1 (or neither if we fail to agree), OK?" over a less than guaranteed reliable connection, if the connection has any reliability at all you can get to within a practical level.

Once you realize that, PROVABLY, you can't even do the most mundane things no matter what, the mathematics the parent is talking about do not seem all that interesting anymore. :)
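The "fixed number of acks of acks" compromise described above can be sketched as a small exact calculation. This is a simplified model, not a proof: it assumes a chain of n alternating messages, that each message is independently lost with probability p_loss (meaning every retry of that message failed), that the sender of the final message attacks once it has received everything it was waiting for, and that the receiver of the final message attacks only if that message arrives. All names are hypothetical.

```python
from itertools import product


def decide(delivered):
    """delivered[i] says whether message i would get through if sent.
    Messages alternate between the generals; each is only sent once the
    previous one has arrived. Returns (sender_of_last, receiver_of_last)
    attack decisions."""
    n = len(delivered)
    chain = 0  # length of the unbroken run of delivered messages
    for ok in delivered:
        if not ok:
            break
        chain += 1
    sender_of_last_attacks = chain >= n - 1   # got all it was waiting for
    receiver_of_last_attacks = chain == n     # needs the final message too
    return sender_of_last_attacks, receiver_of_last_attacks


def disagreement_probability(n, p_loss):
    """Exact probability that exactly one general attacks."""
    total = 0.0
    for delivered in product([True, False], repeat=n):
        prob = 1.0
        for ok in delivered:
            prob *= (1.0 - p_loss) if ok else p_loss
        a, b = decide(delivered)
        if a != b:
            total += prob
    return total


for depth in range(1, 7):
    print(depth, disagreement_probability(depth, 0.1))
```

In this model the disagreement probability works out to (1 - p_loss)^(n-1) * p_loss. It shrinks as the chain gets deeper but never reaches zero for any finite depth, which matches the point that the problem is mathematical rather than practical.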

viraptor|14 years ago

Ok, maybe I'm missing something, but reading the article I see some weird ideas:

RED is hard to deploy, so let's change the base protocol instead - how does that make sense? Everyone would have to start using new libraries, and for backward compatibility we'd have to preserve the TCP layer too. That means standards like HTTP would have to get extensions to use SRV records or suffer delays while uTP availability is probed.

There's also a complaint that RED will drop packets once the queue is full. I don't get that at all - it will always happen...

In addition, I get the impression there is some tension/implied superiority between us (people doing uTP) and them (ones doing RED). Why does it look so ugly? There's a known problem, there's an interesting solution for new software (uTP) and some plan to migrate old protocols transparently (RED). When did that turn into some bizarre conflict and why?
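For readers unfamiliar with RED (Random Early Detection), the core of the classic scheme can be sketched as below. This is a simplified version that leaves out refinements such as spacing drops by the count of packets since the last drop, and the threshold and weight values are placeholders, not recommendations.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length, so
    short bursts do not trigger drops but a standing queue does."""
    return (1.0 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability as a function of the *average* queue length."""
    if avg_queue < min_th:
        return 0.0          # short queue: never drop early
    if avg_queue >= max_th:
        return 1.0          # long standing queue: drop everything arriving
    # In between, the probability rises linearly from 0 up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Placeholder thresholds, in packets.
for avg in (10, 30, 50, 70):
    print(avg, red_drop_probability(avg, min_th=20, max_th=60, max_p=0.1))
```

The difference from a plain tail-drop queue is that drops begin probabilistically before the buffer is full, which is meant to signal senders to slow down while the queue is still short.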

bramcohen|14 years ago

BitTorrent is using uTP just fine, which is only, you know, most of the upload from consumer internet connections, and we're working on getting the same things crammed into TCP with LEDBAT, but that's a slow process.

I wasn't complaining about RED dropping packets, just describing how it works.

As for the tension, my point is that my solution works and the other one doesn't. If you want to know why the person I quoted was being such a dismissive jerk, you'll have to ask him.

moultano|14 years ago

> When did that turn into some bizarre conflict and why?

Man, you weren't kidding. I tried looking up uTP on Wikipedia hoping to come away with some technical understanding. It's full of passive-aggressive statements that cite forum posts as their support, with no information on how it actually works. Maybe some of the folks in this thread could go fix that.
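As a rough idea of how delay-based congestion control of this kind works, here is a sketch of the per-ACK window update described in RFC 6817 (LEDBAT). It is simplified: the delay filtering, the base-delay history, and the cap tied to bytes in flight are omitted, and the MSS value is an assumption. Nothing in this thread confirms that uTP's implementation matches it in detail.

```python
TARGET = 0.100   # seconds of queuing delay to aim for (RFC 6817 upper bound)
GAIN = 1.0       # how quickly the window reacts
MSS = 1452       # assumed segment size in bytes
MIN_CWND = 2     # minimum window, in segments


def ledbat_on_ack(cwnd, bytes_acked, current_delay, base_delay):
    """Return the new congestion window (bytes) after one ACK.

    current_delay is the latest one-way delay measurement and base_delay
    is the smallest one seen recently, taken as the delay with empty queues.
    """
    # Delay above the baseline is attributed to queues along the path.
    queuing_delay = current_delay - base_delay
    # Positive below the target, negative above it.
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    return max(cwnd, MIN_CWND * MSS)


cwnd = 10 * MSS
print(ledbat_on_ack(cwnd, MSS, current_delay=0.050, base_delay=0.050))  # grows
print(ledbat_on_ack(cwnd, MSS, current_delay=0.250, base_delay=0.050))  # shrinks
```

The sender backs off as soon as it sees queuing delay build past the target, rather than waiting for the router to drop a packet, which is why such traffic can yield to latency-sensitive flows sharing the same queue.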

caf|14 years ago

  The solution is for the end user to intervene, and tell all
  their applications to not be such pigs, and use uTP instead
  of TCP. Then they’ll have the same transfer rates they
  started out with, plus have low latency when browsing the
  web and teleconferencing, and not screw up their ISP when
  they’re doing bulk data transfers.

That still doesn't address the problem when you have many users behind the same queue, some of whom care only about throughput and not latency. You need a scheme which will work when all of those users are acting selfishly.

scott_s|14 years ago

My thought was more fundamental than that: any solution which involves asking users to request different transport protocols is not going to solve the problem. There are far more users who have no idea what a "transport protocol" is than those who do.

With that said, I enjoyed the post. It's an interesting problem, and I do find the base idea attractive: allowing applications to opt to be background traffic.

bramcohen|14 years ago

They can have their throughput without running a denial of service on their net connection. If you assume that they'll DOS themselves for the fun of it anyway, then I can't help you.

volatile|14 years ago

The author seems to claim that it is implausible for a router vendor to sell a router that drops more packets.

  The marketing plan is that the because router
  vendors are unwilling to say ‘has less memory!’ as a
  marketing tactic, maybe they’d be willing to say
  ‘drops more packets!’ instead. That seems implausible.

Yet he concludes by suggesting the router should drop all the packets.

  The best way to solve that is for a router to notice
  when the queue has too much data in it for too long,
  and respond by summarily dropping all data in the
  queue. /snip/ Of course, I’ve never seen that proposed
  anywhere…

Based on his earlier reasoning, that would also be implausible.

bramcohen|14 years ago

That's what you would do IF you were going to be serious about making the router drop packets in a way which actually helps. I don't expect it to happen any time soon.
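The "too much data for too long, then drop everything" idea quoted above could look something like the following sketch. The thresholds, the class name, and the choice to pass the clock in explicitly are all assumptions made for illustration; the article as quoted in this thread gives no concrete numbers.

```python
from collections import deque


class FlushingQueue:
    """Toy router queue that empties itself if it stays too full too long."""

    def __init__(self, max_bytes_standing, max_seconds_standing):
        self.max_bytes_standing = max_bytes_standing
        self.max_seconds_standing = max_seconds_standing
        self.packets = deque()
        self.queued_bytes = 0
        self.over_since = None  # time the queue first exceeded the byte limit

    def enqueue(self, packet, now):
        self.packets.append(packet)
        self.queued_bytes += len(packet)
        self._check(now)

    def dequeue(self, now):
        if not self.packets:
            return None
        packet = self.packets.popleft()
        self.queued_bytes -= len(packet)
        self._check(now)
        return packet

    def _check(self, now):
        if self.queued_bytes <= self.max_bytes_standing:
            self.over_since = None          # queue drained enough: reset timer
        elif self.over_since is None:
            self.over_since = now           # just went over the limit
        elif now - self.over_since >= self.max_seconds_standing:
            self.packets.clear()            # over the limit for too long:
            self.queued_bytes = 0           # summarily drop everything
            self.over_since = None


q = FlushingQueue(max_bytes_standing=1000, max_seconds_standing=0.5)
for t in (0.0, 0.1, 0.3, 0.7):
    q.enqueue(b"x" * 600, now=t)
    print(t, q.queued_bytes)
```

A brief burst that drains quickly never triggers the flush; only a queue that stays over the limit for the full interval is emptied.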

nuje|14 years ago

It seems to me that on the IP level the net has been in a technological paralysis for some time.

We can't get RED or IPv6 deployed, and the IETF doesn't seem to make anything useful happen these days.

edit: anyone else remember when layer 3 had a bright future ahead of it, IPv6 and end-to-end IPSec (with keys in the DNS) were just around the corner...

bramcohen|14 years ago

uTP carries the bulk of all BitTorrent transfers at this point. This would seem to imply a certain level of success.

endymi0n|14 years ago

Pitting uTP against UTP, which is already an alternative means of network connection, is rather unfortunate naming. I imagine a lot of people getting confused to the max, especially as it's pronounced the same. So pretty pretty please: Give the protocol a GOOD name first, then we're talking business! ;-)

josefonseca|14 years ago

I think the catchy title was meant to grab attention to an important present day issue.

But TCP actually does not suck, it's been there for longer than I have and served us pretty darned well up until now.

Never forget that when the TCP protocol was designed, the biggest concern we had was that a nuke would land on top of our heads at any minute and the network should keep working. Also, the "Internet" was thought to be a small niche network of networks among the military and academics.

I guess this is all well known, it's just my reaction to the editorialized title.

msbhvn|14 years ago

Just to be clear, "TCP Sucks", despite successfully running the majority of the global Internet traffic. "TCP Sucks" so bad we're basically going to copy a lot of it: window based congestion control, SACK, timestamps, ability to add new options, etc. "TCP Sucks" because it is not perfect and has an issue, an issue that requires router / switch upgrades. We're going to fix that by breaking backward compatibility with _tons_ of applications and requiring an OS update on _every_ client and/or application. All this assuming our relatively new and unproven thing is as good as TCP in all other ways and fixes this issue of TCP perfectly.

Hmmm. Methinks that TCP does not suck so much.

romaniv|14 years ago

Sounds like "it kind of worked so far, so let's use it forever, with multiple layers of band-aids if necessary". Besides, unless I'm missing something, the two protocols can be used side-by-side, i.e. you can slowly phase one out in favor of the other where necessary.

bramcohen|14 years ago

Did you read the article or just the headline?

hristov|14 years ago

Shameless plug:

Extremetcp.com is the solution to the congestion problems of TCP. The best part of ExtremeTCP is that it is not a new protocol. It is TCP. It just uses clever algorithms at the sender side to send data while avoiding congestion. (Since TCP does not actually specify which algorithms one should use as long as one avoids congestion, ExtremeTCP is a perfectly legal version of TCP).

Yes, I am involved with this. If interested in testing, please send an email to the contact address on the website.

geoffschmidt|14 years ago

It's hard to evaluate your claim that ExtremeTCP is "the" solution, given that your website (1) does not compare your solution to the significant amount of literature and prior art in this space (e.g., TCP Vegas, which is already implemented in the Linux kernel), and (2) doesn't make any claims about friendliness to TCP Reno, which is the hardest part of retrofitting a new congestion control strategy onto the public Internet.

sams99|14 years ago

There is also http://www.fastsoft.com/home/ which professes to do the same. Personally, I worry about any undocumented, non-public congestion control protocols; many years have been spent in academia researching this subject ... It is easy to be "fastest": just disable congestion control altogether. Trouble is, tons will break. In order for me to use a different congestion control algorithm in production, I would need some experts to review the protocol to ensure I am being a good web citizen and not breaking the internet.

paulsutter|14 years ago

There are many implementations of improved congestion control for TCP, several of which are implemented in the Linux kernel.

It's trivially easy to modify congestion control to get arbitrarily fast performance in high bandwidth-delay environments. I can tell you from experience that implementing fast performance in extremely lossy environments is harder.

And hardest of all is to come up with a solution that works on a network that is shared with the common congestion control implementations, and that works with billions of end nodes.

I spent years working on this. Feel free to reach out to me on my email address, I'd love to share with you my experience of commercializing such a product.

dfc|14 years ago

Is the shout-out the mere mentioning of BitTorrent? Or does Nick Weaver allude to Bram some other way in the full article?

bramcohen|14 years ago

Who do you think made the development he's referring to happen?

thespin|14 years ago

I'm surprised he doesn't mention CurveCP. He's taken ideas from that author before (e.g. netstrings, which you'll find in the .torrent file format).

TCP does suck, if you try to use it for lots of short-lived connections. And that pretty much sums up how it's being used nowadays, most of the time.

For single, long-term connections, TCP is fine.

bramcohen|14 years ago

uTP isn't based on CurveCP, and CurveCP is nowhere near as mature as uTP is.

cbsmith|14 years ago

There's also UDT...

dfc|14 years ago

Is "from that author" some sort of passive-aggressive jab at djb?

shaggyfrog|14 years ago

> Of course, I’ve always used TCP using exactly the API it provides, and even before I understood how TCP worked under the hood gone through great pains to use the minimum number of TCP connections used to the number which will reliably saturate the net connection and provide good piece diffusion.

BitTorrent must not have any books on copyediting.