> Note the download isn’t going to hover exactly at 1.0K/sec — the actual download speed as reported by wget is an average over time. In short, you’ll see numbers closer to an even 1.0K/sec the longer the transfer. In this example, I didn’t wait to download an entire 4.2GB file, so the 10.5K/s you see above is just wget averaging the transfer speed over the short time I left wget running.
Wrong: you see 10KB/s download speed because you are not throttling the incoming packets but the outgoing packets!
So, what you are really doing is rate limiting outgoing ACK packets to 1KB/s, which delays your outgoing ACKs, hence preventing the remote server from sending you data at full throttle.
You can observe this with a tool like "bwm-ng", which shows Rx/Tx speeds: you'll see exactly 1KB/s on Tx, and something variable on Rx.
The incoming rate will fluctuate depending on the TCP window setting used by the server, which translates to how many unacknowledged packets the server is allowed to have in flight.
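Back-of-the-envelope, that window/RTT relationship bounds throughput. A quick sketch with illustrative numbers (64 KiB window, 500 ms RTT; both are assumptions, not values from the thread):

```shell
# Max TCP throughput is bounded by window size / round-trip time.
WINDOW=65536   # 64 KiB of unacknowledged data allowed in flight, in bytes
RTT_MS=500     # round-trip time inflated by the delayed ACKs, in ms

# bytes per second = window / (RTT in seconds)
BPS=$(( WINDOW * 1000 / RTT_MS ))
echo "$(( BPS / 1024 )) KiB/s"   # prints: 128 KiB/s
```

So even with outgoing ACKs squeezed to 1KB/s, the download rate settles wherever the server's window divided by the inflated RTT lands, which is why Rx looks so variable.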
> Wrong: you see 10KB/s download speed because you are not throttling the incoming packets but the outgoing packets!
Yep. tc's default is to police outgoing traffic, which in OP's example is essentially a bunch of TCP ACKs. Instead, they should be using the ingress keyword, something like what's described here: http://blog.stevedoria.net/20050906/ingress-policing-with-li...
Caveat emptor: ingress rate-limiting is hard. Long story short, it all boils down to what you do with non-conforming packets. There are two alternatives, and both are rather sub-optimal: you can either buffer/delay packets in kernel space (the default, which leads to bufferbloat and memory waste), or drop them (which the author linked above opted for, and which leads to excessive retransmits and bandwidth waste).
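For the record, a minimal sketch of the drop variant (interface name and rates are assumptions, not from the article):

```shell
# Attach an ingress qdisc, then police all IPv4 traffic to 1KB/s;
# non-conforming packets are dropped rather than queued.
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
  police rate 1kbps burst 10k drop flowid :1
```

The drop action is what causes the retransmit storm mentioned above; there's no good buffering option on the ingress side without a virtual device.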
If you don't need hierarchical classes[1], just use tbf (token bucket filter) instead of htb (hierarchical token bucket) - it's more efficient, more compact, and gives you access to delay in the same discipline as well. Compare:
# htb
tc qdisc add dev eth0 handle 1: root htb default 11
tc class add dev eth0 parent 1: classid 1:1 htb rate 1kbps
tc class add dev eth0 parent 1:1 classid 1:11 htb rate 1kbps
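versus tbf, assuming the same device (the latency bound here is an assumed illustrative value):

```shell
# tbf: one qdisc, no classes; burst 2k keeps the bucket above a
# typical 1500-byte MTU so large packets aren't stalled forever
tc qdisc add dev eth0 root tbf rate 1kbps burst 2k latency 400ms
```

One line instead of three, and the latency parameter gives you a queueing-delay cap in the same discipline.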
[1] i.e. stuff like "I would like to have tcp/80 limited to 10 mbit/s, tcp/443 limited to 15 mbit/s, while sum of above should never exceed 20mbit/s, and tcp/80 should get priority when competing for that shared 20mbit/s"
EDIT: changed to "burst 2k". Having burst lower than interface MTU will delay large packets essentially forever.
https://github.com/urbenlegend/netimpair is a tool which implements the techniques that are described in the article, and more. In particular, jitter (variance) is often forgotten about when testing.
Huge variations in jitter that occur randomly are a hard thing to reliably simulate. For example, simulating a US/48-states consumer grade Ku or Ka-band VSAT service ($85-115/mo), which is a highly oversubscribed TDMA network.
When capacity in your particular spot beam is good, latency could be 550 to 600 ms end to end. When it's bad it could be 1300 or 1700ms and will jitter around randomly anywhere in between those two figures.
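A rough netem sketch of a link like that (interface and numbers are assumptions; netem's jitter is applied per-packet, so it still won't capture the slow beam-load drift described above):

```shell
# ~1100ms mean delay with ±550ms normally-distributed jitter,
# roughly spanning the 550-1700ms range described above
tc qdisc add dev eth0 root netem delay 1100ms 550ms distribution normal
```

That gets you random variation within the band, but not the hours-long swings that track spot-beam congestion; you'd have to script periodic `tc qdisc change` calls to approximate those.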
One of my first gigs was troubleshooting an application failure over a low-bandwidth dedicated link. Everything worked fine until the traffic volume resulted in scheduling jitter due to buffer latency; the resulting added overhead of continual connection renegotiation killed what was left of the link and caused the failure to propagate into the application itself.
Thankfully this oversight led to me, as an intern, needing to perform the experiment to 'prove' the point, and a subsequent job offer.. so at least there was that :)
Does that work also for websocket connections? I'm asking because last time I checked it, I strongly got the impression this feature only works for simple http requests, but I could be wrong.
The tc command is great except for the weird command structure. I really like the comcast tool as a wrapper for tc: https://github.com/tylertreat/comcast
It makes it much easier to throttle the way you want.
The comcast wrapper is very easy to use because its capabilities are so limited that it is essentially useless. It can accomplish some throttling, but it cannot produce a usefully accurate simulation of a Comcast connection or any other commonly congested bottleneck.
BTW, I've no idea of the technicalities of this, but I travel frequently to places with terrible internet (and of course billions of people live in such places), and many web experiences degrade considerably.
If some supercool app or videos don't work, sure, no problem. But if reading some programming documentation (with maybe 30KB of actual text), a bank statement (with maybe 2KB of actual information), or even getting a restaurant address/phone number doesn't work because of monstrously huge sites with much back and forth (what's the technical term here...), that's frustrating.
So please please please do try to make your sites usable over bad connections :-)
When working in locations with really terrible net connections, one of the things I have resorted to is a VNC session that is a 1920x1200 desktop (256 color) tunneled inside SSH. In this setup the workstation's VNC client connects to a port on localhost that is SSH forwarded to the remote host. With the right SSH settings for timeout and keepalive it can be surprisingly usable. Then open up whatever application you have that is terrible on high-jitter/high-latency connections inside Chrome or Firefox, or as a native desktop app on the machine that is hosting the VNC session.
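The SSH-forwarded VNC setup described above might look something like this (host names, port, and keepalive values are assumptions for illustration):

```shell
# Forward local port 5901 to the VNC server on the remote host,
# with keepalives tuned to survive a high-jitter link
ssh -N -L 5901:localhost:5901 \
    -o ServerAliveInterval=15 -o ServerAliveCountMax=8 \
    user@remote-host
# then point the workstation's VNC client at localhost:5901
```

The win is that only compressed screen deltas cross the bad link; all the chatty application traffic happens on the remote host's fast connection.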
> So please please please do try to make your sites usable over bad connections :-)
This is only going to happen if people are accurately simulating the true problem. The instructions given in the article and the measures implemented by wrappers like the comcast tool mentioned in another comment do not simulate anything remotely realistic. Testing against a simulation like this can help verify that the obvious changes like reducing image sizes and the total number of requests for a page will help, but it won't fully show how things like the CDG and BBR TCP congestion control methods help.
I never see tools or articles like this discussing emulating bufferbloat, only static high latency. In the real world, satellite connections are relatively uncommon, and 500ms latency is usually instead due to excess buffering on a link that can deliver low latency when not saturated.
Also, don't the instructions in this article only apply to outbound traffic?
If you're trying to simulate poor network conditions, you need to have a better understanding than this of what causes poor network performance, and how to properly emulate it.
There is a way to do ingress traffic manipulation: you have to divert it via a virtual device. For years, IMQ (https://github.com/imq/linuximq/wiki/WhatIs) was the most common way to do it.
It's a decent introduction, but a few years out of date. The discussion of bufferbloat needs to be updated to account for the BQL mechanism to automatically manage driver buffering, and the sections on more recent AQMs need to be fleshed out. In particular, fq_codel or cake should be recommended over the Rube Goldberg HTB+SFQ+PFIFO_FAST setup described.
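For instance (a sketch; device and bandwidth figure are assumptions), the whole three-qdisc stack can be replaced with a one-liner:

```shell
# cake shapes to a set bandwidth and manages queues/flows in one qdisc
tc qdisc replace dev eth0 root cake bandwidth 20mbit

# or, where shaping isn't needed, fq_codel alone handles queue management
tc qdisc replace dev eth0 root fq_codel
```

Either way you get flow isolation and AQM without hand-tuning class hierarchies.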
The Linux network stack is perfectly capable of simulating bufferbloat and other network degradation, in either direction. This article simply fails to mention the relevant modules.
tc is complicated because it does more things.
http://jvns.ca/blog/2017/04/01/slow-down-your-internet-with-...
netem has delay, loss, duplication, corruption, re-ordering, and rate control.
And take a look at this SO thread: [http://stackoverflow.com/questions/130354/how-do-i-simulate-...]
Linux network stack isn't designed for this. The best easy thing to use is BSD's Dummynet pipes.