sitkack | 6 years ago:

Skimmed the article; didn't read the paper, if there was one. The article talks about data center links, but those are rarely the failure mode. The failure mode is most often a bad config push, which then brings about a gray failure: a route may exist, yet because of priority levels and the way routes are announced, part of the network is down even though a physical path exists. This is a solution to a highly constrained model, not to actual cloud computing. Putting another complex system in front of any sort of traffic routing certainly increases the failure modes, but the article does make a nod toward "signal quality" as the metric for traffic shifting.

cfors | 6 years ago:

> Failure probabilities were obtained by checking the signal quality of every link every 15 minutes. If the signal quality ever dipped below a receiving threshold, they considered that a link failure.

If the signal quality could be a higher-level construct (Layer 7 errors), this could route around bad config pushes if they are constrained. I'm not going to pretend that this is definitely feasible, but at least that was my first thought.

jamez1 | 6 years ago:

It just appears to apply VaR to failure rates, which is hardly a "Wall Street secret".

https://people.csail.mit.edu/ghobadi/papers/teavar_sigcomm_2...

jamez1 | 6 years ago:

The authors also completely misunderstand how VaR works: it is the minimum value at risk, not the maximum.

> Value at Risk (VaR) [33] captures precisely these bounds. Given a probability threshold β (say β = 0.99), VaRβ provides a probabilistic upper bound on the loss: the loss is less than VaRβ with probability β.

It is actually a probabilistic lower bound on the loss, i.e. the minimum loss among the worst (1 − β) fraction of outcomes.

unknown | 6 years ago:

[deleted]

Buge | 6 years ago:

> for a target percentage of time—say, 99.9 percent—the network can handle all data traffic, so there is no need to keep any links idle. During that 0.01 percent of time, the model also keeps the data dropped as low as possible.

99.9 + 0.01 != 100

thinkloop | 6 years ago:
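For what it's worth, the idea above can be sketched in a few lines. Everything here is hypothetical — the probe, names, and thresholds are illustrations, not anything from the paper: treat a link as failed when application-level errors spike, even if the physical signal is fine, so a link blackholed by a bad config push still counts as "failed" for the optimizer.

```python
# Hypothetical probe: classify a link as failed from Layer 7 symptoms
# (e.g. HTTP 5xx rate across flows traversing it), not just signal quality.
# All names and threshold values below are made up for illustration.

def link_failed(l7_error_rate: float, signal_quality_db: float,
                error_threshold: float = 0.05,
                signal_threshold_db: float = -24.0) -> bool:
    """A link counts as failed if EITHER the physical signal drops below
    the receiving threshold OR application-level errors spike, catching
    'gray' failures where a physical path exists but traffic is dropped."""
    return (signal_quality_db < signal_threshold_db
            or l7_error_rate > error_threshold)

# A bad config push: signal is fine, but requests through the link fail.
print(link_failed(l7_error_rate=0.40, signal_quality_db=-10.0))   # True
# Healthy link: good signal, negligible Layer 7 errors.
print(link_failed(l7_error_rate=0.001, signal_quality_db=-10.0))  # False
```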
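Both readings describe the same number: the β-quantile of the loss distribution. A minimal pure-Python sketch, using a made-up loss sample rather than anything from the paper, shows the two views coincide:

```python
import random

random.seed(0)
# Hypothetical per-interval losses (e.g. fraction of traffic dropped);
# the exponential shape is an arbitrary choice for illustration.
losses = sorted(random.expovariate(100) for _ in range(100_000))

beta = 0.99
# Empirical VaR_beta: the beta-quantile of the observed losses.
var_beta = losses[int(beta * len(losses))]

# "Upper bound" reading: the loss is <= VaR_beta with probability ~beta.
frac_below = sum(l <= var_beta for l in losses) / len(losses)
print(frac_below)  # ~0.99

# "Lower bound" reading: VaR_beta is the minimum loss among the
# worst (1 - beta) fraction of outcomes.
tail_min = min(l for l in losses if l >= var_beta)
print(tail_min == var_beta)  # True: both readings name the same quantile
```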