

therealwardo | 5 years ago

there is a good chance that redundant streams would have helped here, assuming the player could load the manifest in order to fail over, but sadly that manifest is itself served through Fastly. the problem with redundancy is that it only really works if you are redundant all the way up and down the serving stack. in our standard serving path (even without redundant streams) we automatically select the optimal CDN for chunk serving, and that selection does take availability into account, but during this incident we measured Fastly as only about 5% less available than normal, which wasn't enough to trigger a full automatic drain off of them. it turns out that a 5% failure rate when you are serving 50 chunks for a single video playback means at least one of them will very likely fail (roughly a 92% chance if failures are independent), and you get a playback error or, in the best case, rebuffering.
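to put rough numbers on the 5% / 50 chunk point, here is a minimal sketch of the arithmetic. it assumes every chunk request fails independently with the same probability, which is a simplification (real CDN failures tend to be correlated), and the function names are made up for illustration.

```python
def p_at_least_one_failure(per_chunk_failure_rate: float, chunks: int) -> float:
    """Probability that at least one of `chunks` requests fails.

    Assumes each request fails independently with the same probability.
    """
    p_all_succeed = (1.0 - per_chunk_failure_rate) ** chunks
    return 1.0 - p_all_succeed


def expected_failures(per_chunk_failure_rate: float, chunks: int) -> float:
    """Expected number of failed requests out of `chunks`."""
    return per_chunk_failure_rate * chunks


if __name__ == "__main__":
    rate, chunks = 0.05, 50  # figures taken from the comment above
    print(f"P(at least one failure) = {p_at_least_one_failure(rate, chunks):.3f}")
    print(f"expected failed chunks  = {expected_failures(rate, chunks):.1f}")
```

under those assumptions a small per-request failure rate compounds quickly over a playback session, which is why a 5% dip can stay under a drain threshold and still break most sessions.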

sadly, Mux's manifest servers have not been redundant because the edge logic is a pile of Fastly's VCL. as mmcclure says, we're working on adding another CDN that can run our manifest serving logic so that we can be fully redundant.
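for illustration only, this is roughly what manifest failover could look like on the client side once the manifest is reachable through more than one CDN. it is a hypothetical sketch, not Mux's player logic: the function name, the `fetch` callable, and the example hostnames are all assumptions.

```python
from typing import Callable, Iterable


def fetch_manifest_with_failover(
    urls: Iterable[str],
    fetch: Callable[[str], str],
) -> str:
    """Try each manifest URL in order and return the first successful body.

    `fetch` is any callable that returns the manifest text or raises on
    failure (hypothetical; plug in whatever HTTP client the player uses).
    """
    last_error = None
    for url in urls:
        try:
            return fetch(url)
        except Exception as exc:  # treat any fetch error as "try the next CDN"
            last_error = exc
    raise RuntimeError("all manifest hosts failed") from last_error


if __name__ == "__main__":
    # Fake fetcher: the first (hypothetical) host is down, the second works.
    def fake_fetch(url: str) -> str:
        if "cdn-a" in url:
            raise ConnectionError("cdn-a unavailable")
        return "#EXTM3U"

    candidates = [
        "https://cdn-a.example.com/video.m3u8",
        "https://cdn-b.example.com/video.m3u8",
    ]
    print(fetch_manifest_with_failover(candidates, fake_fetch))
```

this only helps if the second host can actually run the same manifest logic, which is the hard part when that logic is written in one vendor's edge language.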

I am really looking forward to a standard and portable edge platform that makes redundancy at this application layer easier. wasm here we come?
