(no title)
rlyshw|4 years ago
>Using various content delivery networks, Mux is driving HTTP Live Streaming (HLS) latency down to the lowest possible levels, and partnering with the best services at every mile of delivery is crucial in supporting this continued goal.
In my experience, HLS and even LL-HLS are a nightmare for latency. I jokingly call it "High Latency Streaming", since it seems very hard to reliably obtain glass-to-glass latency in the LL range (under 4 seconds). Latency with cloud streaming usually ends up at 30+ seconds.
I've dabbled with implementing WebRTC solutions to obtain ultra-low-latency (<1s) delivery, but that is even more complicated and fragmented, with all of the browsers vying for standardization. The solution I've cooked up in the lab with mediasoup requires an FFmpeg shim to convert from MPEG-TS/H.264 via UDP/SRT to MKV/VP9 via RTP, which of course drives up the latency. Mediasoup has a ton of opinionated quirks around RTP ingest too, of course. Still, I've been able to prove out 400 ms "glass-to-glass", which has been fun.
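For concreteness, such a shim might be launched along these lines (a sketch under assumptions: the SRT/RTP URLs, ports, payload type, and encoder settings are all placeholders I made up, and real mediasoup ingest needs its transport parameters to match; an FFmpeg build with libsrt and libvpx is required):

```python
# Hypothetical sketch of the FFmpeg shim described above: MPEG-TS/H.264 in
# over SRT, VP9 out over RTP toward a mediasoup transport. Every value
# below is a placeholder, not a known-good mediasoup configuration.
import subprocess

def build_shim_cmd(srt_in="srt://0.0.0.0:9000?mode=listener",
                   rtp_out="rtp://127.0.0.1:5004"):
    return [
        "ffmpeg",
        "-i", srt_in,                 # MPEG-TS/H.264 arriving over SRT
        "-an",                        # video only, for this sketch
        "-c:v", "libvpx-vp9",         # re-encode to VP9
        "-deadline", "realtime",      # libvpx: favor latency over quality
        "-cpu-used", "8",             # libvpx: fastest (lowest-quality) preset
        "-payload_type", "101",       # RTP payload type the receiver expects
        "-f", "rtp", rtp_out,
    ]

def run_shim():
    # Blocks until the stream ends; raises on a non-zero exit status.
    subprocess.run(build_shim_cmd(), check=True)
```

Each re-encode hop like this adds at least a frame or two of delay, which is part of why the shim drives the latency up.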
I wonder if Mux or really anyone has intentions to deliver scalable, on-cloud or on-prem solutions to fill the web-native LL/ultra-LL void left by the death of Flash. I'm aware of some niche solutions like Softvelum's Nimble Streamer, but I hate their business model and I don't know anything about their scalability.
keithwinstein|4 years ago
The trick, which maybe you don't want to do in production, is to mux the video on a per-client basis. Every wss-server gets the same H.264 elementary stream with occasional IDRs, the process links with libavformat (or knows how to produce an MP4 frame for an H.264 NAL), and each client receives essentially the same sequence of H.264 NALs but in a MP4 container made just for it, with (very occasional) skipped frames so the server can limit the client-side buffer.
When the client joins, the server starts sending the video starting with the next IDR. The client runs a JavaScript function on a timer that occasionally reports its sourceBuffer duration back to the server via the same WebSocket. If the server is unhappy that the client-side buffer remains too long (e.g. minimum sourceBuffer duration remains over 150 ms for an extended period of time, and we haven't skipped any frames in a while), it just doesn't write the last frame before the IDR into the MP4 and, from an MP4 timestamping perspective, it's like that frame never happened and nothing is missing. At 60 fps and only doing it occasionally this is not easily noticeable, and each frame skip reduces the buffer by about 17 ms. We do the same for the Opus audio (without worrying about IDRs).
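The skip decision the server makes could be sketched like this (a minimal illustration with invented names and thresholds, not Puffer's or Stagecast's actual code):

```python
# Hypothetical sketch of the server-side buffer-trim heuristic described
# above. Thresholds are illustrative; the client reports its minimum
# sourceBuffer duration back over the same WebSocket that carries video.

class BufferTrimmer:
    BUFFER_THRESHOLD_MS = 150   # "buffer remains too long" cutoff
    MIN_SKIP_INTERVAL_S = 5.0   # don't skip frames too often

    def __init__(self):
        self.last_skip_s = float("-inf")

    def should_skip_before_idr(self, min_buffer_ms, now_s):
        """Return True if the frame just before the next IDR should be
        omitted from this client's MP4 (each skip buys ~17 ms at 60 fps)."""
        buffer_too_long = min_buffer_ms > self.BUFFER_THRESHOLD_MS
        skipped_recently = (now_s - self.last_skip_s) < self.MIN_SKIP_INTERVAL_S
        if buffer_too_long and not skipped_recently:
            self.last_skip_s = now_s
            return True
        return False
```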
In our experience, you can use this to reliably trim the client-side buffer to <70 ms if that's where you want to fall on the latency-vs.-stall tradeoff curve, and the CPU overhead of muxing on a per-client basis is in the noise, but obviously not something today's CDNs will do for you by default. Maybe it's even possible to skip the per-client muxing and just surgically omit the MP4 frame before an IDR (which would lead to a timestamp glitch, but maybe that's ok?), but we haven't tried this. You also want to make sure to go through the (undocumented) hoops to put Chrome's MP4 demuxer in "low delay mode": see https://source.chromium.org/chromium/chromium/src/+/main:med... and https://source.chromium.org/chromium/chromium/src/+/main:med...
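To see why the per-client mux hides the skip, consider how timestamps get assigned (a toy model only; real muxing via libavformat tracks DTS/PTS rather than counting frames like this):

```python
# Toy model of per-client MP4 timestamping: if the muxer stamps frames by
# counting what it actually writes, a dropped frame just shortens the
# timeline. There is no gap for the demuxer to notice.

FRAME_DURATION_MS = 1000 / 60   # ~16.7 ms per frame at 60 fps

def mux_timestamps(frame_ids, skipped):
    """Return (frame_id, presentation_ts_ms) pairs for one client's MP4."""
    out, ts = [], 0.0
    for i, fid in enumerate(frame_ids):
        if i in skipped:
            continue              # from the MP4's perspective, never existed
        out.append((fid, round(ts, 1)))
        ts += FRAME_DURATION_MS
    return out
```

Muxing frames 0-3 while skipping index 2 yields timestamps 0, 16.7, 33.3 ms: contiguous, just ~17 ms shorter than the source timeline, which is exactly the per-skip buffer reduction described above.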
We're using the WebSocket technique "in production" at https://puffer.stanford.edu, but without the frame skipping since there we're trying to keep the client's buffer closer to 15 seconds. We've only used the frame-skipping and per-client MP4 muxing in more limited settings (https://taps.stanford.edu/stagecast/, https://stagecast.stanford.edu/) but it worked great when we did. Happy to talk more if anybody is interested.
[If you want lower than 150 ms, I think you're looking at WebRTC/Zoom/FaceTime/other UDP-based techniques (e.g., https://snr.stanford.edu/salsify/), but realistically you start to bump up against capture and display latencies. From a UVC webcam, I don't think we've been able to get an image to the host faster than ~50 ms from start-of-exposure, even capturing at 120 fps with a short exposure time.]
soylentgraham|4 years ago
On the web I got latency down by just sending NALUs and decoding the H.264 with a WASM build of Broadway, but now with WebCodecs (despite some quirks) that's even simpler (and possibly faster too, though it depends on encoding with B-frames etc.). Of course, chasing the lowest-latency video, I'm not paying attention to sound atm :)
GeneticGenesis|4 years ago
We do offer LL-HLS in an open beta today [1], which in the best case will get you around 4-5 seconds of latency on a good player implementation, but this does vary with latency to our service's origin and edge. We have some tuning to do here, but best case, the LL-HLS protocol will get to 2.5-3 seconds.
We're obviously interested in using WebRTC for use cases that require more real-time interactions, but I don't have anything I can publicly share right now. For sub-second streaming using WebRTC, there are a lot of options out there at the moment though, including Millicast [2] and Red5Pro [3] to name a couple.
Two big questions come up when I talk to customers about WebRTC at scale:
The first is how much reliability and perceptual quality people are willing to sacrifice to get to that magic 1-second latency number. WebRTC implementations today are optimised for latency over quality, and have a limited amount of customisability - my personal hope is that the client side of WebRTC will become more tunable for PQ and reliability, allowing target latencies of ~1s rather than <= 200ms.
The second is cost. HLS, LL-HLS etc. can still be served on commodity CDN infrastructure, which can't currently serve WebRTC traffic, making it an order of magnitude cheaper than WebRTC.
[1] https://mux.com/blog/introducing-low-latency-live-streaming/ [2] https://www.millicast.com/ [3] https://www.red5pro.com/
majormajor|4 years ago
But that place pulling down the feed usually isn't the streaming service you're watching! There are third parties in that space, and third-party aggregators of channel feeds, and you may have a few hops before the files land at whichever "streaming cable" service you're watching on. So even if they do everything perfectly on the delivery side, you could already be 30s behind, because those media files and HLS playlist files have already been buffered a couple of times, since they can come late or out of order at any of those middleman steps. Going further and cutting out all the acquisition latency? That wasn't something commonly talked about a few years ago when I was exposed to the industry. It was complained about once a year for the Super Bowl, and then fell down the backlog. You'd likely want to own in-house signal acquisition and build a completely different sort of CDN network.
Last I talked to someone familiar with it, the way services that care about low latency (like game-streaming services) do it is much closer to what you describe, with custom protocols.
thrashh|4 years ago
And it broke all my stuff because I was relying on low latency. And I remember reading around at the time — not a single person talked about the loss of a low latency option so I just assumed no one cared for low latency.
slimscsi|4 years ago
You can deliver all your video via WebRTC with lower latency, but your bandwidth bill will be an order of magnitude higher.
rlyshw|4 years ago
I imagine the (proprietary) Stadia implementation is highly tuned to that specific use case, with tons of control over the video source (cloud GPUs) literally all the way down to the user's browser (modern Chrome implementations). Plus their scale likely isn't in the tens of thousands from a single origin. Even still, I continue to be blown away by the production latency numbers achieved by game streaming services.
And my use-case is no use-case or every use-case. I'm just a lowly engineer who has seen this gap in the industry.
giantrobot|4 years ago
Stick with satellite distribution? You're going to have a devil of a time scaling any sort of real-time streaming over an IP network. Every hop adds some latency and scaling pretty much requires some non-zero amount of buffering.
IP multicast might help, but you have to sacrifice bandwidth for the multicast streams and have QoS support all down the line. It's a hard problem, which is why no one has cracked it yet. You need a setup with real-time capability from network ingest, through peering connections, all the way down to end-user terminals.