This looks to be nicely architected. Looking forward to digging in.
In answer to the questions about TURN, this approach won't be lower latency than TURN but will scale better. TURN servers are just (nearly) passive relays, so the sending client needs to set up as many outbound streams as there are receiving clients in the session.
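A back-of-the-envelope comparison of the sender's upload cost under each topology (illustrative Python; the bitrate and viewer count are assumptions, not measurements):

```python
def sender_upload_kbps(stream_kbps: int, receivers: int, topology: str) -> int:
    """Upload bandwidth the *sending* client needs.

    With TURN, the relay is (nearly) passive, so the sender transmits one
    full copy of the stream per receiver. With an SFU, the sender uploads
    a single stream and the server does the fan-out.
    """
    if topology == "turn":
        return stream_kbps * receivers
    if topology == "sfu":
        return stream_kbps
    raise ValueError(topology)

# A 1.5 Mbps video stream to 20 viewers:
print(sender_upload_kbps(1500, 20, "turn"))  # 30000 -- 30 Mbps of client upload
print(sender_upload_kbps(1500, 20, "sfu"))   # 1500  -- one stream; the server fans out
```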
The advantage to the TURN approach is that you can do end-to-end encryption.
The server this post is about "forwards" streams. It's in a class of infrastructure traditionally called a "Selective Forwarding Unit." So the sending client can send one stream, and the SFU copies the stream and sends it to the receiving clients. The server needs more CPU and outgoing bandwidth as the receiver count grows, but the sending client doesn't need much of either.
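A minimal fan-out sketch of that idea (plain Python, with in-process queues standing in for network transports; `TinySFU` is a made-up name, not any real server's API):

```python
import queue

class TinySFU:
    """One ingest stream, N subscribers; each packet is copied, not re-encoded."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self) -> queue.Queue:
        q = queue.Queue()
        self.subscribers.append(q)
        return q

    def on_packet(self, packet: bytes) -> None:
        # Selective forwarding: the sender uploads once; the server pays
        # the CPU and outgoing bandwidth for every receiver.
        for q in self.subscribers:
            q.put(packet)

sfu = TinySFU()
a, b = sfu.subscribe(), sfu.subscribe()
sfu.on_packet(b"rtp-frame-1")
print(a.get(), b.get())  # both receivers get a copy of the same packet
```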
Once you have streams piping through a server, you can do lots of other things, of course. This server can transcode to several formats, including HLS and RTMP, and can copy streams internally across a cluster to scale more than a single machine can.
Media servers are really fun to work on: lots of small(-ish) hard problems that touch low-level network protocols, memory and CPU optimization, and architecture (because eventually you want a clean plugin interface, etc.).
If you're interested in this stuff, check out Mediasoup, an open source WebRTC SFU with a very nice design in which all the low-level stuff is C++ and all the high-level interfaces are exposed as Node.js objects.[0]
The sad part about selective forwarding units (SFUs) is that you lose end-to-end encryption. You only have encryption between you and the SFU, and between the SFU and each participant. For hosted solutions this means the server could record all your conferences.
That's why I am waiting for the PERC standard. It will allow a client to send the audio/video to the server only once while the server retransmits it to all peers, with everything still end-to-end encrypted.
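Conceptually, PERC layers two encryptions: an inner end-to-end layer only the endpoints can remove, and an outer hop-by-hop layer the SFU can remove in order to route and repacketize. A toy sketch of that layering (the XOR "cipher" here is purely illustrative, not real cryptography; actual PERC deployments use double-encrypted SRTP):

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    # Toy keystream (NOT real crypto): hash-chained SHA-256 blocks.
    out, block = b"", key
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so the same call encrypts and decrypts.
    ks = _keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

e2e_key = b"endpoints-only"   # shared by sender and receivers, never the SFU
hop_key = b"client-to-sfu"    # shared with the SFU for this hop

media = b"frame-payload"
inner = xor_cipher(e2e_key, media)     # endpoints encrypt the media once
on_wire = xor_cipher(hop_key, inner)   # hop layer protects the transport

at_sfu = xor_cipher(hop_key, on_wire)  # SFU strips the hop layer to route...
assert at_sfu == inner                 # ...but sees only e2e ciphertext
assert xor_cipher(e2e_key, at_sfu) == media  # only a receiver recovers media
```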
From my understanding, the current challenge with ultra-low latency is scale. Will it scale to 100, 1,000, or 100,000 concurrent users?
At the moment, everyone in the industry is working to figure out how to provide a viewer experience that is similar to, or better than, the current expectations of live streaming.
Live streaming is hard. Ultra-low latency is just as hard.
However, I'm still not sure why we need ultra-low latency. Will it improve the live experience? Maybe.
Aside from live-action sports or other significant events, do we as content consumers really need it?
I think even with TURN there's lots of chatter back and forth. Connected clients can NACK for frames and ask for a bandwidth reduction. I think the expected optimal implementation is that the group of WebRTC participants normalizes toward an optimal bitrate.
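One plausible, deliberately simplified version of that normalization is to cap the shared send bitrate at the slowest receiver's bandwidth estimate; the function name, floor, and ceiling below are made up for illustration, not taken from any particular WebRTC stack:

```python
def target_bitrate_kbps(estimates_kbps, floor=150, ceiling=2500):
    """Pick a send bitrate the whole group can sustain.

    WebRTC-style feedback (e.g. REMB / transport-cc) gives the sender a
    bandwidth estimate per receiver; without simulcast, the send rate is
    effectively capped by the slowest one.
    """
    if not estimates_kbps:
        return floor  # no feedback yet: start conservatively
    return max(floor, min(ceiling, min(estimates_kbps)))

print(target_bitrate_kbps([1800, 950, 2400]))  # 950 -- slowest receiver wins
```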
For 1->n streaming you just want a very high quality broadcaster->server link, and then presumably you want the downstream clients not to have the option to send packets back to the broadcaster. Transcoding is hard, though, so consumers might be stuck with just the source video quality. I assume this library just passes the frames on from the source (Twitch did just that through almost all of its pre-acquisition growth).
Some of my details might be a little off, but I did spend a few months trying to build something very similar to this with GStreamer. WebRTC was hard to grok.
Actually, it is not lower latency than a centralized TURN server. The difference from TURN is that Ant Media Server behaves like a peer to each connected client, rather than just relaying the video/audio as TURN does. Additionally, it can create adaptive bitrates on the server side, record streams, and control other things server-side.
As far as I'm concerned, 500 ms is normal latency (setting aside network delay -- clearly a 200 ms one-way trip from Europe to NZ will add to that).
I'd say ultra-low latency would be sub-frame, so <40 ms. Low latency spans one frame to half a second, normal up to say 1.5 s, and high latency anything beyond that.
But that's for point-to-point or multicast circuits, and ISPs don't like multicast as they can't charge for it.
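Those cut-offs map directly to a tiny classifier (assuming 25 fps, so one frame is 40 ms; the labels are this comment's, not an industry standard):

```python
def latency_class(ms: float, frame_ms: float = 40.0) -> str:
    # Thresholds from the comment above: sub-frame = ultra low,
    # one frame .. 500 ms = low, up to 1.5 s = normal, beyond = high.
    if ms < frame_ms:
        return "ultra low"
    if ms <= 500:
        return "low"
    if ms <= 1500:
        return "normal"
    return "high"

print(latency_class(25))    # ultra low
print(latency_class(300))   # low
print(latency_class(1200))  # normal
print(latency_class(7000))  # high
```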
There are also possible latency savings from having the video encoder be aware of network conditions on a per-frame basis: http://ex.camera -- but I don't think it's found its way to the mainstream.
And here I was, blissfully unaware that I'm experiencing reality 30 seconds delayed. I had no idea, other than that there's obviously some number of frames of delay needed for motion compression, plus some transit latency.
I've used Wowza (a video streaming server) for years, and this is a direct competitor. The pricing is a little higher for Wowza, but Wowza is a mature product with tons of options for web streaming. Wowza's weakness has been its support for WebRTC. I am using it, but it's not easy to stream from RTSP/RTMP to WebRTC, and it doesn't scale out for WebRTC.
It will help apps which require interaction from users, like Google Stadia's game streaming. For other use cases, I think one doesn't care even if the feed is delayed by a few seconds.
Another interesting use case someone mentioned here before is the world cup - you don't want a penalty shoot-out ruined by your neighbours cheering a few seconds before you see the goal.
A few seconds would also suck for live streaming where the streamer interacts in chat.
It looks like they support RTMP and a few other streaming methods. Maybe it's just WebRTC for ingress and then it converts to other formats. Given the advertised 500ms of latency I imagine it's not WebRTC->WebRTC.
[0] https://mediasoup.org/