vqc | 4 years ago

(cofounder here)

@saulrh,

To clarify, are you describing something to the effect of: [whatever WakaSaba is doing] => Browser Extension => Webapp version of $EnterpriseVideoconferencingSuite?

This is not something we have explored. If you had to hazard a guess, what $EnterpriseVideoconferencingSuite would be most amenable to this sort of work?

Thanks!

vince@wakasaba.com

saulrh|4 years ago

Pretty much, yes. WakaSaba's major offering - at least, that's what the headline and demo video focused on - seems to be a new control toolkit for video meetings, and I feel like it should be possible to get a good chunk of that without doing all the hard work to develop, host, and maintain your own videoconferencing suite. Like, say I wanted to advance my slideshow without walking over to the keyboard on the podium; I could build a fancy smart remote that talks to the cloud with its own GSM radio and interfaces with a novel presentation suite specifically designed to integrate with my remote, or I could make my remote a Bluetooth keyboard that can only press the left and right arrow keys and then keep PowerPoint focused.

I don't know which enterprise conferencing suite would be easiest to talk to. It'd be straightforward if I wanted to use a local video stream: proxy `navigator.mediaDevices.getUserMedia` and generate synthetic click/keyboard events as necessary [1], or make like the remote and pretend to be a keyboard. The really interesting thing, though, would be to run WakaSaba on a second device and use it to control my "primary" presence. I took a quick spin through some different videoconferencing suites to see what looked doable:

FaceTime: No API, no client SDK, no browser client, dead end.

Google Meet: No API, no client SDK, but a web client. If they're using HTML5 video elements (which I haven't checked) you could probably intercept the streams and match each one to the user information displayed on its card, but without an API or client SDK you wouldn't be able to do anything on the first device from the second device. Probably the easiest PoC for a first-device demo (that is, run the hypothetical WakaSaba extension on the same machine you're controlling), given that it's a webapp first and there are a ton of extensions that augment/tweak it to copy code from.

Discord: Very nice API - you can even implement audio clients! - but the video API isn't documented and looks like a hand-rolled solution in Elixir [2]. They do have a browser client, though, so if they're using HTML5 video elements and you can capture the source, that might be your best bet for getting at the video. That said, I'm not sure the API lets you remote-control a second session. If you can capture the video stream, this looks like your second-best bet for a first-device PoC.

Zoom: Like Discord - APIs and client SDKs that look like they'd permit remote-control, but can't see any support for video so you'd have to intercept video from their webapps.
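For the web clients above, the "intercept the streams and match them to the cards" idea might look roughly like this. It's a sketch only: I haven't checked how any of these apps lay out their DOM, so `cardSelector` and the assumption that a card's text is the participant's name are both hypothetical, and `VideoLike` is just a structural stand-in for `HTMLVideoElement` so the code type-checks outside a browser.

```typescript
// Minimal stand-ins for the parts of HTMLVideoElement / Element used here.
interface CardLike {
  textContent: string | null;
}
interface VideoLike<S> {
  srcObject: S | null;
  closest(selector: string): CardLike | null;
}

interface NamedStream<S> {
  name: string;
  stream: S;
}

// Pair each video's stream with the text of the nearest enclosing "card".
// Videos with no stream attached, or with no enclosing card, are skipped.
function pairStreamsWithNames<S>(
  videos: ReadonlyArray<VideoLike<S>>,
  cardSelector: string, // hypothetical: whatever marks a participant card
): NamedStream<S>[] {
  const pairs: NamedStream<S>[] = [];
  for (const video of videos) {
    const stream = video.srcObject;
    const card = video.closest(cardSelector);
    const name = card && card.textContent ? card.textContent.trim() : "";
    if (stream !== null && name !== "") {
      pairs.push({ name, stream });
    }
  }
  return pairs;
}

// In a page this would be called with something like:
//   pairStreamsWithNames(Array.from(document.querySelectorAll("video")), ".some-card-class")
```

This only covers the case where the app attaches streams via `srcObject`; if it renders video some other way, you'd need a different hook.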

Oh wow. Okay. I found a product that wants to do half of this - otter.ai does live transcription of meeting audio - and they seem to have given up on it entirely and just abused the analog loophole [3]. Sooooo maybe the only way to get remote-control from a second device would be to reach out to a videoconferencing provider and ask for a privileged integration like otter seems to have gotten with Zoom, lol.

1. Grab this chrome extension and inspect the code, that's what they're doing to inject their overlays: https://chrome.google.com/webstore/detail/visual-effects-for...

2. https://medium.com/tenable-techblog/lets-reverse-engineer-di...

3. https://help.otter.ai/hc/en-us/articles/360060292793-Transcr...
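
For reference, the `getUserMedia` proxying mentioned above boils down to something like the following. This is a minimal sketch, not what the extension in [1] literally does: `proxyGetUserMedia`, `MediaDevicesLike`, and the `transform` callback are names I made up for illustration. In a real extension the patch would also have to run in the page's own JavaScript context rather than the content script's isolated one, or the webapp would never see it.

```typescript
// Structural stand-in for navigator.mediaDevices, so this type-checks outside a browser.
interface MediaDevicesLike<S> {
  getUserMedia(constraints?: unknown): Promise<S>;
}

// Replace getUserMedia with a wrapper that passes every stream through `transform`
// before the calling webapp receives it. Returns a function that undoes the patch.
function proxyGetUserMedia<S>(
  devices: MediaDevicesLike<S>,
  transform: (stream: S) => S | Promise<S>,
): () => void {
  const original = devices.getUserMedia;
  devices.getUserMedia = (constraints?: unknown): Promise<S> =>
    original.call(devices, constraints).then(transform);
  return () => {
    devices.getUserMedia = original;
  };
}

// In a page this would be called with something like:
//   proxyGetUserMedia(navigator.mediaDevices, (stream) => addOverlay(stream));
// where addOverlay is a hypothetical function that returns a modified MediaStream.
```

The webapp keeps calling `getUserMedia` as usual and just gets whatever `transform` hands back, which is the software equivalent of the Bluetooth-keyboard remote: no cooperation from the conferencing suite required.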