Sony released some software a couple of months ago that lets you use most of their DSLRs as webcams over USB. My goodness, paired with a fast lens, what a difference from my MacBook webcam, even with these ML-blurred backgrounds!
It's only 720p and around 15fps but real shallow dof, very little sensor noise, autofocus works. Well worth trying if you have a Sony camera from the last few years.
Sensor size and good optics still win. Having said that, the effort and detail that has gone into this feature is very impressive; I enjoyed the blog post. Also, WebAssembly SIMD looks super cool, looking forward to a new class of webapps using wasm.
I recently tried to get a setup similar to this with a Fujifilm X-T20 I had lying around, remembering that Fujifilm announced similar software. Alas, that software only works with their higher end models.
I ended up getting a $10 HDMI USB capture stick from Aliexpress. I get a perfect 1080p/60fps signal, and at least on Linux it worked out of the box with Zoom.
The only problem now is that most of my meetings start with "wow, why do you look like you're on TV?"
Canon did too! Definitely a huge upgrade over a typical webcam.
I'm using my old T1i, which can be had for less than $50 these days, plus you can pick up an 18-55mm kit lens for like $20, and the video quality blows away any webcam, especially at the same price. I'd also recommend a battery-to-mains power adapter.
Canon and Nikon do too. In practice, the quality bump is nice, but we are still talking about a fairly low resolution/bitrate once it gets through Zoom, so the end result is fairly underwhelming, at least as far as what the other people see on their end.
Woot. Thanks for pointing this out - I looked for a solution a while back and it seemed like I had to get a separate capture card to connect my Sony DSLR. Will go check this out now.
(I ended up having to buy a little logitech webcam, which has been fine, but being able to pick my lens etc is awesome!)
I use my Android (Redmi Note 8 Pro) primary cam (720p I think) using Droidcam and it works like a charm on Linux.
I also tried gPhoto2/ffmpeg and virtual cam driver with Nikon D5200 (USB) on Linux but I prefer the Redmi since I do not have a decent low light lens for my DSLR.
Having used both Zoom and Meet extensively now for the past 6 months, my experience is:
1/ Your internet connection, especially upload bandwidth and latency matter a lot.
2/ Zoom's desktop app performs very well, but its web version is atrocious. Not just because of the dark patterns they use to force you to install the desktop app, but also its performance is terrible compared to its desktop version, as well as worse than almost everything else. Unfortunately, I don't trust them and refuse to use their desktop app on anything but my iPad.
3/ Meet used to be as bad as Zoom on the web 6 months ago, but it has improved a lot and is slowly approaching Zoom desktop in performance. I have noticed that Meet calls on my work GSuite account perform much better than on my personal account. This might be explained by #1 above, i.e. my family has worse internet connections than my coworkers, but I am not sure whether all improvements have been rolled out to personal accounts.
> 1/ Your internet connection, especially upload bandwidth and latency matter a lot.
I moved to a new house, and the quality of my video calls dropped dramatically. Constant freezing and dropouts. It was extremely frustrating to try to participate in a meeting. I could receive fine, but anytime I spoke out, I would drop out within minutes.
Speed tests showed plenty of bandwidth, but my modem statistics showed high upstream power levels, occasionally out of the allowed range, and lots of "uncorrectable" packets.
I finally got a Comcast technician in to look at it (yay for business-class support), and they replaced the cable from the pole all the way to the first splitter in the basement, and since then it's been flawless. 100/15 Megabit service has been totally adequate for our needs, so long as it's reliable and the latency is low enough.
It kills me that our city isn't putting in conduits or fiber while doing utility work, though. The whole time that was happening, there were gas contractors opening the street and running new supply lines to every house, but not putting in any extra conduits or dark fiber. The construction sounds were almost like being back in the office...
>1/ Your internet connection, especially upload bandwidth and latency matter a lot.
It grates me when people claim DSL/cable qualifies as sufficiently good broadband in the US, given the lack of upload bandwidth and the high latency (add packet loss in here too). The situation is so bad that you often can't even find out how much upload bandwidth so-called "broadband" cable ISPs offer.
The experience on symmetric fiber connections is noticeably better, and our house can have a whole group of people streaming video up and down simultaneously without a hiccup, which matters in times of working from home and schooling from home.
Disclosure: I work on Google Cloud (but not Meet).
For the last item, personal accounts (only?) default to send and receive video at lower resolution (360p). So if you meant that the quality is lower, you can set it on both sides to 720p.
Edit: I don’t think Meet remembers those settings though, so you have to do it every time (and show your family members how to do so).
Meet certainly rolls out improvements for GSuite before public ones. I think there's even a GSuite "release channel" setting where you can control how early you get these improvements.
I refuse to install Zoom. They have removed the dark pattern, and the "join via browser" option is almost immediately available. If you have it installed, now is a good time to uninstall it.
The example video clips in the post look nothing like me and my team's view when using the new feature. Most of the time half of our hair gets blurred or replaced and hand gestures will cause either our hands or head to disappear.
I can vouch for this. I haven’t really needed the background blur feature personally, but I’ve tried it and both myself, colleagues, and friends — pretty much everyone I’ve talked to that has used it — loathe Google Meet’s background blur, and prefer Zoom’s by far.
In my experience, it doesn’t completely cover the background most of the time, and if you move at all, as you point out, it can’t keep up.
Kind of funny to see Google engineering blogging about it when it feels extremely half baked.
This makes me sad, because in all other areas, I think Meet excels well beyond the competition.
At least for background blur, the latency it adds is enough to make it almost unusable: easily over 100ms. This is with the latest stable Chrome on a relatively recent Ryzen/Nvidia system. Maybe background replacement will do better once it rolls out to regular Google Meet (too lazy to log into my Google <del>Apps</del> <del>Suite</del> Workspace) :-) However, everything else about Google Meet is great and I wish I could make all my Zoom friends switch.
It seems to have gotten a little better recently, but my experience matches yours. It really struggles when I wear over-ear headphones - they sort of phase in and out of existence.
The other thing I've noticed is the background blur absolutely annihilates my CPU. To the point where I would rather just turn off my camera if I don't want my background visible.
They have their example video clips, but they also provide data: they say their better model gets an IoU of 93.8%, meaning the mismatch between the predicted and true person masks is 6.2% of their union. Either it's your hair getting cut off or the background leaking through, and 6.2% is a fair bit considering your head and shoulders probably cover around 30% of the frame.
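For anyone unfamiliar with the metric, IoU is straightforward to compute from two masks; a small NumPy sketch (names and the toy arrays are mine, purely for illustration):

```python
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union of two boolean segmentation masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return np.logical_and(pred, truth).sum() / union

# Toy example: the prediction misses one of four foreground pixels.
pred = np.array([[1, 1], [1, 0]])
truth = np.array([[1, 1], [1, 1]])
print(iou(pred, truth))  # 0.75
```

Note that IoU penalizes errors relative to the union of the two masks, so it's a stricter number than whole-image pixel accuracy.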
I'm wondering why they didn't just use standard CV techniques like background subtraction? Does their technique work with a dynamic background as well?
Aside: Imagine you’re driving down the road and you need to make a right turn. Well, for some reason the steering wheel is stowed away and disappeared! You need to hover your hand around the center console in a specific area to be able to expose it. Out comes the steering wheel and now you can make a right turn.
Google UX/UI team: Please fucking make the mute/unmute button visible at all times.
Isn't this sort of a Fizz Buzz for a UX/UI design professional? I don't mean to demean anyone, but I see this sort of a thing literally everywhere. Hiding important and absolutely crucial information (that can make or break your product) in the name of minimalism. Coming out of a company that has one of the highest hiring bars for software engineering, and yet, their products have such an awful UX/UI. This isn't an exception, it is a pattern.
More important than the button is the status indicator - I need to know if the call is muted or not. Even better, promote it to an OS-level icon/badge/overlay. If my mic is actively in use, please make it blindingly obvious.
The only software that gets video conferencing right is probably Discord.
I used MS Teams and Zoom and both are decent (MS Teams works fine for school),
but it's unbelievable that this kind of software lacks features that gaming communities had probably 20 years ago.
PUSH TO TALK is probably one of the most important features of any voice software. The lack of it is a big WTF.
It gives you 100% control over when you're talking, and you don't have to alt-tab between programs in order to "mute" yourself.
You can bind it to e.g. MOUSE3 (scroll-wheel click) and it works fine alongside other programs, games and so on. Toggling between muted/unmuted is a different thing.
From somebody who has used Ventrilo, Mumble, TeamSpeak and nowadays Discord for the last 12 years, for hours per day, almost every day.
It's even worse on touch devices. You have to touch the bottom of the screen to get the controls to appear. Accidentally touch twice in the wrong location and you can hang up.
The mute/unmute button changes position and can be hidden in a top bar that slides out.
In some fullscreen situations there is no button to get out of fullscreen. Sometimes double-clicking works, sometimes it doesn't. Recently I couldn't even alt-tab away; my computer was essentially 'locked' by Zoom.
Speaking of mute/unmute I've not yet found a way to get Google Hangouts (same thing as Meet?) to play nice in situations where simultaneous interpretation is involved. Our company works in Japanese and English and we typically have a second meeting running in parallel for interpretation. This setup almost works, I say almost because I've yet to find a way of muting the audio in one meeting so I can properly listen to the other. I can't leave the first meeting either because often I'll also want to see the presentation slides. Currently I'm working around this by muting my MacBook and joining the second meeting on my phone.
Perhaps I'm missing something obvious (or a Chrome plugin that will allow me to mute based on the page URL rather than site). In the unlikely event that a Googler is reading this I'm not asking for yet another product or complicated new piece of functionality aimed at this specific use case. Just a mute button for audio. Thanks!
A major motivation for getting a StreamDeck was to be able to have a big fat mute button that "physically" kills the microphone level at the source.
It renders a big cross through the microphone when muted.
Simple, yet insanely effective UI (#).
Best thing ever.
#) Especially when compared to the mess that is Google Meet. My favourite "feature" of theirs is how, when someone is presenting, it's impossible to view the presentation as just another stream: they have to make it dominate everything, making it hard to see the other team members.
And it can be extremely hard to see who's talking when viewing a lot of cameras at the same time. And for whatever reason the quality turns into a blurry mess, a far cry from 720p, way too often. (I have fibre internet.)
When did you last use Meet? I used it just yesterday during a gaming session with friends and the mute/unmute control was visible at all times. I even tried it again just now.
While you're at it, always display a VU meter. It gives feedback on what is transmitted and can thus alert a user as to whether they are being heard or not. It's the most basic of sound recording tools, and was a standard part of recording equipment for over half a century for good reason.
And if you need minimalism, offer a toggle for that. But I think most people should have it forced on them; it would save everyone a lot of trouble. Just think about all the aggregate time lost by all users talking into a muted mic.
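Agreed, and the core of a VU meter is a few lines of math once you have a sample buffer; a sketch with NumPy (getting the buffer out of the OS audio API is the part that varies, and all names here are mine):

```python
import numpy as np

def level_dbfs(samples: np.ndarray) -> float:
    """RMS level of a float audio buffer (-1.0..1.0), in dBFS."""
    rms = np.sqrt(np.mean(np.square(samples, dtype=np.float64)))
    if rms == 0:
        return float("-inf")  # silence (or a muted mic!)
    return 20 * np.log10(rms)

def meter_bar(samples: np.ndarray, width: int = 20) -> str:
    """Crude text meter spanning -60 dBFS .. 0 dBFS."""
    db = max(level_dbfs(samples), -60.0)
    filled = int((db + 60) / 60 * width)
    return "[" + "#" * filled + "-" * (width - filled) + "]"

tone = 0.5 * np.sin(np.linspace(0, 2 * np.pi * 440, 48000))
print(level_dbfs(tone))           # ~ -9 dBFS for a half-amplitude sine
print(meter_bar(np.zeros(1024)))  # all dashes: you're muted
```

A readout like this, refreshed per audio callback, is exactly the "am I being heard" signal the parent comment is asking for.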
We are in the era of three seashells. There is no turning back from this. Soon you won't be able to find the power button for anything tech industry related.
MS Teams has finally changed this on their video calls. Ah, the hours I spent telling colleagues "If you move your mouse around, you should see a black bar appear somewhere near the middle of the screen".
Happy to see ML become mainstream. In the future, I don't think ML will be a separate field of programming. It'll just be "programming," the same way webdev is.
There's a tendency to think of ML as "not programming," or something other than just plain programming. But as the tooling matures, that'll go away.
(Lisp used to be considered "AI programming," till it became useful in many other contexts.)
ML will become a library. It has about as much to do with programming as a compiler. You don't need to know what it does, you just need to know how to make it do things. The problem with ML currently is that nobody really knows how to do things and that you have a million parameters that need tuning and most algorithms need continuous improvement and fine tuning to the use case. There is nothing "mainstream" about ML at this point, except that everyone wants to use it.
In maybe a decade, it might be found in the standard libraries of programming languages, and on top of things like `Math.abs` we will have `ML.textToSpeech("Hello world")`, `ML.isCat(image)`, etc. However, the problem I see with that is that no matter how far we wind the clock forward, we will only be able to put the most simplistic use cases into a library. `ML.isCat()` could be one of those: since most humans can do image categorization, it stands to reason that you could put it into a library. However, most industry applications involve highly customized ML algorithms that are optimized for a very specific use case. So there will always be a need for a research team, in big companies at least. Maybe smaller companies will try to build their stuff by chaining libraries together.
AI is learning existing patterns from input/outputs.
Programming is setting up patterns to turn your inputs into desired outputs. Most often it's just plumbing data around with some transformations.
What you're talking about is using AI as programming tools. It's still programming, but using pre-trained models as part of the plumbing.
I am going to admit that Nvidia Broadcast looks absolutely amazing to me. It's likely to be the reason why my next GPU won't be AMD's new one, even though that one appears to deliver much more bang for the buck.
I already have RTX Voice now and it's the best thing ever.
No, because tech people want software that works, has good UX, etc. This is a PR piece for people that prefer software with cutesy little backgrounds.
I thought the whole point of having a video call is to see who you are talking to, and their environment to further enhance the effectiveness of the conversation.
If you are in your kitchen, or under a tree, I definitely would like to see that because that environment will have an effect on how we communicate.
Sometimes people may not be comfortable sharing their backgrounds, and may not have convenient alternatives. For example, if you have a bed in the background it can be awkward and you might want to blur that out.
I don't bother, but then I live in my own home and my background is an empty study.
I have coworkers who are in house shares with 5 other adults all trying to work from home around tiny desks. Background blur for them is a nice way to hide some of the chaos of their living arrangements.
If the apartment is a mess in general. Table full of empty cans of beer. A dildo on a chair. Your wife randomly walking by in her underwear (not sure whether this would be unblurred?).
In the above scenarios, if I'm not certain there won't be awkward things behind me, I'd want to blur or set a custom background. Sitting with your back against a wall also works, which is what a lot of people seem to be doing.
> In the current version, model inference is executed on the client’s CPU for low power consumption and widest device coverage.
Naively I would think model inference done server-side would have the lower CPU cost (from the client's point of view) and the widest device coverage (the client does nothing extra), so what am I missing?
It is done on the CPU instead of the GPU. The GPU would seem like the natural choice for a convolution-heavy model, but it was not used here for the reasons mentioned.
Some work needs to happen locally to show you a preview of what you're going to transmit, as it should for most video related work.
If the segmentation is done server-side, then you need to sync it back to the sender and reflect it quickly in the preview. That's probably not a great experience, at least for a launch.
I wish my coworkers would stop using background blur.
It sucks and it’s distracting.
Your hair and hands pop in and out of blur. Sometimes part of your face will blur.
I don’t care if your workspace is messy or your kid walks in the room. I do care that we’re all being distracted by your weirdly blurred hair and hands.
Your co-workers have a reasonable expectation of privacy regarding their home life and family members.
Given that many had to start WFH on short notice, meaning they couldn't relocate to circumstances enabling a dedicated home office space, blurry hair and hands are a very reasonable compromise.
I find background blur even more distracting than background replacement. It's like my mind tries to picture the person that I am seeing in a particular environment and blur makes that process messy.
But that's not always true, though; I have seen background replacement all over people's faces (and yes, I seem to be the only one who thinks that's wrong).
I don't think anyone is being distracted by blurred hair or hands. If your coworkers don't feel comfortable even turning on the camera, it shouldn't matter to you. Aside from edge cases like a modelling agency looking for fresh faces, you have zero right to demand how people choose to portray themselves on a VC call.
> Can we get a mute button visible at all times before 2024?
Is it just me, or is the button visible at all times? I could see the button at the bottom of the screen the whole time I used Meet during a session with friends. I even tried it just now to make sure.
They mention SIMD support, but it's unclear to me in what capacity the GPU is leveraged. The hair segmentation example on the MediaPipe webpage suggests it's evaluating the graph on the GPU, though.
The "Rendering Effects" section describes it in some detail: "Once segmentation is complete, we use OpenGL shaders for video processing and effect rendering" and some info on what that covers. (OpenGL parts runs on GPU)
It would be nice if there were a webcam on the market that took actual lenses, so you could get free, legit depth of field. Paying $700 for a used DSLR that has clean HDMI out is not appealing, especially when I have a mirrorless camera from the same company that could probably do the same with a firmware update (that will never come).
I think a cheaper solution would probably just be a depth sensing camera. Even a developer targeted Intel RealSense kit is only like $150. Consumer hardware could be much cheaper I imagine.
Once you have depth information integrated with a camera, then it should be pretty trivial to do background removal.
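Roughly, yes: with a depth map registered to the color frame, the person mask can start as a simple depth band. A sketch (the function, thresholds and toy values are illustrative, not RealSense SDK specifics):

```python
import numpy as np

def person_mask_from_depth(depth_m: np.ndarray, near: float = 0.3,
                           far: float = 1.2) -> np.ndarray:
    """Keep pixels within a plausible 'sitting at the desk' depth band.

    depth_m: depth in meters, 0 where the sensor has no reading.
    """
    valid = depth_m > 0
    return valid & (depth_m >= near) & (depth_m <= far)

depth = np.array([[0.6, 0.6, 3.0],
                  [0.7, 0.0, 3.2]])  # 0.0 = missing reading
print(person_mask_from_depth(depth))
# [[ True  True False]
#  [ True False False]]
```

In practice you would still need hole filling and edge cleanup, since depth sensors are noisy around hair, which is precisely where ML segmentation struggles too.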
Whereas a 35mm f1.8 from Nikon is like $200 and whatever you mount it to is still going to need to do auto focusing and a bunch of other camera-y stuff to make it accessible to non photo geeks and then you’re going to need an off camera microphone so the entire call isn’t listening to your autofocus motor and...
Meet is business oriented and offers features that Hangouts does not, e.g. dialing in via phone. It also requires a G Suite account (or did before COVID, IIRC).
Here's a tip: take a picture of your real, actual background from the POV of your webcam, and set that as your meeting background.
Advantages: it looks natural, it covers whatever is going on behind you (in case you are not alone and people walk by, or if your living room is messy), and it blends better than fake backgrounds (because it's the same image behind it). I have a picture of my office that I use both at home and at my real office, and most people can't tell. And since I took the picture with my phone, which has better resolution, my video feed looks better for cheap.
The single biggest missing feature compared to Zoom for my team is background noise cancellation. It's an unfortunate decision to limit it to Enterprise users.
I was going to point out that xnnpack was basically created by a single guy who also created qnnpack, and how amazing it is for the work of a single guy to have so much impact, then I realized he posted it! Congratz dude!
As in, the blurred background looks totally different (light:dark, shapes, etc.) to the unblurred background.
(I get that they’d need to do something funky to show blurred and unblurred backgrounds with the same foreground video, and faking it is likely easier than doing it programmatically, but this is just odd/sloppy.)
If you have a Windows computer with an RTX graphics card, you can use Nvidia Broadcast to get similar perks. It creates a virtual camera that you can select in whatever conference apps/browsers you are using.
There is some work on OBS plugins to get AI green-screening working, so I hope we will get that on GNU/Linux one day.
The listed CPU usage / elapsed time for the features in this article is obscene. Only 62 FPS means maxing out at least one core for a 60 Hz stream, just to replace/blur a background. Kiss your laptop's battery goodbye. How is this worth it?
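The back-of-envelope math behind that, taking the 62 FPS throughput figure at face value:

```python
# If one frame of segmentation + rendering takes ~1/62 s of CPU time,
# then processing a 60 Hz camera feed keeps roughly one core saturated.
frame_time_ms = 1000 / 62
capture_rate_hz = 60
core_utilisation = capture_rate_hz * frame_time_ms / 1000
print(f"{frame_time_ms:.1f} ms/frame -> {core_utilisation:.0%} of one core")
# 16.1 ms/frame -> 97% of one core
```

That also leaves essentially no per-frame headroom for the encoder and the rest of the call pipeline on that core.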
Why isn't Mediapipe built on gstreamer? Nvidia gets this right. If you're slinging frame buffers around, use an API that there is already an ecosystem for.
A few people commented that the foreground/background detection cannot keep up with movements fast enough. Here's an idea that might help, although I'm not sure if it can realistically be done:
When the video is encoded, the codec does motion estimation (among other things) to reduce the bandwidth required. So why don't we use the motion vectors from the video codec to modify the foreground/background mask in real time? Obviously this is going to create weird artifacts pretty soon, but it might just be good enough for a few frames before the ML model produces another accurate mask.
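For what it's worth, the mask-warping half of this idea fits in a few lines. This sketch fakes the motion field as one integer (dy, dx) vector per 16x16 block; real codec motion vectors are chosen to minimize bitrate rather than to track true motion, which is one reason this is harder than it looks:

```python
import numpy as np

def warp_mask(mask: np.ndarray, motion: np.ndarray, block: int = 16) -> np.ndarray:
    """Backward-warp a segmentation mask using per-block motion vectors.

    mask:   (H, W) mask from the last full ML inference.
    motion: (H//block, W//block, 2) integer (dy, dx) per block, i.e. how far
            that block's content has moved since the mask was computed.
    """
    h, w = mask.shape
    out = np.zeros_like(mask)
    for by in range(motion.shape[0]):
        for bx in range(motion.shape[1]):
            dy, dx = motion[by, bx]
            for i in range(block):
                for j in range(block):
                    y, x = by * block + i, bx * block + j
                    sy, sx = y - dy, x - dx  # where this pixel came from
                    if 0 <= sy < h and 0 <= sx < w:
                        out[y, x] = mask[sy, sx]
    return out

# The person occupied the top-left 16x16 at the last inference, then
# everything moved down by 4 pixels: the warped mask follows along.
mask = np.zeros((32, 32))
mask[:16, :16] = 1.0
motion = np.zeros((2, 2, 2), dtype=int)
motion[:, :, 0] = 4  # dy = 4 everywhere
warped = warp_mask(mask, motion)
print(int(warped.sum()))  # 256 foreground pixels, now in rows 4..19
```

In practice browsers don't expose the encoder's motion vectors, so a cheap optical-flow estimate between downscaled frames would likely have to stand in for them.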
I have observed in the last couple months that whenever I create a Google Calendar invite with others, Google has started inserting a Google Meet conference as the location to meet.
It was one thing to ask/offer this as an option if you'd like to use it, but now Google is positioning it as if you had chosen that. So if you left it empty, because you usually use some other understood method with your friends/colleagues, now your participants are confused and think you wanted to use Google Meet.
I think that's going too far to get people to adopt your product.
I noticed this too, but I actually got a tooltip popup that notes this is something that can be disabled in the calendar settings. The specific checkbox is "Automatically add Google Meet video conferences to events I create"
Disclaimer: I work at Google but not on these products.
Edit: it seems the tooltip only appears the first time you try to add Meet. After that it doesn't appear and you have to go into settings.
It's funny how Google pours time into things like this, but the last person I know who used a Google chat product just stopped, because it's less reliable than Zoom. Losing 15 minutes with someone trying to get the sound working counts for more than a gimmick many people never notice. Not to mention that even normal people now don't want to install yet another app, because they expect it to be cancelled soon.
k__|5 years ago
Meet was much worse than Zoom, even when I take the bad web interface of Zoom into account.
I ain't a fan of either, though.
vosper|5 years ago
Not that you should have to install an extension to get basic UX
helpfulgoogler|5 years ago
tjpnz|5 years ago
Perhaps I'm missing something obvious (or a Chrome plugin that will allow me to mute based on the page URL rather than site). In the unlikely event that a Googler is reading this I'm not asking for yet another product or complicated new piece of functionality aimed at this specific use case. Just a mute button for audio. Thanks!
sundvor|5 years ago
It renders a big cross through the microphone when muted.
Simple, yet insanely effective UI (#).
Best thing ever.
#) Especially when compared to the mess that is Google Meet. My favourite "feature" of theirs is that when someone is presenting, it's impossible to view the presentation as just another stream; no, they have to make it dominate everything, which makes it hard to see the other team members.
It can also be extremely hard to see who's talking when viewing a lot of cameras at the same time. And for whatever reason the quality too often degrades into a blurry mess, a far cry from 720p (and I have fibre internet).
amf12|5 years ago
himinlomax|5 years ago
And if you need minimalism, offer a toggle for that. But I think most people should have it forced on them; it would save everyone a lot of trouble. Just think about all the aggregate time users lose talking into a muted mic.
leeoniya|5 years ago
nurettin|5 years ago
rplnt|5 years ago
But you will hit a dog probably, because the steering wheel suddenly blocks your view too.
three_seagrass|5 years ago
When I leave a meeting, can you please stop asking me for feedback every time and just take me back to the main meet screen?
It would be so easy just to put that small dialogue box on the main meet screen rather than prompt me to click the button to return.
wdr1|5 years ago
MarkyC4|5 years ago
chedabob|5 years ago
howlgarnish|5 years ago
Doesn't excuse the UI, but at least this lets you avoid using it!
gogopuppygogo|5 years ago
I bought an external microphone for my laptop with a hardware mute button.
on_and_off|5 years ago
eugmill|5 years ago
I still can't stand the bottom popping up and down and not being able to tell if I'm muted.
Angostura|5 years ago
swiley|5 years ago
sillysaurusx|5 years ago
There's a tendency to think of ML as "not programming," or something other than just plain programming. But as the tooling matures, that'll go away.
(Lisp used to be considered "AI programming," till it became useful in many other contexts.)
sltEvas|5 years ago
In maybe a decade, it might be found in the standard libraries of programming languages, and on top of things like `Math.abs` we will have `ML.textToSpeech("Hello world")`, `ML.isCat(image)`, etc. The problem I see with that is that no matter how far we wind the clock forward, we will only be able to put the most simplistic use cases into a library. `ML.isCat()` could be one of those: since most humans can do image categorization, it stands to reason that you could put it into a library. However, most industry applications involve highly customized ML algorithms that are optimized for a very specific use case. So there will always be a need for a research team, at least in big companies. Maybe smaller companies will try to build their stuff by chaining libraries together.
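A toy sketch of what such a stdlib facade might look like; everything here is invented (there is no real `ML` module), and the registered "classifier" is a stand-in lambda where a real implementation would lazily load pre-trained weights shipped with the runtime:

```python
class ML:
    """Hypothetical stdlib-style facade over bundled pre-trained models."""

    # In this sketch a 'model' is just a callable keyed by name.
    _models = {}

    @classmethod
    def register(cls, name, model):
        cls._models[name] = model

    @classmethod
    def isCat(cls, image):
        # Delegates to whichever image classifier was registered.
        return cls._models["cat_classifier"](image)


# Stand-in classifier for the sketch: "images" are caption strings.
# A real one would be a neural net operating on pixel data.
ML.register("cat_classifier", lambda image: "cat" in image)
```

The registration hook is the interesting design question: a fixed `ML.isCat` covers only the generic case, which is exactly the point above about custom industry models not fitting a standard library.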
m00x|5 years ago
What you're talking about is using AI as programming tools. It's still programming, but using pre-trained models as part of the plumbing.
ilikeerp|5 years ago
[deleted]
kerng|5 years ago
Anyone who uses the blur realizes that it's far behind other offerings in quality, and the Google Meet UI is very bad too.
Zoom, Teams, even WebEx are superior quality- and usability-wise.
lima|5 years ago
Zoom's web client is particularly terrible, and we can't install the desktop client for security reasons.
And the new background noise cancellation feature is magic.
rplnt|5 years ago
Out of these I'm really surprised by how "not as horrible" MS Teams is. Loads of functionality, and the UX is bearable.
sundvor|5 years ago
I already have RTX Voice now and it's the best thing ever.
https://www.nvidia.com/en-au/geforce/news/nvidia-broadcast-a...
thinkloop|5 years ago
Are they able to change the bg in the browser?
obilgic|5 years ago
toper-centage|5 years ago
loosescrews|5 years ago
spurgu|5 years ago
Jitsi also has background blur but it's only ok-ish on Chrome and unusably slow on Firefox.
mike_kamau|5 years ago
I thought the whole point of having a video call is to see who you are talking to, and their environment to further enhance the effectiveness of the conversation.
If you are in your kitchen, or under a tree, I definitely would like to see that because that environment will have an effect on how we communicate.
gerbler|5 years ago
adwww|5 years ago
I have coworkers who are in house shares with 5 other adults all trying to work from home around tiny desks. Background blur for them is a nice way to hide some of the chaos of their living arrangements.
spurgu|5 years ago
In the above scenarios, if I'm not certain there won't be awkward things behind me, I'd want to blur or set a custom background. Sitting with your back against a wall also works, which is what a lot of people seem to be doing.
hrktb|5 years ago
> In the current version, model inference is executed on the client’s CPU for low power consumption and widest device coverage.
Naively I would think server-side model inference would give lower CPU usage (from the client's point of view) and the widest device coverage (the client does nothing extra). What am I missing?
jonex|5 years ago
Orphis|5 years ago
If the segmentation is done server-side, then you need to sync it to the sender and reflect that quickly in the preview. It's probably not a great experience, at least for a launch.
blauditore|5 years ago
nostromo|5 years ago
It sucks and it’s distracting.
Your hair and hands pop in and out of blur. Sometimes part of your face will blur.
I don’t care if your workspace is messy or your kid walks in the room. I do care that we’re all being distracted by your weirdly blurred hair and hands.
janekm|5 years ago
Given that many had to start WFH on short notice, meaning they couldn't relocate to circumstances enabling a dedicated home office space, blurry hair and hands are a very reasonable compromise.
josalhor|5 years ago
But that's not always true though; I have seen background replacement bleed all over people's faces (and yes, I seem to be the only one who thinks that's wrong).
tziki|5 years ago
hota_mazi|5 years ago
Can we get a mute button visible at all times before 2024?
amf12|5 years ago
Is it just me, or is the button visible at all times? The button stayed visible at the bottom of the screen the whole time I used Meet during a session with friends. I even tried it again just now to make sure.
arketyp|5 years ago
jonex|5 years ago
jcims|5 years ago
nucleardog|5 years ago
Once you have depth information integrated with a camera, then it should be pretty trivial to do background removal.
Whereas a 35mm f/1.8 from Nikon is like $200, and whatever you mount it to still needs to do autofocusing and a bunch of other camera-y stuff to make it accessible to non photo geeks, and then you need an off-camera microphone so the entire call isn't listening to your autofocus motor, and...
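The depth-camera approach really does reduce segmentation to a threshold. A minimal numpy sketch (the depth units and the 1.5 m cutoff are made-up assumptions; a real pipeline would also feather the mask edges):

```python
import numpy as np

def remove_background(frame, depth, max_depth_m=1.5):
    """Keep pixels closer than max_depth_m; black out the rest.

    frame: (H, W, 3) uint8 image.
    depth: (H, W) float32 per-pixel distance in metres.
    Returns the masked frame and the boolean foreground mask.
    """
    mask = depth < max_depth_m  # True where the subject is
    out = frame.copy()
    out[~mask] = 0              # blank the background pixels
    return out, mask
```

Compare that one comparison per pixel with running a neural segmentation model per frame; that is why depth sensors make background removal "pretty trivial".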
chdjakdkgb|5 years ago
samtheprogram|5 years ago
hoveringhen|5 years ago
easytiger|5 years ago
senectus1|5 years ago
We're being herded into the new more useless products.
vinhboy|5 years ago
I also think it makes the subject look better for some reason.
probably_wrong|5 years ago
Advantages: it looks natural, it covers whatever is going on behind you (in case you are not alone and people walk by, or if your living room is messy), and it blends better than fake backgrounds (because it's the same image behind it). I have a picture of my office that I use both at home and at my real office, and most people can't tell. And since I took the picture with my phone, which has better resolution, my video feed looks better for cheap.
amq|5 years ago
adioe3|5 years ago
sercand|5 years ago
Nimitz14|5 years ago
mft_|5 years ago
https://1.bp.blogspot.com/-viEA4OY0sxA/X5s7IBwoXOI/AAAAAAAAG...
As in, the blurred background looks totally different (light:dark, shapes, etc.) to the unblurred background.
(I get that they’d need to do something funky to show blurred and unblurred backgrounds with the same foreground video, and faking it is likely easier than doing it programmatically, but this is just odd/sloppy.)
germandude123|5 years ago
The right clip is an example of background replacement.
This is why the blurred background on the left does not look anything like the unblurred background on the right.
Jyaif|5 years ago
kmisiunas|5 years ago
rkagerer|5 years ago
Although there's a lot of blurring on the shoulder of the guy at the beach: https://i.imgur.com/D5ueGUh.png
wdroz|5 years ago
There are some works on OBS to get the green screen AI working, so I hope we will get that on GNU/Linux one day.
kevingadd|5 years ago
lern_too_spel|5 years ago
Liskni_si|5 years ago
When the video is encoded, the codec does motion estimation (among other things) to reduce the bandwidth required. So why don't we use the motion vectors from the video codec to modify the foreground/background mask in real time? Obviously this is going to create weird artifacts pretty soon, but it might just be good enough for a few frames before the ML model produces another accurate mask.
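A rough sketch of that propagation idea, assuming per-block motion vectors are somehow exposed by the encoder (the 16x16 block size and the (dy, dx) vector format are assumptions, not any real codec API):

```python
import numpy as np

def warp_mask(mask, motion_vectors, block=16):
    """Shift each block of a foreground mask along its motion vector.

    mask: (H, W) bool foreground mask from the last ML inference.
    motion_vectors: (H//block, W//block, 2) ints, (dy, dx) per block,
    giving how far the block moved since the previous frame.
    Uncovered regions simply keep their old mask value in this sketch.
    """
    h, w = mask.shape
    out = mask.copy()
    for by in range(h // block):
        for bx in range(w // block):
            dy, dx = motion_vectors[by, bx]
            y0, x0 = by * block, bx * block
            sy, sx = y0 - dy, x0 - dx  # where this block came from
            if 0 <= sy <= h - block and 0 <= sx <= w - block:
                out[y0:y0 + block, x0:x0 + block] = \
                    mask[sy:sy + block, sx:sx + block]
    return out
```

Block-granular warping is exactly why the artifacts would appear quickly: mask edges only move in 16-pixel steps between ML inferences, so it only buys a few frames of cheap interpolation.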
supernova87a|5 years ago
I have observed in the last couple months that whenever I create a Google Calendar invite with others, Google has started inserting a Google Meet conference as the location to meet.
It was one thing to ask/offer this as an option if you'd like to use it, but now Google is positioning it as if you had chosen that. So if you left it empty, because you usually use some other understood method with your friends/colleagues, now your participants are confused and think you wanted to use Google Meet.
I think that's going too far to get people to adopt your product.
hongalex|5 years ago
Disclaimer: I work at Google but not on these products.
Edit: it seems the tooltip only appears the first time you try to add Meet. After that it doesn't appear and you have to go into settings.
fx32s|5 years ago
daxfohl|5 years ago
madeofpalk|5 years ago
alblue|5 years ago
mdoms|5 years ago
yjftsjthsd-h|5 years ago
The_rationalist|5 years ago
387032228|5 years ago
[deleted]
consolelog2000|5 years ago
[deleted]
consolelog2000|5 years ago
[deleted]
ZephyrBlu|5 years ago
[deleted]
ibuildthings|5 years ago
A personal anecdote is that a few years back the automatic door sensors in my university did not work on my skin tone.
mdoms|5 years ago
dharma1|5 years ago
acdha|5 years ago
sjs7007|5 years ago