item 39017607

Web AI Model Testing: WebGPU, WebGL, and Headless Chrome

199 points | kaycebasques | 2 years ago | developer.chrome.com

46 comments

[+] refulgentis|2 years ago|reply
Real but naive question: does TensorFlow have meaningful share outside Google? I've been in the HuggingFace ecosystem and it's overwhelmingly PyTorch, IIRC around 93% (I can't find the blog post that said it, but I only gave it two minutes of searching).
[+] hatthew|2 years ago|reply
TF used to be the most popular framework by a large margin, so a lot of things that were started 5+ years ago are still on it. PyTorch is most popular in places that only started more recently or have the ability to switch easily, e.g. new startups, research, LLMs, education, and companies that have the resources to do a migration project.
[+] summerlight|2 years ago|reply
A fun thing is that even inside Google, JAX is now preferred by researchers and is slowly taking over share.
[+] nwoli|2 years ago|reply
Best alternative for the web IMO (perf generally beats ONNX on the web).
[+] tcper|2 years ago|reply
Everything will run on a browser, eventually
[+] LoveMortuus|2 years ago|reply
Since browsers work as a sort of unified experience (to some extent), that would be quite good. But sadly, I haven't seen wide adoption of PWAs or similar technology. Most companies just create their own app, which in many cases isn't even needed, since the app is just a wrapped version of their website.
[+] rubatuga|2 years ago|reply
Hopefully this will solve some of the incompatibility with training models on AMD vs NVIDIA. Just use Google Chrome.
[+] bhakunikaran|2 years ago|reply
A question that comes to mind is: How significant is the performance difference between using CPUs and GPUs for these machine learning models in web applications, and are there specific types of applications where one would significantly outperform the other?
[+] FL33TW00D|2 years ago|reply
Very significant in the current paradigm.
[+] FL33TW00D|2 years ago|reply
This can also be done in Rust using the excellent `wasm_bindgen_test`!
[+] not_a_dane|2 years ago|reply
AFAIK, there is still a memory limit in Chrome, set to 4 GB per tab.
[+] jmayes|2 years ago|reply
Hello there, I am one of the authors of the piece. Fun fact: just for the lols, we tried running a 1.3B-parameter unoptimized TensorFlow.js model in this system to see if it would work, and it does (it could be much more memory efficient with tweaks). It uses about 6 GB RAM and 14 GB VRAM on a V100 GPU on Colab (15 GB VRAM limit), but runs pretty fast once the initial load is complete. Obviously there's plenty of room to make this use much less memory in the future; we just wanted to check we could run such things as a test for now.
[+] jsheard|2 years ago|reply
At least on desktop you generally know where the line is. On mobile there's a mystery limit you're not allowed to cross, and you're also not allowed to know where the line is until you reach it, at which point you might get a graceful error or your tab might be force-killed, and you're not allowed to know which of those will happen either.
[+] abxytg|2 years ago|reply
I hate it so much. So arbitrary and capricious. I would say this is currently the number one blocker for the web as a serious platform. And they're doing it on purpose.
[+] FL33TW00D|2 years ago|reply
This is a 7B parameter model at int4, lots to play with!
[+] sylware|2 years ago|reply
Isn't that exactly the modern, AI based, mouse and keyboard BOT? (trained with click farms)
[+] lxe|2 years ago|reply
I think better SIMD support for WebAssembly is more inclusive than relying on / expecting WebGPU.
[+] jmayes|2 years ago|reply
For this blog post we are using Chrome as the testing environment, which now has WebGPU turned on by default, and other common browsers should hopefully follow suit. Given we are using Chrome here, we know WebGPU will be available if the web AI app is using it (which many people are turning to for diffusion models and LLMs, as it's so much faster to run those types of models).

But yes, I am all for better support across the board. We have many WASM users as well, and when anything new comes out there, this set of instructions can still be used to test that too, as it's essentially just Chrome running on Linux with the right flags set.

[+] NavinF|2 years ago|reply
CPU inference is 10x slower. Not good enough for most use cases