> By keeping computation and memory on a single wafer-scale processor, we eliminate the data-movement penalties that dominate GPU systems. The result is up to 15× faster inference, without sacrificing model size or accuracy.
I hope all AI will reach 300ms response times, including for 200-line diffs. Querying a million rows, or telling the user that a codebase is wrong, used to take minutes but now happens instantly.
Alifatisk|1 month ago
https://xcancel.com/andrewdfeldman/status/201154226777402186...
kingstnap|1 month ago
Guessing the plan might be voice AI. That stuff needs to be really snappy.