trsohmers | 2 years ago
My professional independent observer opinion (not based on my 2 years of working at Groq) would have me assume that their COGS to achieve these performance numbers exceeds several million dollars. Depreciating that over expected usage at the theoretical prices they have posted seems impractical, so from an actual performance-per-dollar standpoint they don't seem viable. They do, however, have a very cool demo of an insane level of performance if you throw cost concerns out the window.
[0]: https://www.nextplatform.com/2023/11/27/groq-says-it-can-dep...
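The depreciation argument above can be sketched as a back-of-the-envelope calculation. All the numbers below (hardware cost, throughput, depreciation window, utilization) are illustrative assumptions for the sake of the arithmetic, not Groq's actual figures:

```python
def breakeven_price_per_million_tokens(
    hardware_cost_usd: float,   # assumed COGS of the deployment
    tokens_per_second: float,   # assumed sustained throughput
    lifetime_years: float,      # depreciation window
    utilization: float,         # fraction of time actually serving traffic
) -> float:
    """Price per 1M tokens needed just to recover the hardware cost."""
    serving_seconds = lifetime_years * 365 * 24 * 3600 * utilization
    total_tokens = tokens_per_second * serving_seconds
    return hardware_cost_usd / total_tokens * 1_000_000

# Example: $5M of hardware, 500 tok/s sustained, 3-year depreciation,
# 70% utilization (all hypothetical).
price = breakeven_price_per_million_tokens(5_000_000, 500, 3.0, 0.7)
print(f"${price:.2f} per 1M tokens")
```

If the break-even price that falls out is far above posted per-token market rates, the deployment can't recover its hardware cost at those prices, which is the core of the objection.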
tome | 2 years ago
Anyone with a serious interest in the total cost of ownership of Groq's system is welcome to email contact@groq.com.
trsohmers | 2 years ago
A guarantee to match the cheapest per-token prices is a sure way to lose a race to the bottom, but I do wish Groq (and everyone else trying to compete against NVIDIA) the greatest luck and success. Groq's single-batch/single-user performance makes a great demo, but it isn't the best solution for a wide variety of applications; I hope it can find its niche.
Aeolun | 2 years ago
John Doe and his friends will never need their fart jokes generated at this speed; they're more interested in low costs.
But we've recently been doing call center operations, and quickly figuring out what someone said was a major issue. You don't want your system to wait a full second before responding each time. I can imagine this making sense there if it cuts latency to 10ms as well, though you might still run up against the 'good enough' factor.
I guess few people want to spend millions to go from 1000ms to 10ms, but those who do really want it.
nickpsecurity | 2 years ago
It was also on my list of things to consider modifying for an AI accelerator. :)
trsohmers | 2 years ago
There should be a podcast release (https://microarch.club/) in the near future that covers REX's history and a lot of lessons learned.