> Taalas’ silicon Llama achieves 17K tokens/sec per user, nearly 10X faster than the current state of the art, while costing 20X less to build, and consuming 10X less power.
Am I reading this right: 10x faster and 10x less power, i.e. 100x more power-efficient?
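For what it's worth, here is the back-of-envelope arithmetic as a small sketch. It assumes both 10x figures are measured against the same baseline and that the speedup applies to total throughput, which the quote does not confirm (it gives speed "per user"). The baseline wattage and token rate are made-up placeholders that cancel out of the ratio.

```python
def joules_per_token(power_watts: float, tokens_per_sec: float) -> float:
    """Energy spent per generated token: power divided by throughput."""
    return power_watts / tokens_per_sec


def efficiency_gain(speedup: float, power_reduction: float,
                    base_power_watts: float = 1000.0,  # hypothetical baseline
                    base_tokens_per_sec: float = 1700.0) -> float:  # hypothetical baseline
    """Factor by which energy per token drops under the claimed improvements."""
    baseline = joules_per_token(base_power_watts, base_tokens_per_sec)
    claimed = joules_per_token(base_power_watts / power_reduction,
                               base_tokens_per_sec * speedup)
    return baseline / claimed


if __name__ == "__main__":
    # The two factors multiply: speedup * power_reduction.
    print(round(efficiency_gain(10, 10)))
```

Under those assumptions the factors simply multiply, so the reading in the question holds. If the 10x speed is per-user latency rather than aggregate throughput, the energy-per-token gain could be different.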