Maybe make that intelligence per token per relative unit of hardware per watt. If you're burning 30 tons of coal to be 0.0000000001% better than the 5 tons of coal option because you're throwing more hardware at it, well, it's not much of a real improvement.
estsauver|5 days ago
But to be clear, 1000 tokens/second is WAY better. Anthropic's Haiku serves at ~50 tokens per second.