item 47051124 (no title) _ink_ | 12 days ago Interesting. Is it because they can or is it really more expensive for them to process bigger context?
cube2222|12 days ago Attention is, at its core, quadratic wrt context length. So I'd believe that to be the case, yeah.
pkaye|12 days ago I've read that compute costs for LLMs go up O(n^2) with context window size. But I think it is also a combination of limited compute availability, users' preference for Anthropic models, and Anthropic planning to go IPO.
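The quadratic cost the commenters mention comes from the attention scores matrix: every token attends to every other token, so for a sequence of length n the scores matrix has n x n entries. A minimal NumPy sketch (illustrative only, not any provider's actual implementation) makes the scaling visible:

```python
import numpy as np

def naive_attention(Q, K, V):
    """Single-head scaled dot-product attention over n tokens of dim d."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # shape (n, n): this is the quadratic term
    # Numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # shape (n, d)

# Doubling the context length quadruples the scores-matrix work:
d = 64
rng = np.random.default_rng(0)
for n in (1024, 2048):
    Q = K = V = rng.standard_normal((n, d))
    out = naive_attention(Q, K, V)
    print(f"n={n}: output {out.shape}, scores entries {n * n}")
```

Going from a 1k to a 2k context quadruples the n x n term, which is why longer-context requests are disproportionately expensive for naive attention (techniques like FlashAttention reduce memory traffic but the compute is still O(n^2)).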