(no title)
hugmynutus | 10 months ago
The original link [1] cites a discussion of the cost per query of GPT-4o at 0.3 whr [2]. When you read the document [2] itself, you see 0.3 whr is the lower bound and 40 whr is the upper bound. The paper [2] is actually pretty solid; I recommend it. It uses public metrics from other LLM APIs to derive a likely distribution of context sizes for the average GPT-4o query, which is a reasonable approach given that the data isn't public, then factors in GPU power per FLOP, average utilization during inference, and cloud/renting overhead. It admits this likely has non-trivial error bars, concluding the average is between 1 and 4 whr per query.
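For intuition, here is a rough back-of-envelope sketch of that estimation approach (FLOPs per query → GPU energy → overhead). Every input value is an illustrative placeholder I'm assuming for the example, not a figure from [2]:

```python
# Back-of-envelope: energy per query from compute, hardware efficiency,
# utilization, and datacenter/cloud overhead. All inputs are placeholders.

def wh_per_query(flops_per_token: float,
                 tokens_per_query: float,
                 peak_flops_per_watt: float,
                 utilization: float,
                 overhead: float) -> float:
    """Joules = total FLOPs / (effective FLOPs per joule), scaled by
    overhead, then converted to watt-hours."""
    total_flops = flops_per_token * tokens_per_query
    effective_flops_per_joule = peak_flops_per_watt * utilization
    joules = (total_flops / effective_flops_per_joule) * overhead
    return joules / 3600.0

# Hypothetical inputs: ~2e11 FLOPs/token (2 x ~100B active params),
# 750 tokens per query, ~1e12 FLOP/s per watt at peak, 10% utilization,
# 1.5x overhead (PUE + hosting margin).
print(wh_per_query(2e11, 750, 1e12, 0.10, 1.5))  # ~0.6 whr with these inputs
```

The wide 0.3-40 whr bounds in [2] come mostly from how uncertain inputs like context length and utilization are.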
This is disappointing to me, as the original link [1] brings in this source [2] to disprove the 3 whr "myth" created by another paper [3], yet that 3 whr figure lies squarely within the error bars the new source [2] arrives at.
Links:
1. https://simonwillison.net/2025/Apr/29/chatgpt-is-not-bad-for...
2. https://epoch.ai/gradient-updates/how-much-energy-does-chatg...
3. https://www.sciencedirect.com/science/article/pii/S254243512...
Edit: whr not w/hr
Retric | 10 months ago
Thus the results inherently fail to analyze the underlying question.
A more realistic estimate is to take their total spending and assume X% of their expenses go to electricity, directly or indirectly, because bottom-up per-query figures don't add up to the full environmental impact. Even that ignores the energy costs on 3rd-party servers when they download their training data.
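As a minimal sketch of that top-down approach, keeping X as a free parameter (every number in the example call is purely illustrative, not a published figure):

```python
# Top-down: allocate an assumed electricity share of total spending, convert
# to kWh at an assumed electricity price, divide by query volume.

def wh_per_query_top_down(total_spend_usd: float,
                          electricity_share: float,  # the "X%" above, unknown
                          usd_per_kwh: float,
                          queries: float) -> float:
    kwh = (total_spend_usd * electricity_share) / usd_per_kwh
    return kwh * 1000.0 / queries  # watt-hours per query

# Purely illustrative: $5B spend, X = 10%, $0.10/kWh, 365B queries in the same period.
print(wh_per_query_top_down(5e9, 0.10, 0.10, 3.65e11))  # ~13.7 whr with these inputs
```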