A casual Twitter reply that turned into a news article that turned into an "X wants Y" headline is exactly why I stopped trusting most of social media as a source of information.
From the article: "the impacts of generating a 100-word email. They found that just one email requires .14 kilowatt-hours worth of electricity, or enough to power 14 LED lights for an hour"
That seems completely off the charts. A 70B model on my M3 Max laptop does it for about 0.001 kWh... 140x less than stated in the article. Maybe OpenAI's Nvidia clusters are less energy-efficient than my MacBook... but I'm not even sure about that.
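(A minimal sanity check of that 0.001 kWh figure, sketched in Python; the laptop's incremental power draw and generation speed are assumptions, not measurements:)

    # Energy for a ~100-word email from a local 70B model.
    power_w = 60        # extra draw while generating, watts (assumption)
    tokens = 133        # ~100 words
    tok_per_s = 10      # 70B generation speed on an M3 Max (assumption)

    kwh = power_w * (tokens / tok_per_s) / 3_600_000  # joules -> kWh
    print(f"{kwh:.5f} kWh")  # ~0.00022 kWh, same order as the 0.001 kWh claim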
One can also work backwards to see what kind of compute hardware they think the models must need, or how much they think OpenAI pays for electricity.
100 words is ~133 tokens, so 0.14 kWh/133 tokens is about 1 kWh per kilo-token. If the electricity all came from record-cheap PV at $0.01/kWh, that alone would set a price floor of $10/mega-token; at a more realistic (but still cheap) $0.05/kWh, it's $50/mega-token. Here's the current price sheet: https://platform.openai.com/docs/pricing
Generating a 133-token email in, say, 5 seconds while consuming 0.14 kWh works out to a continuous draw of about 101 kW. That does not seem like a plausible number (caveat: I don't work in a data centre, and what I think is implausible may just be wrong): https://www.wolframalpha.com/input?i=0.14+kWh+%2F+5+seconds+...
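(The same arithmetic as a Python sketch, using only the numbers quoted above:)

    kwh_per_email = 0.14
    tokens = 133                        # ~100 words

    kwh_per_token = kwh_per_email / tokens
    print(kwh_per_token * 1000)         # ~1.05 kWh per kilo-token

    # Electricity-only price floor at two power prices.
    for usd_per_kwh in (0.01, 0.05):    # record-cheap PV vs. still-cheap grid
        floor = kwh_per_token * usd_per_kwh * 1_000_000
        print(f"${floor:.1f} per mega-token at ${usd_per_kwh}/kWh")
    # -> $10.5/Mtok and $52.6/Mtok: the ~$10 and ~$50 floors above

    # Implied continuous power draw for a 5-second generation.
    print(f"{kwh_per_email / (5 / 3600):.0f} kW")  # ~101 kW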
For reference, a single NVIDIA H200 card has a TDP of 700 watts. Considering all the middlemen you put between you and the model, 0.14 kWh doesn't look too outrageous to me, because you add processors, high-speed interconnects, tons of cooling, etc. into the mix. Plus the models run in these datacenters are way bigger.
For "fathomability" case, the network cables (fibers in fact) you use in that datacenters carries 800gbps, and the fiber-copper interface converters at each end heats up to uncomfortable levels. You have thousands of these just converting packets to light and vice versa. I'm not adding the power consumption of the switches, servers, cooling infra, etc. into the mix.
Yes, water cooling is more efficient than air cooling, but when a server is drawing 6 kW of power (8x Tesla cards, plus processors, plus the rest of the system), nothing is as efficient as a local model you hit on your own computer.
Disclosure: Sitting on top of a datacenter.
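(To put that 6 kW figure against the article's 0.14 kWh/email claim: a server's draw is shared across concurrent requests, so per-request energy depends heavily on batching. A rough sketch; the PUE and batch size below are pure assumptions:)

    server_kw = 6.0      # whole 8x-GPU box, from the comment above
    pue = 1.5            # datacenter overhead multiplier (assumption)
    batch = 32           # concurrent requests per server (assumption)
    seconds = 5          # time to generate the 133-token email

    kwh_per_email = server_kw * pue * (seconds / 3600) / batch
    print(f"{kwh_per_email:.5f} kWh")  # ~0.00039 kWh per email

    # Even unbatched, 9 kW needs ~56 s of exclusive use to reach 0.14 kWh:
    print(0.14 / (server_kw * pue) * 3600)  # ~56 seconds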