top | item 41578344


Apfel | 1 year ago

Is this true in a marginal cost sense? I was under the impression most of the environmental impact occurred during the training stage, and that it was significantly less costly post training?


jeroenhd|1 year ago

You could argue that this is no longer the case once the model is done; the cost per request will go down over time, as the set amount of power and coolant pumped through data centres gets divided over more people.

However, AI companies can't afford to stand still. They have to keep training or they risk being made irrelevant by whatever AI company comes next.

Furthermore, a not-insignificant amount of energy and cooling is being used for generating responses as well. It's plainly obvious how much power these things take when you run even very modest AI models at home.

The paper[1] mentions the statistics used to calculate these numbers. It has a separate column for inference, with numbers ranging from 10mL to 50mL of water per inference depending on the data centre sampled.
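The paper's per-inference range scales up quickly. A minimal sketch of that multiplication, where the 10-50 mL figures come from the paper but the daily query volume is a made-up illustrative assumption:

```python
# Scale the paper's per-inference water figures to a daily total.
# 10-50 mL/inference is from the paper; queries_per_day is hypothetical.
ml_low, ml_high = 10, 50
queries_per_day = 100_000_000  # assumed, for illustration only

litres_low = ml_low * queries_per_day / 1000
litres_high = ml_high * queries_per_day / 1000
print(f"{litres_low:,.0f} to {litres_high:,.0f} litres of water per day")
```

At that assumed volume, the range spans one to five million litres per day, which is why the sampled data centre matters so much.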

The numbers seem bad, but the authors also call out that more transparency is needed. With all the bad press from independent estimates and no AI company publishing detailed environmental impact data, I have to assume the real cost is worse than estimated, or companies would have tried to greenwash themselves already.

[1] https://arxiv.org/pdf/2304.03271

guitarlimeo|1 year ago

> It's plainly obvious when you run even the very modest AI models at home how much power these things take.

Really good point to put this into perspective. I tried models locally and my GPU was running red hot. Granted, server cards like the H100 are more optimized for AI workloads, so they run more efficiently than consumer GPUs, but I don't believe they are more than an order of magnitude more efficient.

gcr|1 year ago

Another corollary is that AI companies don’t train one model at a time. Typical engineers will have maybe 5-10 models training at once. Large hyperparameter grid searches might have hundreds or thousands. Most of these will turn out to be duds. Only one model gets released, and that one’s energy efficiency is what’s reported.

guitarlimeo|1 year ago

The trend is towards more inference compute (o1), so post-training costs will increase as they scale that too.

QuadmasterXLII|1 year ago

Llama 405B takes OOM a kilowatt-minute to respond on our local GPU server, or about 10 grams of CO2 per email. Last I checked, add another 20 grams of amortized manufacturing emissions. A typical commute is OOM 5-10 kg of CO2.
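The arithmetic behind the ~10 g figure can be sketched in a few lines. The grid carbon intensity below is an assumed world-average value (~0.5 kg CO2/kWh), not something stated in the comment; the rest follows its numbers:

```python
# Back-of-envelope: one kilowatt-minute of inference -> grams of CO2.
kwh_per_response = 1 / 60        # a kilowatt-minute, per the comment
kg_co2_per_kwh = 0.5             # assumed grid-average carbon intensity

operational_g = kwh_per_response * kg_co2_per_kwh * 1000
amortized_g = 20                 # comment's amortized manufacturing estimate
total_g = operational_g + amortized_g
print(f"~{operational_g:.0f} g operational, ~{total_g:.0f} g total per response")
```

Under that assumed intensity the operational figure comes out around 8 g, consistent with the comment's "about 10 grams".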

This article is alarmist bullshit. (For entirely unrelated reasons, OpenAI delenda est.)

gcr|1 year ago

So you can double your commute's environmental impact by using llama 1000x per day?
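That scaling, using only the parent's ~10 g operational figure (the amortized 20 g is left out here, since doubling a 5-10 kg commute lines up with the operational number alone):

```python
# Scale the parent's ~10 g-per-response figure to heavy daily use.
g_per_response = 10                      # operational CO2, parent comment
responses_per_day = 1000

daily_kg = g_per_response * responses_per_day / 1000
commute_kg_low, commute_kg_high = 5, 10  # parent's commute range
print(f"~{daily_kg:.0f} kg/day vs a {commute_kg_low}-{commute_kg_high} kg commute")
```

1000 responses a day lands at roughly 10 kg, i.e. on the order of the commute itself.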

That sounds pretty bad still, no?