I've seen the Microsoft Aurora team make a compelling argument that weather is an interesting contradiction of the AI-energy-waste narrative. Once deployed at scale, inference with these models is actually a sizable energy/compute improvement over classical simulation and forecasting methods. Of course it is energy intensive to train the model, but the usage itself is more energy efficient.
gwern|2 months ago
And one of the biggest ironies of AI scaling is that where scaling succeeds the most in improving efficiency, we realize it the least, because we don't even think of it as an option. An example: a Transformer (or RNN) is not the only way to predict text. We have scaling laws for n-grams and text perplexity (most famously, from Jeff Dean et al at Google back in the 2000s), so you can actually ask the question, 'how much would I have to scale up n-grams to achieve the necessary perplexity for a useful code writer competitive with Claude Code, say?' This is a perfectly reasonable, well-defined question, as high-order n-grams could in theory write code without enough data and big enough lookup tables, and so it can be answered. The answer will look something like 'if we turned the whole earth into computronium, it still wouldn't be remotely enough'. The efficiency ratio is not 10:1 or 100:1 but closer to ∞:1. The efficiency gain is so big no one even thinks of it as an efficiency gain, because you just couldn't do it before using AI! You would have humans do it, or not do it at all.
hammock|2 months ago
Here is the NOAA on the improvements:
> 8% better predictions for track, and 10% better predictions for intensity, especially at longer forecast lead times — with overall improvements of four to five days.(1)
I’d love someone to explain what these measurements mean though. Does better track mean 8% narrower angle? Something else? Compared to what baseline?
And am I reading this right that that improvement is measured at the point 4-5 days out from landfall? What’s the typical lead time for calling an evacuation, more or less than four days?
(1)https://www.noaa.gov/news/new-noaa-system-ushers-in-next-gen...
inciampati|2 months ago
kingkawn|2 months ago
AStrangeMorrow|2 months ago
Eg you want to find a really good design. Designs are fairly easy to generate, but expensive to evaluate and score. Understand we can quickly generate millions of designs but evaluating one can take 100ms-1s. With simulations that are not easy to GPU parallelize. We ended up training models that try to predict said score. They don’t predict things perfectly, but you can be 99% sure that the actual score designs is within a certain distance of said score.
So if normally you want to get the 10 best design out of your 1 million, we can now first have the model predict the best 1000 and you can be reasonably certain your top 10 is a subset of these 1000. So you only need to run your simulation on these 1000.
trillic|2 months ago
klysm|2 months ago
lukeschlather|2 months ago
esafak|2 months ago
brewtide|2 months ago
xphos|2 months ago
threemux|2 months ago
derbOac|2 months ago
"Area for future improvement: developers continue to improve the ensemble’s ability to create a range of forecast outcomes."
Someone else noted the models are fairly simple.
My question is "what happens if you scale up to attain the same levels of accuracy throughout? Will it still be as efficient?"
My reading is that these models work well in other regions but I reserve a certain skepticism because I think it's healthy in science, and also because I think those ultimately in charge have yet to prove reliable judges of anything scientific.
throwaway613745|2 months ago
Majromax|2 months ago
Even when you include training, the payoff period is not that long. Operational NWP is enormously expensive because high-resolution models run under soft real-time deadlines; having today's forecast tomorrow won't do you any good.
The bigger problem is that traditional models have decades of legacy behind them, and getting them to work on GPUs is nontrivial. That means that in a real way, AI model training and inference comes at the expense of traditional-NWP systems, and weather centres globally are having to strike new balances without a lot of certainty.
TallGuyShort|2 months ago
There's an interesting parallel to Formula One, where there are limits on the computational resources teams can use to design their cars, and where they can use an aerodynamic model that was previously trained to get pretty good outcomes with less compute use in the actual design phase.
brookst|2 months ago
Assuming you’re not throwing the whole thing out after one forecast, it is probably better to reduce runtime energy usage even if it means using more for one-time training.
apawloski|2 months ago
unknown|2 months ago
[deleted]
goodolddays9090|2 months ago
[deleted]