Launch HN: Silurian (YC S24) – Simulate the Earth
338 points| rejuvyesh | 1 year ago
What is it worth to know the weather forecast 1 day earlier? That's not a hypothetical question: traditional forecasting systems have been improving their skill at a rate of about 1 day per decade. In other words, today's 6-day forecast is as accurate as the 5-day forecast was ten years ago. No one expects this rate of improvement to hold steady; it has to slow down eventually, right? Well, in the last couple of years GPUs and modern deep learning have actually sped it up.
Since 2022 there has been a flurry of weather deep learning systems research at companies like NVIDIA, Google DeepMind, Huawei and Microsoft (some of them built by yours truly). These models have little to no built-in physics and learn to forecast purely from data. Astonishingly, this approach, done correctly, produces better forecasts than traditional simulations of the physics of our atmosphere.
Jayesh and Cris came face-to-face with this technology’s potential while they were respectively leading the [ClimaX](https://arxiv.org/abs/2301.10343) and [Aurora](https://arxiv.org/abs/2405.13063) projects at Microsoft. The foundation models they built improved on the ECMWF’s forecasts, considered the gold standard in weather prediction, while only using a fraction of the available training data. Our mission at Silurian is to scale these models to their full potential and push them to the limits of physical predictability. Ultimately, we aim to model all infrastructure that is impacted by weather including the energy grid, agriculture, logistics, and defense. Hence: simulate the Earth.
Before we do all that, this summer we've built our own foundation model, GFT (Generative Forecasting Transformer), a 1.5B-parameter frontier model that simulates global weather up to 14 days ahead at approximately 11km resolution (https://www.ycombinator.com/launches/Lcz-silurian-simulate-t...). Despite the scarcity of extreme weather data in historical records, we have seen that GFT performs extremely well at predicting 2024 hurricane tracks (https://silurian.ai/posts/001/hurricane_tracks). You can play around with our hurricane forecasts at https://hurricanes2024.silurian.ai. We visualize these using [cambecc/earth](https://github.com/cambecc/earth), one of our favorite open source weather visualization tools.
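For readers wondering what "simulating weather up to 14 days ahead" means mechanically: models like this typically learn a single short step (e.g. state at t → state at t+6h) and are rolled out autoregressively. A minimal sketch, with made-up shapes and a stand-in for the learned model (this is not Silurian's actual code or API):

```python
import numpy as np

def step_model(state):
    # Stand-in for a learned transformer step mapping the atmospheric
    # state at time t to the state at t + 6h; here just a toy smoothing.
    return 0.5 * (state + np.roll(state, 1, axis=-1))

def rollout(state, n_steps=56):
    """Apply the one-step model repeatedly: 56 steps * 6h = 14 days."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step_model(state)
        trajectory.append(state)
    return np.stack(trajectory)

# Hypothetical initial condition: (variables, lat, lon) grid
state0 = np.random.default_rng(0).standard_normal((4, 32, 64))
traj = rollout(state0)
print(traj.shape)  # (57, 4, 32, 64): initial state plus 56 forecast steps
```

The interesting engineering is entirely inside `step_model` and in keeping errors from compounding over the rollout.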
We’re excited to be launching here on HN and would love to hear what you think!
shoyer|1 year ago
One nit on your framing: NeuralGCM (https://www.nature.com/articles/s41586-024-07744-y), built by my team at Google, is currently at the top of the WeatherBench leaderboard and actually builds in lots of physics :).
We would love to see metrics from your model in WeatherBench for comparison. When/if you have them, please do reach out.
cbodnar|1 year ago
Re NeuralGCM, indeed, our post should have said "*most* of these models". Definitely proves that combining ML and physics models can work really well. Thanks for your comments!
bbor|1 year ago
Main takeaway, gives me some hope:
But I will admit, I clicked the link to answer a more cynical question: why is Google funding a presumably super-expensive team of engineers and meteorologists to work on this without a related product in sight? The answer is both fascinating and boring: From https://research.google/philosophy/. Talk about a cool job! I hope such programs rode the intimidation-layoff wave somewhat peacefully…
d_burfoot|1 year ago
Haha. The old NLP saying "every time I fire a linguist, my performance goes up", now applies to the physicists....
joshdavham|1 year ago
What else do you hope to simulate, if this becomes successful?
CSMastermind|1 year ago
nikhil-shankar|1 year ago
cshimmin|1 year ago
Signed,
A California Resident
brunosan|1 year ago
nikhil-shankar|1 year ago
serjester|1 year ago
nikhil-shankar|1 year ago
furiousteabag|1 year ago
Shameless plug: recently we've built a demo that allows you to search for objects in San Francisco using natural language. You can look for things like Tesla cars, dry patches, boats, and more. Link: https://demo.bluesight.ai/
We've tried using Clay embeddings, but we quickly found that they perform poorly for similarity search compared to embeddings produced by CLIP fine-tuned on OSM captions (SkyScript).
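The similarity search described above boils down to comparing a text-query embedding against precomputed image-tile embeddings. A toy sketch with random vectors standing in for real CLIP-style embeddings (real systems would use a trained encoder and an approximate-nearest-neighbor index):

```python
import numpy as np

def normalize(x):
    # Unit-normalize so a dot product equals cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(42)
tile_embeddings = normalize(rng.standard_normal((10_000, 512)))  # satellite tiles
query_embedding = normalize(rng.standard_normal(512))            # e.g. text "boats"

scores = tile_embeddings @ query_embedding   # cosine similarity per tile
top_k = np.argsort(scores)[::-1][:5]         # indices of the 5 best-matching tiles
print(top_k.shape)  # (5,)
```

The quality of the results then depends entirely on how well the encoder aligns text and imagery, which is why the choice of fine-tuning data (OSM captions here) matters so much.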
brunosan|1 year ago
We did try to relate OSM tags to Clay embeddings, but it didn't scale well. We did not give up, but we are reconsidering (https://github.com/Clay-foundation/earth-text). I think SatClip plus OSM is a better approach, or LLM embeddings mapped to Clay embeddings...
sltr|1 year ago
Disclosure: I work there.
https://climavision.com/
codeyogini|1 year ago
[deleted]
bbor|1 year ago
Best of luck, and thanks for taking the leap! Humanity will surely thank you. Hopefully one day you can claim a bit of the NWS’ $1.2B annual budget, or the US Navy’s $infinity budget — if you haven’t, definitely reach out to NRL and see if they’ll buy what you’re selling!
Oh and C) reach out if you ever find the need to contract out a naive, cheap, and annoyingly-optimistic full stack engineer/philosopher ;)
cbodnar|1 year ago
Re question 2: Simulations don't need to be explainable. Being able to simulate simply means being able to provide a reasonable evolution of a system given some potential set of initial conditions and other constraints. Even for physics-based simulations, when run at huge scale like with weather, it's debatable to what degree they are "interpretable".
Thanks for your questions!
britannio|1 year ago
[1] https://x.com/karpathy/status/1835024197506187617 [2] https://www.youtube.com/watch?v=-KMdo9AWJaQ&t=1010s
OrvalWintermute|1 year ago
What will your differentiators be?
Are you paying for weather data products?
danielmarkbruce|1 year ago
Better weather predictions are worth money, plain and simple.
amirhirsch|1 year ago
Once upon a time I converted spectral-transform-shallow-water-model (STSWM, or PSTSWM when parallelized) from FORTRAN to Verilog. I believe this is the spectral-transform method we have run for the last 30 years to do forecasting. The 10-day predictions would differ by ~20% if we truncated each operation to FP64 instead of Intel's FP80.
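That kind of precision sensitivity is easy to reproduce with any chaotic iteration: rounding differences grow until the trajectories fully decorrelate. A toy logistic-map demo (not the spectral-transform model itself) comparing float32 and float64:

```python
import numpy as np

def logistic(x0, steps, dtype):
    # Iterate the chaotic logistic map x -> r*x*(1-x) in a given precision.
    x = dtype(x0)
    r = dtype(3.9)  # in the chaotic regime; values stay within (0, 1)
    for _ in range(steps):
        x = r * x * (dtype(1.0) - x)
    return float(x)

x64 = logistic(0.5, 500, np.float64)
x32 = logistic(0.5, 500, np.float32)
# After 500 steps the two precisions give unrelated values.
print(x64, x32)
```

Weather dynamics are similarly chaotic, which is why FP80-vs-FP64 truncation shows up in 10-day forecasts.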
nikhil-shankar|1 year ago
1. The truth is we still have to investigate the numerical stability of these models. Our GFT forecast rollouts are around 2 weeks (~60 steps) long and things are stable in that range. We're working on longer-ranged forecasts internally.
2. The compute requirements are extremely favorable for ML methods. Our training costs are significantly cheaper than the fixed costs of the supercomputers that government agencies require and each forecast can be generated on 1 GPU over a few minutes instead of 1 supercomputer over a few hours.
3. There's a similar floating-point story in deep learning models with FP32, FP16, BF16 (and even lower these days)! An exciting area to explore.
Angostura|1 year ago
ijustlovemath|1 year ago
It seems like this is another instance of The Bitter Lesson, no?
CharlesW|1 year ago
agentultra|1 year ago
Deep Blue wasn't a brute-force search. It did rely on heuristics and human knowledge of the domain to prune search paths. We've always known we could brute-force search the entire space but weren't satisfied with waiting until the heat death of the universe for the chance at an answer.
The advances in machine learning do use various heuristics and techniques to solve particular engineering challenges in order to solve more general problems. It hasn't all come down to Moore's Law, which stopped bearing large fruit some time ago.
However that still comes at a cost. It requires a lot of GPUs, land, energy, and fresh water, and Freon for cooling. We'd prefer to use less of these resources if possible while still getting answers in a reasonable amount of time.
photochemsyn|1 year ago
Notably, forecast skill is quantifiable, so we'd need to see a whole lot of forecast predictions using what is essentially a stochastic-modelling (historical-data) approach. Given the climate is steadily warming, with all that implies in terms of water vapor feedback etc., it's reasonable to assume that historical data isn't that great a guide to future behavior; e.g., when you start having "once every 500 year" floods every decade, the past is not a good guide to the future.
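"Forecast skill" here usually means a metric like RMSE measured against a reference, often expressed as a skill score relative to a climatology baseline. A minimal sketch with made-up data (real evaluations, e.g. WeatherBench, use reanalysis fields and latitude weighting):

```python
import numpy as np

def rmse(pred, truth):
    # Root-mean-square error between a forecast and the verifying truth.
    return np.sqrt(np.mean((pred - truth) ** 2))

rng = np.random.default_rng(1)
truth = rng.standard_normal(1000)                    # verifying observations
climatology = np.zeros(1000)                         # baseline: long-run mean
forecast = truth + 0.3 * rng.standard_normal(1000)   # imperfect forecast

# Skill score: 1 = perfect forecast, 0 = no better than climatology.
skill = 1 - rmse(forecast, truth) / rmse(climatology, truth)
print(round(skill, 2))
```

The concern in the comment above is exactly that the "truth" distribution is drifting, so a model trained on the historical record is being scored on a changing target.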
crackalamoo|1 year ago
1wd|1 year ago
jandrewrogers|1 year ago
The biggest issue is that the basic data model for population behavior is a sparse metastable graph with many non-linearities. How to even represent these types of data models at scale is a set of open problems in computer science. Using existing "big data" platforms is completely intractable; they are incapable of expressing what is needed. These data models also tend to be quite large, 10s of PB at a bare minimum.
You cannot use population aggregates like census data. Doing so produces poor models that don't ground-truth in practice, for reasons that are generally understood. It requires having distinct behavioral models of every entity in the simulation, i.e. a basic behavioral profile of every person. It is very difficult to get entity data sufficient to produce a usable model. Think privileged telemetry from mobile carrier backbones at country scales (which is a lot of data -- this can get into petabytes per day for large countries).
Current AI tech is famously bad at these types of problems. There is an entire set of open problems here around machine learning and analytic algorithms that you would need to research and develop. There is negligible literature around it. You can't just throw tensorflow or LLMs at the problem.
This is all doable in principle, it is just extremely difficult technically. I will say that if you can demonstrably address all of the practical and theoretical computer science problems at scale, gaining access to the required data becomes much less of a problem.
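A back-of-envelope illustration of why generic dense representations fail at this scale: a population graph is extremely sparse, so dense adjacency storage wastes orders of magnitude more space than edge-list/CSR-style layouts. (Numbers below are illustrative assumptions, not from the comment.)

```python
# Hypothetical country-scale population contact graph.
n = 1_000_000       # entities (people) -- illustrative, real cases are larger
avg_degree = 100    # average contacts per entity

dense_cells = n * n              # entries in a dense adjacency matrix
sparse_entries = n * avg_degree  # entries in an adjacency list / CSR layout

# Dense storage is n/avg_degree times larger than sparse storage.
print(dense_cells // sparse_entries)  # 10000
```

And storage is the easy part; the harder open problems are the metastability and non-linear dynamics the comment describes, which sparse layout alone doesn't solve.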
ag_rin|1 year ago
kristjansson|1 year ago
IMO the short answer is that such models can be made to generate realistic trajectories, but calibrating the model to the specific trajectory of reality we inhabit requires knowledge of the current state of the world bordering on omniscience.
[0]: https://www.santafe.edu/research/results/working-papers/asse...
Nicholas_C|1 year ago
cossatot|1 year ago
7e|1 year ago
andrewla|1 year ago
nikhil-shankar|1 year ago
nxobject|1 year ago
Have specific industries reached out to you for your commerical potential – natural resource exploration, for example?
scottcha|1 year ago
cbodnar|1 year ago
legel|1 year ago
As a fellow deep learning modeler of Earth systems, I can also say that what they're doing really is 100% top notch. Congrats to the team and YC.
abdellah123|1 year ago
Using the full expressive power of a programming language to model the real world and then execute AI algorithms on highly structured and highly understood data seems like the right way to go!
kristopolous|1 year ago
nikhil-shankar|1 year ago
rybosome|1 year ago
From the post.
jay-barronville|1 year ago
What more did you want from them? (Genuine question.)
cyberlimerence|1 year ago
[1] https://github.com/cambecc/earth
Urchin2|1 year ago
99catmaster|1 year ago
koolala|1 year ago
xpe|1 year ago
hwhwhwhhwhwh|1 year ago
sillysaurusx|1 year ago
Specifically, I could imagine throwing current weather data at the model and asking it what it thinks the next most likely weather change is going to be. If it's accurate at all, then that could be done on any given day without further training.
The problems happen when you start throwing data at it that it wasn't trained on, so it'll be a cat and mouse game. But it's one I think the cat can win, if it's persistent enough.
nikhil-shankar|1 year ago
the_arun|1 year ago
SirLJ|1 year ago
unknown|1 year ago
[deleted]
baetylus|1 year ago
1. How will you handle one-off events like volcanic eruptions for instance? 2. Where do you start with this too? Do you pitch a meteorology team? Is it like a "compare and see for yourself"?
cbodnar|1 year ago
Re where do we start. A lot of organisations across different sectors need better weather predictions or simulations that depend on weather. Measuring the skill of such models is a relatively standard procedure and people can check the numbers.
julienlafond|1 year ago
nikhil-shankar|1 year ago
resters|1 year ago
kyletns|1 year ago
zeitgeistcowboy|1 year ago
itomato|1 year ago
cbodnar|1 year ago
bschmidt1|1 year ago
I had a web app online in 2020-22 called Skim Day that predicted skimboarding conditions on California beaches that was mostly powered by weather APIs. The tide predictions were solid, but the weather itself was almost never right, especially wind speed. Additionally there were some missing metrics like slope of beach which changes significantly throughout the year and is very important for skimboarding.
Basically, I needed AI. And this looks incredible. Love your website and even the name and concept of "Generative Forecasting Transformer (GFT)" - very cool. I imagine the likes of Surfline, The Weather Channel, and NOAA would be interested to say the least.
cbodnar|1 year ago
jawmes8|1 year ago
gvidon|1 year ago
[deleted]
yifurjoshdf|1 year ago
[deleted]
chenbin74851|1 year ago
[deleted]
yifurjoshdf|1 year ago
[deleted]
codelikeit|1 year ago
[deleted]
codelikeit|1 year ago
[deleted]