OpenTSLM: Language models that understand time series
280 points | rjakob | 5 months ago | opentslm.com
Repo: https://github.com/StanfordBDHG/OpenTSLM
Foundation models excel at text, images, audio, and video, but lack temporal reasoning capabilities over time-series data streams that run the real world: vitals, prices, telemetry, grid loads, clickstreams, machine logs, business processes.
Time Series Language Models (TSLMs) are open foundation models that support time series as a native modality alongside text, letting users ask questions and get explanations and recommendations, all in natural language.
The OpenTSLM white paper released today demonstrates state-of-the-art temporal reasoning performance. Unlike prior approaches, its cross-attention architecture remains computationally viable on long time series.
The results:
- Sleep staging: 4.4× accuracy with a model 200× smaller (~880× efficiency)
- Activity recognition: ~6× accuracy with 200× smaller (~1,000× efficiency)
- ECG interpretation: ~2× accuracy with 200× smaller (~400× efficiency)
- First model to process 12-lead ECG signals and text simultaneously, with chain-of-thought reasoning validated by cardiologists.
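Reading the bullets above, the "efficiency" figures appear to be the accuracy multiple times the model-size reduction; a quick sanity check of that reading (my inference, not stated explicitly in the post — note "~6×" must be rounded, since 6 × 200 overshoots the quoted ~1,000×):

```python
# Assumed reading: efficiency ≈ accuracy multiple × model-size reduction.
results = {
    "sleep staging":        (4.4, 200),  # 4.4x accuracy, 200x smaller model
    "activity recognition": (6.0, 200),  # "~6x" is rounded; post quotes ~1,000x
    "ECG interpretation":   (2.0, 200),
}
for task, (acc_mult, size_mult) in results.items():
    print(f"{task}: ~{acc_mult * size_mult:.0f}x efficiency")
```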
For the first time, foundation models can handle multiple time-series streams of varying lengths concurrently, integrate them with textual context, and produce interpretable explanations (verified by domain experts and clinicians).
This work is the result of a growing collaboration between researchers from Stanford, ETH Zurich, UIUC, University of St. Gallen, University of Washington, Google, and Amazon.
It points to the next foundation model frontier: temporal intelligence that unlocks proactive healthcare, adaptive robotics, resilient infrastructure, and new forms of human-AI collaboration.
copypaper|5 months ago
For example, you ask an off-the-shelf LLM to analyze your ECG data. The LLM uses a tool to call out to your ECG time-series analysis library. The library iterates over the data and finds stats and ECG events. It returns something like "Average heart rate: 60 bpm, AFib detected at <time>, etc.". The LLM then has all the info it needs to give an accurate analysis at a fraction of the computational cost.
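That tool-call pattern can be sketched as follows (all names here are hypothetical; a real setup would call a proper ECG library rather than this toy RR-interval summary):

```python
# Sketch of the "LLM calls an ECG analysis tool" pattern described above.
# Hypothetical names; a real ECG library would do peak detection, AFib
# classification, etc. The point is that the LLM only sees the summary.

def analyze_ecg(rr_intervals_ms):
    """Summarize beat-to-beat (RR) intervals into stats an LLM can reason over."""
    avg_rr = sum(rr_intervals_ms) / len(rr_intervals_ms)
    avg_hr = 60_000 / avg_rr  # beats per minute
    # Crude irregularity flag: successive-interval differences over 120 ms.
    irregular = [
        i for i in range(1, len(rr_intervals_ms))
        if abs(rr_intervals_ms[i] - rr_intervals_ms[i - 1]) > 120
    ]
    return {"avg_heart_rate_bpm": round(avg_hr), "irregular_beats": len(irregular)}

# The LLM would receive this dict as a tool result:
print(analyze_ecg([1000, 990, 1010, 1005, 995]))  # steady ~60 bpm, no flags
```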
On top of that, this requires a large annotated dataset and a pre-trained model. And correct me if I'm wrong, but I don't think it's possible to have a "general" model that could handle arbitrary time-series data: a model trained on ECG data would not be compatible with stock market data, and there isn't a way to have a single model that understands both.
manquer|5 months ago
The point is to run it reliably on the edge; nobody sane would want their heart rate monitor to run via the cloud, with the uptime and reliability that come with any remote service, plus the extra challenges of LLM inference.
The goal would be to run on the edge, in addition to the standard rules-based detection these machines already have, and add the advanced pattern detection LLMs can provide: reducing alert fatigue and detecting new classes of complex patterns that these sensors typically miss.
SebastianSosa|5 months ago
Ok bro.
Animats|5 months ago
(The web site is too cute. Applying a left to right gradient on text is a bit much.)
[1] https://arxiv.org/pdf/2204.14198
FilosofumRex|5 months ago
Unlike most commercial & medical applications where signals are stationary with white (uncorrelated) noise, the NSA & Rentec mostly deal with non-stationary signals with regime changes and correlated noise, which can't be denoised without loss of information.
The idea is not so much to predict the next stock price tick or to decipher an intercepted signal (most likely encrypted anyways), but rather to detect "regime changes", ie quickest detection of a change of pattern in non-stationary signals. Then the detected pattern is matched to known trading patterns for a particular stock or to the expected spy activities.
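Quickest detection of a change in a non-stationary signal is classically done with a CUSUM test (Page's test); a minimal sketch of the idea described above, with made-up parameters (this is the textbook method, not anything specific to NSA/Rentec practice):

```python
def cusum_detect(xs, target_mean=0.0, drift=0.5, threshold=4.0):
    """One-sided CUSUM: return the first index where the cumulative
    positive deviation from target_mean exceeds threshold, else None."""
    s = 0.0
    for i, x in enumerate(xs):
        s = max(0.0, s + (x - target_mean) - drift)
        if s > threshold:
            return i  # regime change declared here (lags the true onset a bit)
    return None

# Toy signal: mean jumps from ~0 to ~2 at index 10.
signal = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1,
          2.1, 1.9, 2.2, 2.0, 1.8, 2.1]
print(cusum_detect(signal))  # fires a few samples after the shift
```

The threshold trades off detection delay against false alarms, which is exactly the "quickest detection" framing in the comment.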
brandonb|5 months ago
In medical AI, IMO, the most exciting work is detecting disease signals too subtle for humans: for example, estimating ejection fraction from an ECG (which cardiologists can't do, but algorithms can, and have been tested in RCTs: https://www.nature.com/articles/s41591-021-01335-4).
Since OpenTSLM tokenizes time-series into an LLM embedding space, would that process prevent capturing such subtle signals? Or could the approach be extended to handle that use case?
RealLast|5 months ago
aerugo_|4 months ago
In my opinion we need a multi-modal model that is great at both tabular datasets and text analysis. Most analytical work in economics, policy, public health, medicine, etc. requires cross-checking between the two. Current-gen LLMs are not good enough at generating novel insights by looking at tables and text at the same time. I also haven't found any data on this, so please serve it to me on a plate if I'm wrong.
sync|5 months ago
> The Claude Agent SDK excels at code generation—and for good reason. Code is precise, composable, and infinitely reusable, making it an ideal output for agents that need to perform complex operations reliably.
> When building agents, consider: which tasks would benefit from being expressed as code? Often, the answer unlocks significant capabilities.
https://www.anthropic.com/engineering/building-agents-with-t...
pks016|5 months ago
I work with a lot of audio time-series data (not speech, and all with subtle variation). It would be interesting to see how this compares to traditional statistical methods.
woadwarrior01|5 months ago
https://huggingface.co/OpenTSLM
fogzen|5 months ago
The actual algorithms for predicting price movement were fairly simplistic; most of the work was around strategies for dealing with overfitting and how to execute the trades. Accuracy was around 51-55% (a bit better than a coin toss), so it was a big challenge to actually execute the trades and still make a profit after fees and other nonsense. Finding alpha is what ML is used for, but that's just the first step.
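For intuition on why 51-55% is hard to monetize: with symmetric win/loss sizes, the expected P&L per trade is p·g − (1 − p)·g − fee, so fees eat the thin edge. A toy calculation (illustrative numbers, not from the comment):

```python
def edge_per_trade(p_win, gain, fee):
    """Expected P&L per trade with symmetric win/loss of size `gain`."""
    return p_win * gain - (1 - p_win) * gain - fee

# Made-up numbers: $10 average move, $0.30 round-trip fee.
print(edge_per_trade(0.52, 10.0, 0.30))  # 52% accuracy: thin positive edge
print(edge_per_trade(0.51, 10.0, 0.30))  # 51% accuracy: the fee wipes it out
```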
t_mann|5 months ago
> A universal TSLM will power proactive healthcare, adaptive robotics, resilient infrastructure, and new forms of human-AI collaboration.
> scientists, engineers, and builders from ETH, Stanford, Harvard, Cambridge, TUM, CDTM, Google, Meta, AWS, and beyond
What's with all this fuss? Why not just upload your paper to arxiv? Time series models are interesting enough, but from the abstract it's not even clear whether they are using transformers or a recurrent architecture like xLSTM - arguably a more intuitive choice for time series - or something else. This website is barely distinguishable from a crypto/DeFi pitch.
ghc|5 months ago
I mean, sure, but why would you need a study for that? There's plenty of prior work using cross-attention to integrate time series dynamics into non-LLM transformer models, right? Or maybe I'm assuming that integrating a time series embedding with an LLM is easier than it is.
Looking at the repo, the training data seems extremely health-focused. I guess I would have to tune the model with my own datasets if I want it to answer questions about multi-source sensor data?
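For readers unfamiliar with the mechanism being discussed, here is a minimal single-head cross-attention sketch in which text-token queries attend over time-series patch embeddings. Shapes and names are mine, not the paper's; note the cost per text token grows only linearly with series length, which is the usual argument for this design on long series:

```python
import numpy as np

def cross_attention(text_q, ts_kv, d_k=8, seed=0):
    """Toy single-head cross-attention: text tokens (queries) attend over
    time-series patch embeddings (keys/values).
    text_q: (n_text, d), ts_kv: (n_patches, d) -> (n_text, d_k)."""
    rng = np.random.default_rng(seed)
    d = text_q.shape[1]
    Wq = rng.normal(size=(d, d_k))
    Wk = rng.normal(size=(d, d_k))
    Wv = rng.normal(size=(d, d_k))
    Q, K, V = text_q @ Wq, ts_kv @ Wk, ts_kv @ Wv
    scores = Q @ K.T / np.sqrt(d_k)                 # (n_text, n_patches)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # (n_text, d_k)

text = np.random.default_rng(1).normal(size=(4, 16))     # 4 text tokens
series = np.random.default_rng(2).normal(size=(50, 16))  # 50 series patches
out = cross_attention(text, series)
print(out.shape)  # (4, 8)
```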