top | item 24547240

(no title)

prostodata | 5 years ago

For example, I have data from 1900 to 2000. I fit an ARMA model to this data and store the resulting coefficients as the model parameters. Now I get data from 2010 to 2020. My goal is to use these (AR and MA) coefficients to predict the value in 2021, without using the historic data I used for training. I think this makes sense, and it is precisely how typical ML algorithms work. So it is more a matter of how an algorithm is implemented and which usage patterns it follows.

em500|5 years ago

Ok, if you want to apply the fitted model to later data points of the same series, in principle you could. From a superficial browse of the sklearn source, it does not seem to support/expose this. AFAICT, sklearn's ARIMA estimator wraps pmdarima, which wraps SARIMAX from statsmodels, which uses the statsmodels state space model for the actual calculations. As best I can tell, none of the higher-level wrappers support/expose the functionality that you want. If you know how to work with the raw state space form in statsmodels, you could do more or less what you described (predict with a fitted model without retaining the full history - though you also need to store the estimated state in addition to the ARMA coefficients).

If you don't know how to do this, I'd advise you not to bother, unless you have a really specialized need.

("just storing and applying fitted coefficients on new data" is straightforward if you have a pure AR(p) model: you can just plug the coefficients into the recursive AR equation using the last p observations. But as soon as you have an MA term, you have a problem, because a finite-lag (invertible) MA(q) model is equivalent to an infinite-lag AR model, i.e. AR(∞), so the forecast depends on the whole history and not just the last few observations. You need specialized algorithms such as the innovations algorithm or a Kalman filter to handle that. Statsmodels uses a Kalman filter on the state space form of the ARMA model.)
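To make the AR vs. MA distinction above concrete, here is a minimal sketch in plain Python. The function names (`ar_forecast`, `arma11_forecast`) and the `init_resid` parameter are hypothetical, not part of statsmodels or any other library. The ARMA(1,1) part uses a simple conditional residual recursion that assumes the unobserved residual before the new data starts is zero; it is an approximation, not the Kalman filter that statsmodels uses.

```python
def ar_forecast(const, ar_coefs, last_obs):
    """One-step AR(p) forecast from stored coefficients.

    y_hat = const + phi_1 * y_t + phi_2 * y_{t-1} + ... + phi_p * y_{t-p+1}
    `last_obs` is ordered oldest to newest and needs at least p values.
    """
    p = len(ar_coefs)
    if len(last_obs) < p:
        raise ValueError("need at least p observations")
    recent = list(last_obs)[-p:][::-1]  # newest first: y_t, y_{t-1}, ...
    return const + sum(phi * y for phi, y in zip(ar_coefs, recent))


def arma11_forecast(const, phi, theta, series, init_resid=0.0):
    """One-step ARMA(1,1) forecast via a residual recursion.

    The MA term needs the last residual, which is not observed directly.
    It is rebuilt here by running through `series` from the start, under
    the ASSUMPTION that the residual before the first step is `init_resid`.
    """
    if len(series) == 0:
        raise ValueError("need at least one observation")
    resid = init_resid
    for prev, cur in zip(series, series[1:]):
        pred = const + phi * prev + theta * resid
        resid = cur - pred  # one-step prediction error for `cur`
    return const + phi * series[-1] + theta * resid


# Pure AR(2): only the last two observations matter.
print(ar_forecast(1.0, [0.5, 0.25], [2.0, 4.0, 8.0]))  # 6.0

# ARMA(1,1): the result depends on every observation passed in
# (and on the assumed starting residual).
print(arma11_forecast(0.0, 0.5, 0.5, [2.0, 4.0]))  # 3.5
```

Note that `arma11_forecast` could instead be given the last residual from the training run as `init_resid`; that is the simplest version of "storing the estimated state in addition to the coefficients", and it only makes sense when the new data directly continues the old series.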