ddmd | 6 years ago

One problem I have with statsmodels is that I cannot apply a trained model to new data instead of the training data. In other words, I do not want to forecast the continuation of the training data; I want to forecast completely new time series.

For example, here I create and train a model:

    model = ARIMA(df.value, order=(1,1,1))
    fitted = model.fit(disp=0)

And then I immediately forecast:

    fc, se, conf = fitted.forecast(...)

But that is not what I need. Typically, I store the model and then apply it to many new datasets that arrive later.

sklearn explicitly supports this pattern (one data set for training and then another data set for prediction).

Is this possible with ARIMA and the other forecasting algorithms in statsmodels?

gareman|6 years ago

In ARIMA the only thing that changes is time, so you don't need to wait for future data. It sounds like you may be trying to use covariates, which is not a problem with statsmodels but with model choice -- you'd need to use ARIMAX, not ARIMA.

ddmd|6 years ago

Assume we have a series of N values: 4, 7, 3, ..., 9, 2, 5. We use them to train a model. Now we forget about this data (for whatever reason). We receive a completely new series such as 2, 3, 7, 5, 4, 5. Our goal is to forecast the next value of this new series. How can we find it using statsmodels?