It depends on your time series, really. If the samples are evenly spaced, e.g., your sensor gives you a reading every millisecond, your measured sequences aren’t partially overlapping, and you don’t have structured discrete events, then time isn’t very useful. You can always rescale time so that it is just the same as the index.
For your calendar example, date information is very useful because patterns tend to exhibit a cyclic nature across years and there are discrete special events (holidays). With enough data, you probably don’t need to include the date, but it’s informative for smaller data sets.
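One common way to expose that calendar structure to a model that only sees an index is to derive explicit date features from a datetime index. A minimal pandas sketch (the toy series and column names are my own):

```python
import pandas as pd

# Hypothetical daily series indexed by date.
idx = pd.date_range("2021-01-01", "2022-12-31", freq="D")
df = pd.DataFrame({"y": range(len(idx))}, index=idx)

# Calendar features that let a model pick up yearly cycles
# and discrete special events.
df["month"] = df.index.month
df["dayofweek"] = df.index.dayofweek
df["is_christmas"] = ((df.index.month == 12) & (df.index.day == 25)).astype(int)
```

With enough history a flexible model might infer these cycles on its own, but on small data sets the explicit features carry a lot of signal.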
Yeah, most of the time series data I've had to work with, I end up spending a huge amount of time interpolating (so all time slices have some data, even if it isn't real) or aggregating to some common denominator (e.g. taking sporadic sales and summing up to daily sales). I get why most packages expect nicely spaced or evenly dense data, but boy, I would love to have more options there.
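The aggregation step described above is a one-liner in pandas: resample the sporadic timestamps to a regular daily grid, with empty days filled as zero so every time slice has a value. A minimal sketch with made-up sales events:

```python
import pandas as pd

# Sporadic, unevenly spaced sales events (hypothetical data).
sales = pd.Series(
    [10.0, 5.0, 7.0, 3.0],
    index=pd.to_datetime(
        ["2023-01-01 09:30", "2023-01-01 14:00",
         "2023-01-03 11:15", "2023-01-04 16:45"]
    ),
)

# Aggregate to a common denominator: daily totals. Days with no
# events come out as 0.0, so the result is evenly spaced.
daily = sales.resample("D").sum()
```

For interpolation instead of aggregation, `sales.resample("D").mean().interpolate()` is the analogous move, with the usual caveat that the interpolated slices aren't real observations.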
Most time series models assume you've already deseasonalized your data in advance. Typically, seasonality is obvious to the human doing the modeling (e.g. sales being up near Christmas), so it's usually preferable for the human to deseasonalize the data in advance using a separate model that bakes in some of their human knowledge of how the world works. Forcing the model to learn seasonal trends fully on its own adds another layer of estimation error.
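A minimal sketch of that "deseasonalize in advance with human knowledge" step, assuming the human knows the cycle is weekly (the synthetic data and the day-of-week-means model are my own; real pipelines often use something like STL instead):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2022-01-01", periods=28, freq="D")

# Synthetic series: level 10, a known weekly cycle, small noise.
weekly = np.tile([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0], 4)
y = pd.Series(10 + weekly + rng.normal(0, 0.1, 28), index=idx)

# "Human knowledge" seasonal model: we know the period is 7 days,
# so estimate one mean per day-of-week and subtract it.
dow_means = y.groupby(y.index.dayofweek).transform("mean")
deseasonalized = y - dow_means + y.mean()
```

The downstream forecasting model then only has to explain `deseasonalized`, instead of spending estimation capacity rediscovering the weekly cycle.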
Prophet is popular because it works off the shelf with non-deseasonalized data and mixed frequency data, which makes it great for quick forecasting exercises. But IMO it is never the ideal model if you have a lot of time and expertise to work with.
Prophet has worked so well for us, especially since we have a TON of custom events and holidays to consider. None of the other approaches have really come close.
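For reference, Prophet's documented way to take custom events and holidays is a DataFrame with `holiday` and `ds` columns (plus optional `lower_window`/`upper_window` to extend an event's effect to surrounding days). A sketch with invented event names:

```python
import pandas as pd

# Custom events in the shape Prophet expects: one row per
# occurrence of each named event. (Event names are made up.)
events = pd.DataFrame({
    "holiday": ["product_launch", "product_launch", "black_friday"],
    "ds": pd.to_datetime(["2022-03-01", "2023-03-01", "2022-11-25"]),
    "lower_window": [0, 0, -1],  # include the day before Black Friday
    "upper_window": [1, 1, 2],   # and a couple of days after
})

# With prophet installed, this would be wired in roughly as:
# from prophet import Prophet
# m = Prophet(holidays=events)
# m.fit(df)  # df holds the history with 'ds' and 'y' columns
```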
hi, I don't want to enter a public discussion about the sktime split; I fear the application of Godwin's law. A summary of the key points behind the split, from my perspective, is here
https://github.com/aeon-toolkit/aeon/issues/456
the other side's view will no doubt be forthcoming. If you want to chat about it, join our slack and message me, I'm more than happy to help. How are we different? Well, I think we can all live together, it's open source, but from my perspective the priorities are
1. Align as closely as we can with sklearn, so as to make it completely intuitive how to use aeon if you know sklearn.
2. Focus on implementations of state of the art algorithms for time series machine learners and less on just wrapping other code. The goal is to reduce the lead time from publication of new ideas to widespread adoption
3. Documentation: make it good.
my interests primarily lie in classification, clustering and regression, but next year we are going into the forecasting world; plenty of exciting collaborations are brewing.
It's refreshing to see classification mentioned before forecasting. It has been a frustrating journey embarking on time series classification, as it seems overlooked compared to forecasting. Will follow this project closely, and use it in my next project!
Aeon is an sktime fork which happened after one of the sktime core developers (Franz K.) took the sktime project hostage by kicking other core devs out of the GitHub organization. It's info you can collect from some GH issues.
Aeon has the advantage of including a friendly deep learning framework: all of the models discussed in the "Deep Learning for Time Series Classification: a review" paper are included in aeon, with a variety of options for changing the parameters of the architecture. More state-of-the-art models such as InceptionTime are also included, not only for classification but for regression as well, and soon forecasting and clustering.
Hiding the choice of algorithm behind kwargs (as opposed to creating separate classes) has always seemed a suspect choice to me, in sklearn as well as here. It seems to make development of the package more complex at the expense of... less readable code for the user, with less flexibility for differences in hyperparameter specifications, etc.
There are of course exceptions, something like `TrendPredictor(order=1, interp="polynomial")` as an example can be flexibly adapted up or down the hierarchy of model complexity much easier than commenting out different lines.
I have taught machine learning in Java using Weka for a long time, and when we moved over to sklearn this also annoyed me. It made a good teaching point, with, for example, decision trees having a dozen separate classes for different algorithms in Weka and sklearn having one configurable one. I guess just design preference in the end. With aeon we are leaning more towards one class per algorithm or algorithm family, but it's not a hard and fast rule. One issue is: when does a change in algorithm mean a change in class? So, for example, we have separate transformers for ROCKET, MiniROCKET and MultiROCKET (convolution transforms), but a single configurable RocketClassifier. Ultimately I think it comes down to how comprehensible it is to a new user.
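A toy sketch of the two API styles under discussion (all class and parameter names here are invented for illustration, not aeon's or sklearn's actual APIs):

```python
# Style 1: one configurable class, algorithm variant chosen via kwarg.
# Compact for the user, but every variant shares one signature.
class RocketStyleClassifier:
    def __init__(self, variant="rocket"):
        if variant not in {"rocket", "minirocket", "multirocket"}:
            raise ValueError(f"unknown variant: {variant}")
        self.variant = variant


# Style 2: one class per algorithm. Each class can grow its own
# hyperparameters without overloading a shared constructor.
class RocketTransform:
    def __init__(self, num_kernels=10_000):
        self.num_kernels = num_kernels


class MiniRocketTransform:
    def __init__(self, num_kernels=10_000, max_dilations=32):
        self.num_kernels = num_kernels
        self.max_dilations = max_dilations
```

The trade-off is visible even in this toy: style 1 keeps discovery easy (one name to learn), while style 2 lets `MiniRocketTransform` take a `max_dilations` parameter that would be meaningless noise in a shared kwarg-driven signature.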
hotstickyballs|2 years ago
Prophet, for example, uses dates to create Fourier terms and holiday indicators, and that just seems like a saner approach.
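A minimal sketch of what such date-derived Fourier features look like, built by hand with numpy/pandas (Prophet's actual implementation differs in details; the function name and defaults here are my own):

```python
import numpy as np
import pandas as pd

def fourier_terms(index, period=365.25, order=3):
    """Yearly seasonality features: `order` sine/cosine pairs
    computed from the day-of-year of a datetime index."""
    t = index.dayofyear.to_numpy()
    cols = {}
    for k in range(1, order + 1):
        cols[f"sin_{k}"] = np.sin(2 * np.pi * k * t / period)
        cols[f"cos_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(cols, index=index)

idx = pd.date_range("2023-01-01", periods=365, freq="D")
X = fourier_terms(idx)  # 365 rows x 6 smooth periodic features
```

A regression on these columns recovers a smooth yearly cycle without the model having to learn 365 separate day-of-year effects.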
abrichr|2 years ago
https://github.com/sktime/sktime
https://github.com/tslearn-team/tslearn
https://github.com/unit8co/darts
https://github.com/johannfaouzi/pyts
https://github.com/cesium-ml/cesium
Also:
https://github.com/timeseriesAI/tsai
sampo|2 years ago
It's a fork of sktime. Last common commit before the fork is on Jan 30, 2023.