It depends on your time series, really. If the samples are evenly spaced, e.g., your sensor gives you a reading every millisecond, your measured sequences aren’t partially overlapping, and you don’t have structured discrete events, then time isn’t very useful. You can always rescale time so that it is just the same as the index.
For your calendar example, date information is very useful because patterns tend to exhibit a cyclic nature across years and there are discrete special events (holidays). With enough data, you probably don’t need to include the date, but it’s informative for smaller data sets.
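One common way to expose that calendar structure to a model that only sees an index is to derive explicit date features from a datetime index. A minimal pandas sketch (the toy series and column names are my own):

```python
import pandas as pd

# Hypothetical daily series indexed by date.
idx = pd.date_range("2021-01-01", "2022-12-31", freq="D")
df = pd.DataFrame({"y": range(len(idx))}, index=idx)

# Calendar features that let a model pick up yearly cycles
# and discrete special events.
df["month"] = df.index.month
df["dayofweek"] = df.index.dayofweek
df["is_christmas"] = ((df.index.month == 12) & (df.index.day == 25)).astype(int)
```

With enough history a flexible model might infer these cycles on its own, but on small data sets the explicit features carry a lot of signal.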
Yeah, most of the time series data I've had to work with, I end up spending a huge amount of time interpolating (so all time slices have some data, even if it isn't real) or aggregating to some common denominator (e.g. taking sporadic sales and summing up to daily sales). I get why most packages expect nicely spaced or evenly dense data, but boy, I would love to have more options there.
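The aggregation step described above is a one-liner in pandas: resample the sporadic timestamps to a regular daily grid, with empty days filled as zero so every time slice has a value. A minimal sketch with made-up sales events:

```python
import pandas as pd

# Sporadic, unevenly spaced sales events (hypothetical data).
sales = pd.Series(
    [10.0, 5.0, 7.0, 3.0],
    index=pd.to_datetime(
        ["2023-01-01 09:30", "2023-01-01 14:00",
         "2023-01-03 11:15", "2023-01-04 16:45"]
    ),
)

# Aggregate to a common denominator: daily totals. Days with no
# events come out as 0.0, so the result is evenly spaced.
daily = sales.resample("D").sum()
```

For interpolation instead of aggregation, `sales.resample("D").mean().interpolate()` is the analogous move, with the usual caveat that the interpolated slices aren't real observations.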
Most time series models assume you've already deseasonalized your data in advance. Typically, seasonality is obvious to the human doing the modeling (e.g. sales being up near Christmas), so it's usually preferable for the human to deseasonalize the data in advance using a separate model that bakes in some of their human knowledge of how the world works. Forcing the model to learn seasonal trends fully on its own adds another layer of estimation error.
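A minimal sketch of that "deseasonalize in advance with human knowledge" step, assuming the human knows the cycle is weekly (the synthetic data and the day-of-week-means model are my own; real pipelines often use something like STL instead):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2022-01-01", periods=28, freq="D")

# Synthetic series: level 10, a known weekly cycle, small noise.
weekly = np.tile([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0], 4)
y = pd.Series(10 + weekly + rng.normal(0, 0.1, 28), index=idx)

# "Human knowledge" seasonal model: we know the period is 7 days,
# so estimate one mean per day-of-week and subtract it.
dow_means = y.groupby(y.index.dayofweek).transform("mean")
deseasonalized = y - dow_means + y.mean()
```

The downstream forecasting model then only has to explain `deseasonalized`, instead of spending estimation capacity rediscovering the weekly cycle.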
Prophet is popular because it works off the shelf with non-deseasonalized data and mixed frequency data, which makes it great for quick forecasting exercises. But IMO it is never the ideal model if you have a lot of time and expertise to work with.
Prophet has worked so well for us, especially since we have a TON of custom events and holidays to consider. None of the other approaches have really come close.
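For reference, Prophet's documented way to take custom events and holidays is a DataFrame with `holiday` and `ds` columns (plus optional `lower_window`/`upper_window` to extend an event's effect to surrounding days). A sketch with invented event names:

```python
import pandas as pd

# Custom events in the shape Prophet expects: one row per
# occurrence of each named event. (Event names are made up.)
events = pd.DataFrame({
    "holiday": ["product_launch", "product_launch", "black_friday"],
    "ds": pd.to_datetime(["2022-03-01", "2023-03-01", "2022-11-25"]),
    "lower_window": [0, 0, -1],  # include the day before Black Friday
    "upper_window": [1, 1, 2],   # and a couple of days after
})

# With prophet installed, this would be wired in roughly as:
# from prophet import Prophet
# m = Prophet(holidays=events)
# m.fit(df)  # df holds the history with 'ds' and 'y' columns
```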
hi, I don't want to enter a public discussion about the sktime split; I fear the application of Godwin's law. A summary of the key points behind the split, from my perspective, is here
https://github.com/aeon-toolkit/aeon/issues/456
the other side's view will no doubt be forthcoming. If you want to chat about it, join our slack and message me, I'm more than happy to help. How are we different? Well, I think we can all live together, it's open source, but from my perspective the priorities are
1. Align as closely as we can with sklearn, so as to make it completely intuitive how to use aeon if you know sklearn.
2. Focus on implementations of state of the art algorithms for time series machine learners and less on just wrapping other code. The goal is to reduce the lead time from publication of new ideas to widespread adoption
3. Documentation: make it good.
my interests primarily lie in classification, clustering and regression, but next year we are going into the forecasting world; plenty of exciting collaborations are brewing.
It's refreshing to see classification mentioned before forecasting. It has been a frustrating journey embarking on time series classification, as it seems overlooked compared to forecasting. Will follow this project closely, and use it in my next project!
Aeon is an sktime fork which happened after one of the sktime core developers (Franz K.) took the sktime project hostage by kicking other core devs out of the GitHub organization. It's info you can collect from some GH issues.
Aeon has the advantage of including a friendly deep learning framework: all of the models discussed in the "Deep Learning for Time Series Classification: a review" paper are included in aeon, with a variety of options for changing the parameters of the architecture. More state-of-the-art models such as InceptionTime are also included, not only for classification but for regression as well, and soon forecasting and clustering.
Hiding the choice of algorithm behind kwargs (as opposed to creating separate classes) has always seemed a suspect choice to me, in sklearn as well as here. It seems to make development of the package more complex at the expense of... less readable code for the user, with less flexibility for differences in hyperparameter specifications, etc.
There are of course exceptions, something like `TrendPredictor(order=1, interp="polynomial")` as an example can be flexibly adapted up or down the hierarchy of model complexity much easier than commenting out different lines.
I have taught machine learning in Java using Weka for a long time, and when we moved over to sklearn this also annoyed me. It made a good teaching point, with, for example, decision trees having a dozen separate classes for different algorithms in Weka and sklearn having one configurable one. I guess just design preference in the end. With aeon we are leaning more towards one class per algorithm or algorithm family, but it's not a hard and fast rule. One issue is: when does a change in algorithm mean a change in class? So, for example, we have separate transformers for ROCKET, MiniROCKET and MultiROCKET (convolution transforms), but a single configurable RocketClassifier. Ultimately I think it comes down to how comprehensible it is to a new user.
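A toy sketch of the two API styles under discussion (all class and parameter names here are invented for illustration, not aeon's or sklearn's actual APIs):

```python
# Style 1: one configurable class, algorithm variant chosen via kwarg.
# Compact for the user, but every variant shares one signature.
class RocketStyleClassifier:
    def __init__(self, variant="rocket"):
        if variant not in {"rocket", "minirocket", "multirocket"}:
            raise ValueError(f"unknown variant: {variant}")
        self.variant = variant


# Style 2: one class per algorithm. Each class can grow its own
# hyperparameters without overloading a shared constructor.
class RocketTransform:
    def __init__(self, num_kernels=10_000):
        self.num_kernels = num_kernels


class MiniRocketTransform:
    def __init__(self, num_kernels=10_000, max_dilations=32):
        self.num_kernels = num_kernels
        self.max_dilations = max_dilations
```

The trade-off is visible even in this toy: style 1 keeps discovery easy (one name to learn), while style 2 lets `MiniRocketTransform` take a `max_dilations` parameter that would be meaningless noise in a shared kwarg-driven signature.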
hotstickyballs|2 years ago
Prophet, for example, uses dates to create Fourier terms and holiday indicators, and that just seems like a saner approach.
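A minimal sketch of what such date-derived Fourier features look like, built by hand with numpy/pandas (Prophet's actual implementation differs in details; the function name and defaults here are my own):

```python
import numpy as np
import pandas as pd

def fourier_terms(index, period=365.25, order=3):
    """Yearly seasonality features: `order` sine/cosine pairs
    computed from the day-of-year of a datetime index."""
    t = index.dayofyear.to_numpy()
    cols = {}
    for k in range(1, order + 1):
        cols[f"sin_{k}"] = np.sin(2 * np.pi * k * t / period)
        cols[f"cos_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(cols, index=index)

idx = pd.date_range("2023-01-01", periods=365, freq="D")
X = fourier_terms(idx)  # 365 rows x 6 smooth periodic features
```

A regression on these columns recovers a smooth yearly cycle without the model having to learn 365 separate day-of-year effects.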
abrichr|2 years ago
https://github.com/sktime/sktime
https://github.com/tslearn-team/tslearn
https://github.com/unit8co/darts
https://github.com/johannfaouzi/pyts
https://github.com/cesium-ml/cesium
Also:
https://github.com/timeseriesAI/tsai
sampo|2 years ago
It's a fork of sktime. Last common commit before the fork is on Jan 30, 2023.