dccsillag | 1 year ago
In practice, I've personally run some benchmarks on a collection of datasets I had lying around. The results were generally abysmal, with the method matching simple baselines on only a few datasets.
Finally, the original paper is very weird and reads more like a marketing piece. The theory, which is touted throughout the paper, is very weak; the actual algorithm is not explained well enough; and the experiments are lacking. In particular, I find it telling that they do not include, and even go out of their way to ignore, important baselines such as boosted trees, which are the state-of-the-art solution to the problem they intended to solve (and which work very well even in cases where they claim that both KANs and MLPs perform badly, e.g. in high dimensions).
SpaceManNabs | 1 year ago
Only one follow-up question:
> I also can't see how to incorporate inductive biases other than the standard R^n / tabular regression one, and the existing attempts on this that I'm aware of are just band-aids (along the lines of feature engineering)
A lot of the way we introduce inductive biases in the traditional network setting (where activations sit on the nodes rather than on the edges, as in KANs) is by using graph-based architectures, like convolutional networks or transformers, or by setting up particular losses and optimizations, as in equivariant networks. Can't we do the same thing for KANs?
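To make the node-versus-edge distinction concrete, here is a minimal NumPy sketch of the two layer types. It is only an illustration under stated assumptions: the names `mlp_layer` and `kan_layer` are hypothetical, and the learnable edge functions use a Gaussian radial basis as a simple stand-in rather than the B-splines used in the KAN paper.

```python
import numpy as np

def mlp_layer(x, W, b):
    """MLP layer: learnable linear map, then a FIXED nonlinearity on each node."""
    return np.tanh(W @ x + b)

def kan_layer(x, coeffs, centers, width):
    """KAN-style layer: a LEARNABLE 1-D function on each edge, summed at each node.

    x       : (n_in,) input vector
    coeffs  : (n_out, n_in, n_basis) learnable coefficients, one set per edge
    centers : (n_basis,) basis centers (assumption: Gaussian basis, not B-splines)
    width   : scalar basis width

    Edge (j, i) applies phi_ji(t) = sum_k coeffs[j, i, k] * exp(-((t - centers[k]) / width)**2).
    Output node j returns sum_i phi_ji(x[i]), with no further nonlinearity.
    """
    # Evaluate every basis function at every input coordinate: (n_in, n_basis)
    basis = np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)
    # Per-edge function values phi_ji(x_i): (n_out, n_in)
    edge_values = np.einsum("jik,ik->ji", coeffs, basis)
    # Each output node simply sums its incoming edges
    return edge_values.sum(axis=1)
```

In this sketch the question above becomes whether constraints on `coeffs` (for example, sharing the same edge functions across spatial positions, as a convolution shares weights) can play the role that weight sharing plays for `W` in the MLP case.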