Data doesn't actually live on a manifold; that's an approximation we use for thinking about data. The vast majority, if not 100%, of the useful things done in deep learning have come from not thinking about topology in any way. Deep learning is not applied anything: it's an empirical field advanced mostly by trial and error and, sure, a few intuitions coming from theory (and that theory was not topology).
saberience|9 months ago
It is primarily linear algebra, calculus, probability theory, and statistics; secondarily, you could add something like information theory for ideas like entropy, loss functions, etc.
But really, if "manifolds" had never been invented or conceptualized, we would still have deep learning today; the concept made zero impact on the practical technology we are all using every day now.
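To make that list concrete, here is a minimal sketch (a hypothetical toy example, not anyone's production code) of training a one-layer model: the forward pass is linear algebra, the loss is basic statistics, and the update is calculus (the chain rule) plus numerical optimization. No topology appears anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = 2x + 1 with a single linear layer.
X = rng.normal(size=(100, 1))
y = 2 * X + 1

W = np.zeros((1, 1))   # weight
b = np.zeros(1)        # bias
lr = 0.1

for _ in range(200):
    pred = X @ W + b                 # linear algebra: matrix product
    err = pred - y
    loss = (err ** 2).mean()         # statistics: mean squared error
    grad_W = 2 * X.T @ err / len(X)  # calculus: chain rule on the loss
    grad_b = 2 * err.mean(axis=0)
    W -= lr * grad_W                 # numerical optimization: gradient descent
    b -= lr * grad_b

print(W.round(2), b.round(2))  # ≈ [[2.]] [1.]
```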
behnamoh|9 months ago
I think these 'intuitions' are an after-the-fact thing: AFTER deep learning comes up with a method, researchers in other fields of science notice the similarities between the deep learning approach and their (possibly decades-old) methods. Here's an example where the author discovers that GPT reduces to the same computational problems he had already solved in physics:
https://ondrejcertik.com/blog/2023/03/fastgpt-faster-than-py...
niemandhier|9 months ago
Deep learning in its current form relates to a hypothetical underlying theory as alchemy does to chemistry.
In a few hundred years, the Inuktitut-speaking high schoolers of the civilisation that comes after us will learn that this strange word "deep learning" is a leftover from the lingua franca of yore.
Koshkin|9 months ago
Often, they do (and then they are called "sheaves").
wenc|9 months ago
In fact, not all ML models treat data as manifolds. Nearest neighbors and decision trees don't require the manifold assumption, and they can actually work better without it.
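As a hedged illustration of that point (a hand-written sketch, not a reference implementation): a 1-nearest-neighbor classifier only compares raw distances, so it makes no assumption that the data lies on or near a low-dimensional manifold.

```python
import math

def predict_1nn(train, query):
    """train: list of (point, label) pairs; query: a point (tuple of floats).
    Returns the label of the closest training point -- pure distance
    comparison, with no manifold assumption anywhere."""
    _, label = min(train, key=lambda pl: math.dist(pl[0], query))
    return label

# Labels scattered arbitrarily in 3-D; no underlying structure needed.
train = [((0, 0, 0), "a"), ((5, 5, 5), "b"), ((0, 5, 0), "c")]
print(predict_1nn(train, (1, 0, 1)))  # -> a
```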
motoboi|9 months ago
See? Everything lives in the manifold.
Now, for a great visualization of the Manifold Hypothesis, I cannot recommend this video enough: https://www.youtube.com/watch?v=pdNYw6qwuNc
It helps you visualize how activation functions, biases, and weights (linear transformations) stretch the high-dimensional space so that the data gets pushed to the extremes and settles onto a low-dimensional object (the manifold) embedded in that high-dimensional space, where it is trivial to classify or separate.
Gaining an intuition for this process makes many deep learning practices much easier to understand.
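That stretching can be caricatured in a few lines (a hand-constructed example, not taken from the video): 1-D points labeled by distance from the origin are not linearly separable, but one layer of weights plus a ReLU folds the space into a representation where a single threshold separates the classes.

```python
import numpy as np

# Points near 0 are class 0; points far from 0 are class 1.
x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
y = (np.abs(x) > 1).astype(int)

# No single threshold on raw x works: class 1 sits on both sides of class 0.

# One layer: the weights W stretch x into 2-D, then ReLU folds the space.
W = np.array([[1.0], [-1.0]])        # two hidden units
h = np.maximum(W @ x[None, :], 0.0)  # ReLU(Wx), shape (2, n)

# In the new space, h[0] + h[1] == |x|, so one linear threshold suffices.
pred = (h.sum(axis=0) > 1).astype(int)
print((pred == y).all())  # -> True
```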
thuuuomas|9 months ago
Even if existing theory is inadequate, would an operating theory not be beneficial?
Or is the mystique, combined with guess-and-check drudgery, job security?
lumost|9 months ago
For instance, we do not have consensus on what a theory should accomplish - should it provide convergence bounds/capability bounds? Should it predict optimal parameter counts/shapes? Should it allow more efficient calculation of optimal weights? Does it need to do these tasks in linear time?
Even materials science in metals is still cycling through theoretical models after thousands of years of making steel and other alloys.
danielmarkbruce|9 months ago
There is an enormous amount of theory used in the various parts of building models, there just isn't an overarching theory at the very most convenient level of abstraction.
It almost has to be this way. If there were some neat theory, people would use it, build even more complex things on top of it experimentally, and so on.
baxtr|9 months ago
Physics is just applied mathematics
Chemistry is just applied physics
Biology is just applied chemistry
It doesn’t work very well.
constantcrying|9 months ago
Neural networks consist almost exclusively of two parts: numerical linear algebra and numerical optimization.
Even if you reject the abstract topological description, numerical linear algebra and optimization couldn't be any more directly applicable.
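A small sketch of those two pillars agreeing on the same problem (a hypothetical toy, with arbitrary sizes and step counts): a least-squares fit solved once with numerical linear algebra (a closed-form solver) and once with numerical optimization (gradient descent), landing on the same answer.

```python
import numpy as np

rng = np.random.default_rng(1)

# One least-squares problem, attacked with both pillars.
A = rng.normal(size=(50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true

# Pillar 1 -- numerical linear algebra: closed-form least-squares solve.
x_direct = np.linalg.lstsq(A, b, rcond=None)[0]

# Pillar 2 -- numerical optimization: gradient descent on ||Ax - b||^2.
x_gd = np.zeros(3)
for _ in range(2000):
    x_gd -= 0.01 * 2 * A.T @ (A @ x_gd - b) / len(b)

print(np.allclose(x_direct, x_gd, atol=1e-4))  # -> True
```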
yubblegum|9 months ago
Of course. Now, to actually deeply understand what is happening with these constructs, we will use topology. Topological insights will then, without doubt, inform the next generations of this technology.
nomel|9 months ago
Coming up with an idea for how something works by applying your expertise is the foundation of intelligence and learning, and it was behind every single advancement in human understanding.
People thinking is always a good thing. Thinking about the unknown is better. Thinking with others is best, and sharing those thoughts isn't somehow bad, even if they're incomplete.
HarHarVeryFunny|9 months ago
Even with LLMs, there's no real mystery about why they work so well: they produce human-like input continuations (aka "answers") because they are trained to predict continuations of human-generated training data. Maybe we should be a bit surprised that the continuation signal is there in the first place, but given that it evidently is, it's no mystery that LLMs are able to use it; that's just testimony to the power of the transformer as a predictive architecture, and of course to gradient descent as a cold, unthinking way of finding an error minimum.
Perhaps you meant how LLMs work rather than why they work, but I'm not sure there's any real mystery there either: the transformer itself is all about key-based attention, and we now know that training a transformer seems to consistently cause it to leverage attention to learn "induction heads" (using pairs of adjacent attention heads), which are the main data-finding/copying primitive they use to operate.
Of course, knowing how an LLM works in broad strokes isn't the same as knowing specifically how it works in any given case (how it transforms a specific input, layer by layer, to create the given output), but that seems a bit like saying that because I can't describe precisely why you had pancakes for breakfast, we don't know how the brain works.
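The induction-head behavior mentioned above can be caricatured in a few lines (a hypothetical, non-neural sketch of the find-and-copy primitive, not an actual transformer): look back for an earlier occurrence of the current token and predict whatever followed it.

```python
def induction_head_sketch(tokens):
    """Predict the next token by copying: find the most recent earlier
    occurrence of the last token and return the token that followed it."""
    last = tokens[-1]
    # Scan backwards over earlier positions (the 'attend' step).
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]  # the 'copy' step
    return None  # no earlier match to attend to

# "A B ... A" -> predict "B", mirroring the [A][B] ... [A] -> [B] pattern.
print(induction_head_sketch(["the", "cat", "sat", "on", "the"]))  # -> cat
```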