There's a lot of space to be explored at the intersection of ML and hypercomplex numbers. For example, there's a Clifford SVM that, unlike a regular SVM (which learns a separating hyperplane), can learn more general manifolds.
Thanks for sharing this. There's a lot to digest in there, but there were a few highlights that stood out as possibly relevant to the OP paper.
> Theorem 6.4 ([2]) Complex FCMLPs having (6.9) as activation function are only universal approximators in L∞ for the class of analytic functions, but not for the class of complex continuous functions.
> ... the complex numbers (C0,1) are a subalgebra of the quaternions (C0,2). Hence the quaternionic logistic function is also unbounded. Neither could it give rise to universal approximation (w.r.t. L∞) since this does not hold for the complex case. One may argue that such things become more and more less important when proceeding to higher dimensional algebras since less and less components are affected. This is somehow true, but it hardly justify the efforts.
> ... Summarising all of the above the case of Complex FCMLPs looks settled down in a negative way. ... Hence Complex FCMLPs remain not very promising.
Unless I'm misreading, it seems already known that you _can_ use complex numbers (or quaternions) in neural networks...but you don't really gain anything from doing it.
I skipped to the table at the end. The gains don't seem enormous. Is there a kind of problem where we would expect quaternions to perform dramatically better than other kinds of numbers?
Not to mention that it appears they're comparing against networks of the same architecture. If you build your quaternion components with the same number types as your reals, you effectively have 4 times the number of parameters, which could account for most of the benefit. They should also benchmark against similar architectures with equivalent parameter counts.
This paper was mostly meant to lay out the framework and provide the Keras layers for others to use. We expect the biggest improvements to come from segmentation, where the gains may come from treating each color channel as an imaginary axis, and from architectures like PointNet (https://arxiv.org/abs/1612.00593).
Quaternions are an extension of the idea of complex numbers. A complex number has a real part and one imaginary part, while a quaternion has a real part and three imaginary parts. So the basic idea is that these richer number types, when used to build a network instead of plain real numbers, have benefits.
So to get started with reading this paper you just need to learn about deep learning, and then also the very basics of quaternions, which would be taught in, for example, a first course on abstract algebra.
In summary: Just as the complex numbers are defined by adding a new element i such that i^2 = -1, the quaternions are defined by adding elements i, j, k such that i^2 = j^2 = k^2 = ijk = -1.
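Those defining relations can be checked directly in a few lines of code. A minimal sketch (not from the paper) of the Hamilton product:

```python
# Quaternion multiplication (the Hamilton product), written out
# component-wise, just to make the defining relations concrete.
class Quat:
    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z

    def __mul__(self, o):
        return Quat(
            self.w * o.w - self.x * o.x - self.y * o.y - self.z * o.z,
            self.w * o.x + self.x * o.w + self.y * o.z - self.z * o.y,
            self.w * o.y - self.x * o.z + self.y * o.w + self.z * o.x,
            self.w * o.z + self.x * o.y - self.y * o.x + self.z * o.w,
        )

    def __eq__(self, o):
        return (self.w, self.x, self.y, self.z) == (o.w, o.x, o.y, o.z)

i = Quat(0, 1, 0, 0)
j = Quat(0, 0, 1, 0)
k = Quat(0, 0, 0, 1)
minus_one = Quat(-1, 0, 0, 0)

# The defining relations: i^2 = j^2 = k^2 = ijk = -1
assert i * i == minus_one
assert j * j == minus_one
assert k * k == minus_one
assert i * j * k == minus_one
```

Note that i * j = k but j * i = -k, so unlike the complexes, quaternion multiplication is not commutative.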
Does the use of complex numbers really provide an improvement? How does this work? (Other than cramming 2 numbers into 1... which itself is suspect... the complex plane has the same cardinality as the reals...)
Cardinality is a complete red herring here: we don't care about the set-theoretic structure, and ultimately we're taking finite approximations anyway. The structures we care about are the metric, which tells us which solutions (neural nets in this case) are near each other, and the algebra, which tells us how to compose solutions.
The algebra of real numbers is simply less structured than the complex numbers. One of the key properties of the complex numbers is that they naturally have both a magnitude and a phase. This lets them capture phenomena that have a notion of superposition and interference.
As you correctly pointed out, you can simulate a complex number with two real numbers. The key is to exploit the particular geometric and algebraic properties of the complexes. One example in neural networks is the phenomenon of synchronization, where the outputs of neurons depending on the presence of a particular stimulus all have the same phase. This can be exploited for applications such as object segmentation.
So the widest possible view of this line of research is that putting more algebraic structure on your parameters can improve the behavior of your learning algorithms. My extremely hot take on how far this can go is a full fledged integration of harmonic analysis and representation theory into the theory of deep learning.
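The magnitude-and-phase structure described above can be sketched in a few lines. A hypothetical illustration (not tied to any particular network):

```python
import cmath

# Multiplying complex numbers multiplies magnitudes and adds phases --
# structure that a bare pair of real numbers doesn't come with.
a = cmath.rect(2.0, 0.3)   # magnitude 2.0, phase 0.3 rad
b = cmath.rect(1.5, 0.9)   # magnitude 1.5, phase 0.9 rad
c = a * b

assert abs(abs(c) - 3.0) < 1e-9          # magnitudes multiply: 2.0 * 1.5
assert abs(cmath.phase(c) - 1.2) < 1e-9  # phases add: 0.3 + 0.9

# Interference: two unit signals with the same phase ("synchronized",
# as in the segmentation example) reinforce; opposite phases cancel.
sync = cmath.rect(1, 0.5) + cmath.rect(1, 0.5)
anti = cmath.rect(1, 0.5) + cmath.rect(1, 0.5 + cmath.pi)
assert abs(abs(sync) - 2.0) < 1e-9
assert abs(anti) < 1e-9
```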
Cardinality has nothing to do with this discussion. Quaternions have different structure than the real numbers. On a set of any given (infinite) cardinality you can define almost any structure you want, so cardinality alone isn't a very interesting question. For example, Q and Z have the same cardinality, but they look nothing alike beyond the fact that Q is the fraction field of the Euclidean domain Z.
Quaternions are known to represent spatial transforms, and there is a little bit of prior work that demonstrates quaternion filters 'make sense'.
However, octonions are the obvious next step here: if you look at Appendix Figure 1 of "Deep Complex Networks" [1], the authors used (Real + Complex), and Figure 1 of our paper [2] with quaternions uses (Real + Complex + Complex + Complex)!
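As a side note, the claim that quaternions represent spatial transforms can be made concrete with the standard sandwich product, q * v * conj(q), with the vector v embedded as a pure quaternion (0, v). A minimal sketch (not from either paper):

```python
import math

# Hamilton product on (w, x, y, z) tuples.
def qmul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

# Rotate a 3D vector with a unit quaternion via q * (0, v) * conj(q).
def rotate(q, v):
    conj = (q[0], -q[1], -q[2], -q[3])
    w, x, y, z = qmul(qmul(q, (0.0, *v)), conj)
    return (x, y, z)

# Rotate the x axis 90 degrees about z: expect the y axis.
half = math.pi / 4  # half the rotation angle
q = (math.cos(half), 0.0, 0.0, math.sin(half))
vx, vy, vz = rotate(q, (1.0, 0.0, 0.0))
assert abs(vx) < 1e-9 and abs(vy - 1.0) < 1e-9 and abs(vz) < 1e-9
```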
You had me at "The field of deep learning...". Sounds seriously scientific to me. What's the next big flashy field? "Deep thought"? Oh, nope, Douglas Adams already covered that one.
thesz | 8 years ago:
But quaternions are themselves generalized by geometric algebra (GA), and there is plenty of information about the use of GA in neural computing: https://arxiv.org/pdf/1305.5663.pdf (page 3). For example, a universal approximation theorem for GA is presented at https://www.informatik.uni-kiel.de/inf/Sommer/doc/Dissertati...
I think that fine article is a step back.
doyoulikeworms | 8 years ago:
Even if it was, like, years of studying. I’m just curious how deep this rabbit hole is.
[1] https://arxiv.org/pdf/1705.09792.pdf
[2] https://arxiv.org/pdf/1712.04604.pdf