top | item 16609033

Deep Quaternion Networks (2017) [pdf]

151 points | adamnemecek | 8 years ago | arxiv.org | reply

73 comments

[+] adamnemecek|8 years ago|reply
There's a lot of space to be explored at the intersection of ML and hypercomplex numbers. There's a Clifford SVM that, unlike a regular SVM which learns a hyperplane, learns any manifold.
[+] thesz|8 years ago|reply
The fine article generalizes complex numbers to quaternions. Okay.

But quaternions are themselves generalized by geometric algebra (GA), and there is plenty of information about the use of GA in neural computing: https://arxiv.org/pdf/1305.5663.pdf (page 3). For example, a universal approximation theorem for GA is presented at https://www.informatik.uni-kiel.de/inf/Sommer/doc/Dissertati...

I think the fine article is a step back.

[+] cgearhart|8 years ago|reply
Thanks for sharing this. There's a lot to digest in there, but there were a few highlights that stood out as possibly relevant to the OP paper.

> Theorem 6.4 ([2]) Complex FCMLPs having (6.9) as activation function are only universal approximators in L∞ for the class of analytic functions, but not for the class of complex continuous functions.

> ... the complex numbers (C0,1) are a subalgebra of the quaternions (C0,2). Hence the quaternionic logistic function is also unbounded. Neither could it give rise to universal approximation (w.r.t. L∞) since this does not hold for the complex case. One may argue that such things become more and more less important when proceeding to higher dimensional algebras since less and less components are affected. This is somehow true, but it hardly justify the efforts.

> ... Summarising all of the above the case of Complex FCMLPs looks settled down in a negative way. ... Hence Complex FCMLPs remain not very promising.

Unless I'm misreading, it seems already known that you _can_ use complex numbers (or quaternions) in neural networks...but you don't really gain anything from doing it.

[+] enriquto|8 years ago|reply
Geometric algebra is the Rust of mathematics. Can't you guys simply let people do their work without trying to evangelize?
[+] Jeff_Brown|8 years ago|reply
I skipped to the table at the end. The gains don't seem enormous. Is there a kind of problem where we would expect quaternions to perform dramatically better than other kinds of numbers?
[+] highd|8 years ago|reply
Not to mention that it appears they're comparing against networks of the same architecture. If you build your quaternion components with the same number types as your reals, you effectively have four times the number of parameters, which could account for most of the benefit. They should also benchmark against similar architectures with equivalent parameter counts.
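The parameter-count concern can be made concrete with a back-of-the-envelope sketch (the layer sizes below are hypothetical, not from the paper): a quaternion dense layer with the same number of units stores four real components per weight.

```python
def real_dense_params(n_in, n_out):
    """Weights in a real-valued dense layer with n_in inputs and n_out units (bias ignored)."""
    return n_in * n_out

def quaternion_dense_params(n_in, n_out):
    """Same architecture, but each weight is a quaternion, i.e. 4 real components."""
    return 4 * n_in * n_out

n_in, n_out = 128, 256
print(real_dense_params(n_in, n_out))        # 32768
print(quaternion_dense_params(n_in, n_out))  # 131072 -- 4x, with identical unit counts
```

So a fair comparison would shrink the quaternion network's unit counts until the real-parameter totals match.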
[+] gaudetcj|8 years ago|reply
This paper was mostly to lay out the framework and provide the Keras layers for others to use. We expect the biggest improvements to come from segmentation, where the gains may come from treating each color channel as an imaginary axis, and from architectures like PointNet: https://arxiv.org/abs/1612.00593.
[+] doyoulikeworms|8 years ago|reply
I’m able to follow neither the article nor the discussion. What would I have to learn in order to be able to?

Even if it was, like, years of studying. I’m just curious how deep this rabbit hole is.

[+] theoh|8 years ago|reply
Quaternions are an extension of the idea of complex numbers. Complex numbers have a real and an imaginary part, while quaternions have a real part and three imaginary parts. So the basic idea is that these richer types of number, when used to build a network instead of plain real numbers, have benefits.

So to get started with reading this paper you just need to learn about deep learning, and then also the very basics of quaternions, which would be taught in, for example, a first course on abstract algebra.

[+] cgmg|8 years ago|reply
In summary: Just as the complex numbers are defined by adding a new element i such that i^2 = -1, the quaternions are defined by adding elements i, j, k such that i^2 = j^2 = k^2 = ijk = -1.
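Those defining relations pin down the whole multiplication table. A minimal sketch (not from the paper) of the Hamilton product on `(w, x, y, z)` tuples, with the relations checked directly:

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

assert qmul(i, i) == qmul(j, j) == qmul(k, k) == minus_one  # i^2 = j^2 = k^2 = -1
assert qmul(qmul(i, j), k) == minus_one                     # ijk = -1
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)      # ij = k, ji = -k
```

Note the last line: unlike the complex numbers, quaternion multiplication is not commutative.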
[+] loxias|8 years ago|reply
Does the use of complex numbers really provide improvement? How does this work? (Other than cramming two numbers into one... which itself is suspect... the complex plane has the same cardinality as the reals...)
[+] danharaj|8 years ago|reply
Cardinality is a complete red herring here: we don't care about the set-theoretic structure, and ultimately we're taking finite approximations anyway. The structures we care about are the metric, which tells us which solutions (neural nets in this case) are near each other, and the algebra, which tells us how to compose solutions.

The algebra of real numbers is simply less structured than the complex numbers. One of the key properties of the complex numbers is that they naturally have both a magnitude and a phase. This lets them capture phenomena that have a notion of superposition and interference.

As you correctly pointed out, you can simulate a complex number with two real numbers. The key is to exploit the particular geometric and algebraic properties of the complexes. One example in neural networks is the phenomenon of synchronization, where the outputs of neurons depending on the presence of a particular stimulus all have the same phase. This can be exploited for applications such as object segmentation.

So the widest possible view of this line of research is that putting more algebraic structure on your parameters can improve the behavior of your learning algorithms. My extremely hot take on how far this can go is a full fledged integration of harmonic analysis and representation theory into the theory of deep learning.
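The synchronization idea above can be illustrated with a toy sketch (hypothetical, not from the papers in this thread): neuron outputs encoded as unit-magnitude complex numbers reinforce each other when their phases agree and cancel when they don't.

```python
import cmath

# Two unit-magnitude "neuron outputs": magnitude encodes activation strength,
# phase carries grouping information (e.g. which object a pixel belongs to).
a = cmath.rect(1.0, 0.3)              # magnitude 1, phase 0.3 rad
b = cmath.rect(1.0, 0.3)              # synchronized: same phase as a
c = cmath.rect(1.0, 0.3 + cmath.pi)   # anti-phase relative to a

print(abs(a + b))  # constructive interference, ~2.0
print(abs(a + c))  # destructive interference, ~0.0
```

Two real numbers can store the same data, but only the complex product and sum give you this interference behavior for free.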

[+] gnulinux|8 years ago|reply
Cardinality has nothing to do with this discussion. Quaternions have a different structure than the real numbers. Over any set you can impose any (reasonable) structure you want, so cardinality alone isn't a very interesting question. For example, Q and Z have the same cardinality but look nothing alike, beyond the fact that Q is the fraction field of the Euclidean domain Z.
[+] moultano|8 years ago|reply
It isn't just the size of the space that matters, but how smooth and connected it is.
[+] eleitl|8 years ago|reply
This assumes numerics is free at very large scale, which is not reasonable if you want to create efficient biologically inspired AI.
[+] mike_n|8 years ago|reply
if quaternions, why not octonions?
[+] wespiser_2018|8 years ago|reply
Quaternions are known to represent spatial transforms, and there is a little bit of prior work that demonstrates quaternion filters 'make sense'.

However, octonions are the obvious next step here: if you look at Appendix Figure 1 of "Deep Complex Networks" [1], the authors used (Real + Complex), and Figure 1 of our paper [2] with quaternions uses (Real + Complex + Complex + Complex)!

[1] https://arxiv.org/pdf/1705.09792.pdf [2] https://arxiv.org/pdf/1712.04604.pdf
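The Cayley-Dickson construction makes that "doubling" pattern concrete (a sketch, not from either paper): a quaternion is a pair of complex numbers, and applying the same product rule to pairs of quaternions would give octonions.

```python
def cd_mul(p, q):
    """Cayley-Dickson product on pairs of complex numbers:
    (a, b) * (c, d) = (a*c - conj(d)*b, d*a + b*conj(c)).
    With complex entries this reproduces quaternion multiplication."""
    a, b = p
    c, d = q
    return (a * c - d.conjugate() * b, d * a + b * c.conjugate())

# Quaternion units as pairs of complex numbers: q = a + b*j with a, b complex.
i = (1j, 0j)
j = (0j, 1 + 0j)

k = cd_mul(i, j)
assert k == (0j, 1j)                  # ij = k
assert cd_mul(j, i) == (0j, -1j)      # ji = -k (non-commutative)
assert cd_mul(k, k) == (-1 + 0j, 0j)  # k^2 = -1
```

Each doubling step loses a property (commutativity for quaternions, associativity for octonions), which is part of why the octonion case is less straightforward.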

[+] danharaj|8 years ago|reply
The lack of associativity might suck.
[+] snissn|8 years ago|reply
and to that - why not N-nions for large values of N?
[+] dschuetz|8 years ago|reply
You had me on "The field of deep learning...". Sounds seriously scientific to me. What's the next big flashy field? "Deep thought"? Oh, nope, Douglas Adams already covered that one.