top | item 17698415


AndrewGYork | 7 years ago

I asked this on Twitter, but maybe folks here can answer better: how important is nonlinearity for deep neural networks? This method's output seems to be a linear function of its (complex) input. Does that put important bounds on performance? https://mobile.twitter.com/AndrewGYork/status/10228414045888...



czr|7 years ago

Echoing the other respondents: if you don't have a nonlinearity, your whole network is just a sequence of linear transforms, which (multiplied out) is the same as a single linear transform. Meaning that removing the nonlinearities gives you (effectively) a one-layer network.
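The collapse czr describes is easy to check numerically. A minimal NumPy sketch (the layer sizes and random weights here are just illustrative assumptions): stacking several bias-free linear layers gives exactly the same map as one precomputed matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three "layers" of a purely linear network: weight matrices only, no activations.
W1 = rng.standard_normal((8, 16))
W2 = rng.standard_normal((4, 8))
W3 = rng.standard_normal((2, 4))

x = rng.standard_normal(16)

# Forward pass through the three linear layers.
deep = W3 @ (W2 @ (W1 @ x))

# The same map, multiplied out into a single matrix.
W = W3 @ W2 @ W1
shallow = W @ x

assert np.allclose(deep, shallow)
```

Interposing any nonlinear activation between the layers breaks this identity, which is exactly why depth buys expressive power only in the nonlinear case.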

transfire|7 years ago

Mind boggled that this article comes up now. I've been working on similar tech recently, and the question of non-linearity arose right away. The discussed conclusion was "impossible". Yet I was able to design a crude NAND gate, so there has to be non-linearity in the quantum nature of diffraction and interference.

raverbashing|7 years ago

Linear functions/transforms have two important properties

    f(x) + f(y) = f(x+y)
    k*f(x) = f(x*k)
(hence derivatives and Fourier transforms are linear transforms)

But yeah, in NNs nonlinearities are very important; otherwise the whole network would be simplifiable to a single linear transformation.
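Those two properties can be verified directly for a matrix map, which is the form every linear layer takes. A small sketch (the matrix and vectors are arbitrary assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

def f(v):
    # f(x) = A @ x is a linear map
    return A @ v

x = rng.standard_normal(5)
y = rng.standard_normal(5)
k = 2.5

# Additivity: f(x) + f(y) == f(x + y)
assert np.allclose(f(x) + f(y), f(x + y))

# Homogeneity: k * f(x) == f(k * x)
assert np.allclose(k * f(x), f(k * x))
```

Any activation like ReLU or sigmoid fails both checks, which is what makes it a nonlinearity in the first place.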

obastani|7 years ago

Though the other comments are correct, I want to point out that you can get some nontrivial behavior with only linear functions. For example, low-rank matrix factorization is kind of like a neural network

f(x) = U * V * x,

where U is an n by k matrix and V is a k by m matrix, with k much smaller than n and m. Basically, we are constraining the set of allowed linear transformations, which is a form of regularization. Convolutional layers in neural networks similarly restrict the allowed linear transformations.

Nevertheless, the power of linear neural networks is far less than that of nonlinear networks.
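The constraint obastani describes is concrete: the composed map U @ V can never have rank above k, and it needs far fewer parameters than a full n by m matrix. A quick sketch with assumed sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 100, 100, 5  # k much smaller than n and m

U = rng.standard_normal((n, k))
V = rng.standard_normal((k, m))

# f(x) = U @ (V @ x) is still linear, but constrained to rank <= k.
W = U @ V
assert np.linalg.matrix_rank(W) <= k

# Parameter count: n*k + k*m for the factorization vs n*m for a full matrix.
assert n * k + k * m < n * m
```

So even without nonlinearity, the factored form acts as a capacity restriction, i.e. a kind of structural regularization.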

1024core|7 years ago

Without a non-linearity, it's just a linear function. So the performance won't be that great.

h4b4n3r0|7 years ago

Nonlinearity is very important and is the only reason why neural nets can approximate arbitrary functions; you can't do that with linear transformations alone. Though from briefly skimming the paper, they do seem to achieve similar effects through phase modulation. Otherwise even MNIST would be out of the question.
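The classic minimal illustration of this point is XOR: no linear (or affine) map from two inputs to one output can match its truth table, but a tiny hand-built ReLU network can. A sketch (the specific weights are a standard hand-constructed solution, not from the paper under discussion):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

# Hand-built 2-2-1 ReLU network computing XOR:
#   h1 = relu(x1 + x2),  h2 = relu(x1 + x2 - 1),  output = h1 - 2*h2
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])

def xor_net(x):
    return float(w2 @ relu(W1 @ x + b1))

inputs = [np.array(p, dtype=float) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
outputs = [round(xor_net(x), 6) for x in inputs]
# XOR truth table: 0, 1, 1, 0
```

Remove the relu and the same architecture collapses to a single affine map, which cannot produce this output pattern no matter how the weights are chosen.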