
Genetic algorithms for training deep neural networks (2017)

101 points | 1gor | 7 years ago | eng.uber.com

20 comments

wholemoley | 7 years ago
The article is from last year but it's still extremely valuable and interesting.

Exploring this topic is currently my primary hobby. Specifically, I've been using OpenAI's Retro (Sonic, Contra, Mario, Donkey Kong and, more recently, F-Zero) and comparing the ancient NEAT with more fashionable stuff like DQN, PPO, A3C and DDPG.

With my extremely limited experience, NEAT seems to outperform all of these other algorithms. I believe the advantage is the potential for strange/novel network structure.

And the best part is that NEAT doesn't require a powerful GPU.

Apologies for the shameless plug but here's a link to a series on youtube I made about using Retro and NEAT together to play Sonic. https://www.youtube.com/watch?v=pClGmU1JEsM&list=PLTWFMbPFsv...
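For anyone who hasn't used NEAT: it evolves both connection weights and network topology, tagging each structural gene with an innovation number so genomes stay alignable during crossover. A toy sketch of that genome representation, with illustrative names only (this is not the real neat-python API):

```python
# Toy sketch of a NEAT-style genome: connection genes carry innovation
# numbers (historical markings) so crossover can align genes, and a
# structural mutation grows the topology by splitting a connection.
import itertools
import random

random.seed(0)

class ConnGene:
    def __init__(self, src, dst, weight, innovation):
        self.src, self.dst = src, dst
        self.weight = weight
        self.enabled = True
        self.innovation = innovation  # global historical marking

class Genome:
    def __init__(self, innov):
        self.nodes = {0, 1, 2}  # nodes 0 and 1 are inputs; 2 is the output
        self.conns = [ConnGene(0, 2, random.uniform(-1, 1), next(innov)),
                      ConnGene(1, 2, random.uniform(-1, 1), next(innov))]

    def mutate_weights(self, sigma=0.1):
        for c in self.conns:
            c.weight += random.gauss(0, sigma)

    def mutate_add_node(self, innov):
        # Split an existing connection: disable it, insert a hidden node,
        # and bridge it with two new connection genes.
        old = random.choice([c for c in self.conns if c.enabled])
        old.enabled = False
        new_id = max(self.nodes) + 1
        self.nodes.add(new_id)
        self.conns.append(ConnGene(old.src, new_id, 1.0, next(innov)))
        self.conns.append(ConnGene(new_id, old.dst, old.weight, next(innov)))

innov = itertools.count()
g = Genome(innov)
g.mutate_weights()
g.mutate_add_node(innov)
```

The add-node mutation is what produces the "strange/novel network structure" mentioned above: topology is part of the genome, not fixed in advance.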

i_phish_cats | 7 years ago
You are evolving the topology, but using regular gradient descent/backprop for any given network, correct?
guybedo | 7 years ago
For those interested, I built a Python app on top of TensorFlow/Keras to do neural network architecture & hyperparameter search with genetic algorithms.

https://github.com/guybedo/minos

antirez | 7 years ago
Note that here the weights themselves are evolved, not just the hyperparameters. Gradient descent is not used at all, AFAIK.
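That is the key point of the paper: a simple GA perturbs a flat weight vector with noise and selects on fitness, with no backprop anywhere. A minimal runnable sketch of that loop, using a quadratic toy fitness as a stand-in for episode reward:

```python
# Minimal sketch of the weights-only approach: evolve a flat parameter
# vector with Gaussian mutation, truncation selection, and elitism.
# No gradients anywhere. The fitness function is illustrative only.
import numpy as np

rng = np.random.default_rng(0)
TARGET = np.linspace(-1.0, 1.0, 64)  # hidden optimum, stand-in for "reward"

def fitness(w):
    return -np.sum((w - TARGET) ** 2)  # higher is better

def evolve(pop_size=50, elites=10, sigma=0.1, gens=300):
    pop = [rng.normal(0.0, 1.0, TARGET.size) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elites]           # truncation selection
        pop = list(parents)              # elitism: keep parents unchanged
        while len(pop) < pop_size:
            p = parents[rng.integers(elites)]
            pop.append(p + rng.normal(0.0, sigma, p.size))  # mutation only
    return max(pop, key=fitness)

best = evolve()
```

In a real neuroevolution run, `fitness` would unpack the vector into network weights and return the episode reward; the outer loop is unchanged.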
BrandoElFollito | 7 years ago
I lost it at "emerging revolution".

My PhD thesis in 2000 already used genetic algorithms for seeds and it was hardly new then.

hacker_9 | 7 years ago
You're missing the point, I think. GAs aren't new, and neither are neural networks, which were thought up half a century ago. It's the finding that GAs work at such scale, with hundreds of layers and millions of nodes, and even outperform SGD, that is of interest here.
ur-whale | 7 years ago
Genetic algorithms are (and were already back in 2000) a pretty decent and -more importantly- generic solution to the problem of global optimization (as opposed to local optimization) when the problem to optimize has some sort of (maybe not-so-smooth) structure.

Many of the recent "AI" developments very often boil down to finding a local extremum using some sort of ski-down-the-slope optimization procedure (aka "training"). These techniques very rarely tackle global optimization, or, when they do, bundle it up under the moniker "hyper-parameter tuning".

A good example of something that falls under "global optimization" and isn't often tackled in deep learning would be finding the correct deep net architecture for a given problem.

The problem doesn't lend itself very well to local optimization, but might yield to GA-type optimization.
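As a concrete illustration of that last point, here is a sketch of GA-based architecture search. The genome is just a list of layer widths; the fitness shown is a synthetic stand-in (my invention for the example) for "train briefly, return validation accuracy":

```python
# Sketch of GA-based architecture search: a genome is a list of layer
# widths, mutated by widening, narrowing, adding, or dropping layers.
# The fitness here merely rewards a target depth/width so the loop runs.
import random

random.seed(1)

def fitness(arch):
    # Placeholder for "train this architecture briefly, score it".
    return -abs(len(arch) - 4) - sum(abs(w - 64) for w in arch) / 100

def mutate(arch):
    arch = list(arch)
    op = random.choice(["widen", "narrow", "add", "drop"])
    if op == "add":
        arch.insert(random.randrange(len(arch) + 1),
                    random.choice([16, 32, 64, 128]))
    elif op == "drop" and len(arch) > 1:
        arch.pop(random.randrange(len(arch)))
    else:
        i = random.randrange(len(arch))
        arch[i] = max(8, arch[i] * 2 if op == "widen" else arch[i] // 2)
    return arch

pop = [[32], [64, 64], [128, 16, 16]]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:3] + [mutate(random.choice(pop[:3])) for _ in range(7)]
best = max(pop, key=fitness)
```

Note there is no meaningful gradient over "number of layers", which is exactly why this problem resists local optimization but yields to GA-type search.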

LoSboccacc | 7 years ago
I had a genetically trained neural network driving a car simulation* in QBasic in 1997, and it wasn't "novel": it was quite common back then to combine them.

*A very limited one: imagine 2D, no inertia, 5 sensors at 15° from each other, each reporting the distance to the nearest "grass", with the road 20px wide, drawn by hand gray on green, and fitting the SCREEN 13 space. The network was 5×5 neurons, with all the weights part of the genetic code. The car never made it past the fourth corner, let alone did a lap.

mark_l_watson | 7 years ago
I tried training small RNN models using GAs around 1990. I had lunch with John Koza (the genetic programming pioneer) and he suggested that it was an interesting idea but would not scale. The Uber team, by controlling mutation, got it to scale - good for them. A few years later I used this as an example in my book "C++ Power Paradigms" (McGraw-Hill, 1994), covering genetic algorithms, neural networks, and constraint programming.
bayesian_horse | 7 years ago
Too many ideas, too little time! I have been thinking for a while about how deep learning and genetic algorithms could benefit each other.

GAs allow optimization of parameters without a differentiable loss function - a major problem when evaluating the behavior of a neural model, for example.

But GAs could also benefit from ML/DL: predicting the loss from a chromosome representation (to save computing time), learning to select promising pairs, and even learning crossover operators.
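To illustrate the first point, here is a toy GA maximizing a deliberately discontinuous objective (a count of coordinates falling inside a band), invented for this example. Its gradient is zero almost everywhere, so gradient descent gets no signal, but selection on fitness still climbs:

```python
# A GA on a non-differentiable objective: +1 per coordinate inside
# [0.4, 0.6]. The fitness is piecewise constant, so its gradient is
# zero almost everywhere, yet mutation plus selection still improves it.
import random

random.seed(0)

def fitness(x):
    # Discontinuous, integer-valued: useless for backprop.
    return sum(1 for v in x if 0.4 <= v <= 0.6)

def ga(dim=20, pop_size=30, gens=150, sigma=0.1):
    pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elites = pop[:5]                 # elitism: best survive unchanged
        pop = elites + [
            [v + random.gauss(0, sigma) for v in random.choice(elites)]
            for _ in range(pop_size - 5)
        ]
    return max(pop, key=fitness)

best = ga()
```

The second idea above (a learned surrogate predicting fitness from the chromosome) would slot in by replacing most calls to `fitness` with a cheap model, reserving true evaluations for the most promising candidates.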

DrBazza | 7 years ago
This is the not-so-secret sauce when training neural nets and backtesting for algorithmic trading. It dramatically reduces the time taken.
hacker_9 | 7 years ago
Sounds interesting, are you able to go into more detail?
vmchale | 7 years ago
step 1: hook it up to a car