top | item 14798264

plingamp | 8 years ago

In the past, I've found that these higher-level libraries built on top of TF are useful for quick model building, but they should be used cautiously. Because they ship with default hyperparameters, it's easy to blindly build semi-working models without understanding what's happening. I'd recommend either reading the associated papers or implementing the models yourself (at least once) before using the pre-built versions. That said, I'm really excited by this project! I think it'll save researchers a lot of time.

TheIronYuppie | 8 years ago

Disclosure: I work at Google.

This is great feedback! I'd love to hear more - if you'd like to send me some examples of pitfalls you've seen in the past, I'd be glad to share them with the team.

Thanks! aronchick (at) google

gwern | 8 years ago

I like their focus on flexibility. I've tried a few deep RL implementations in the past and run into issues like their DQN or A3C implementation being hardwired in a number of ways to work only on ALE, with no way to use them on other problems (e.g. the CNN dimensions are hardwired).

naturalgradient | 8 years ago

If I understand the project correctly, it specifically does not rely on default hyperparameters, but instead exposes all configuration through the declarative interface.
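As a minimal Python sketch of what a configuration-first interface might look like (this is a hypothetical illustration, not this project's actual API — the spec keys and `validate_spec` helper are made up): every hyperparameter is spelled out in the declarative spec, and a layer that omits one is rejected rather than silently defaulted.

```python
# Hypothetical declarative network spec: every hyperparameter is
# explicit in the configuration, so nothing is applied silently.
network_spec = [
    {"type": "dense", "units": 64, "activation": "relu"},
    {"type": "dense", "units": 64, "activation": "tanh"},
    {"type": "dense", "units": 4, "activation": None},  # explicit linear output
]

def validate_spec(spec):
    """Reject layers that omit a hyperparameter instead of defaulting it."""
    required = {"type", "units", "activation"}
    for i, layer in enumerate(spec):
        missing = required - layer.keys()
        if missing:
            raise ValueError(f"layer {i} is missing {sorted(missing)}")
    return spec

validate_spec(network_spec)
```

The design choice here is the point of the parent comment: forcing the caller to write `"activation": None` makes "no activation" a visible decision rather than an invisible default.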

plingamp | 8 years ago

I may have used the term "hyperparameter" too loosely. Yes, this project does a good job of taking a configuration-first approach, but even so it sets some defaults. For example, relu is the default layer activation function. I haven't had time to see what other defaults are being set.
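The pitfall being described can be sketched in a few lines of plain Python (a hypothetical builder function, not this project's code): a layer constructor that defaults `activation="relu"` silently clips negative outputs for any caller who never passes the argument.

```python
import math

# Hypothetical single-unit "dense" layer with a silent default.
# Callers who never pass `activation` get relu without realizing it.
def dense(x, weight, bias, activation="relu"):
    z = x * weight + bias
    if activation == "relu":
        return max(0.0, z)
    if activation == "tanh":
        return math.tanh(z)
    if activation is None:
        return z  # linear
    raise ValueError(f"unknown activation: {activation!r}")

# A negative pre-activation is silently zeroed under the default:
dense(-1.0, 1.0, 0.0)        # relu applied implicitly -> 0.0
dense(-1.0, 1.0, 0.0, None)  # explicit linear -> -1.0
```

If the library had no default, the first call would fail loudly and the user would have to make the choice themselves — which is the trade-off the comment above is pointing at.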