Ask HN: How do you version control your neural nets?
42 points | mlejva | 8 years ago
The same approach doesn't work with neural nets for me. There's 'only' one feature you want to implement: you want your neural net to generalise better, generate better images, etc. (depending on the type of problem you are solving). That goal is very abstract, though. Often you don't even know what the solution is until you empirically tweak several hyperparameters and watch the loss and accuracy. This makes the branch model impossible to use, I think.

Consider this: you create a branch where you try convolutional layers, for example. Then you find out that your net performs worse. What should you do now? You can't merge the branch into your develop branch, since it's basically a 'dead end'. On the other hand, if you delete the branch, you lose the record that you've already tried that model. This also produces a huge number of branches, since there is an enormous number of combinations for your model (e.g. convolutional layers may yield better accuracy when combined with a different loss function).
I've ended up with a single branch and a text file where I manually log every model I've tried so far, along with its performance. This creates nontrivial overhead, though.
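Each line in the log looks something like this (the entry below is made up for illustration):

    2017-06-14 | 3x conv(64) + cross entropy | lr=1e-3 | val_acc=0.87 | worse than dense baseline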
btown | 8 years ago
If you check this file in, every commit will include the diff of everything you tried to get there alongside the final source, and the file itself becomes a single historical record of everything you ever tried. Asking yourself a month later "did I ever try cross entropy?" is as easy as grepping the file.
Heck, you could insert into a database as well if you really wanted to, and visualize your performance changes over time a la http://isfiberreadyyet.com/ . Sky's the limit.
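A minimal sketch of that idea, assuming SQLite (the schema, names, and numbers are illustrative):

    import sqlite3
    from datetime import datetime, timezone

    conn = sqlite3.connect("experiments.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs (ts TEXT, description TEXT, loss REAL, accuracy REAL)"
    )

    def log_run(description, loss, accuracy):
        # One row per experiment; timestamps make the history plottable over time.
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), description, loss, accuracy),
        )
        conn.commit()

    log_run("3x conv(64), cross entropy, lr=1e-3", loss=0.41, accuracy=0.87)

    # "Did I ever try cross entropy?" becomes a query instead of a grep:
    for row in conn.execute(
        "SELECT ts, description, accuracy FROM runs WHERE description LIKE '%cross entropy%'"
    ):
        print(row)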
cityhall | 8 years ago
I end up with a bunch of code like:
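(a sketch of the pattern - the keys, paths, and PyTorch calls are illustrative)

    import json
    import random

    import numpy as np
    import torch

    # Everything that defines the run, in one place.
    hparams = {"lr": 1e-3, "batch_size": 64, "n_conv_filters": 32, "loss": "cross_entropy"}
    source_checkpoint = "checkpoints/baseline.pt"  # weights this run starts from
    seed = 1234

    # Pin every RNG the run touches so it can be replayed exactly.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Append one JSON record per run; the file doubles as the experiment log.
    with open("runs.jsonl", "a") as f:
        f.write(json.dumps({"hparams": hparams,
                            "source_checkpoint": source_checkpoint,
                            "seed": seed}) + "\n")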
By logging the hyperparameter dict, source checkpoint, and random seed, results should be reproducible. This works well for rapid iteration, e.g. in Jupyter notebooks. For models that take days to train, you might as well use source control for your scripts.
taroth | 8 years ago
I got tired of maintaining one-off scripts to do recording, so I started working with friends on a dedicated solution. Today it lets you stream logs via a small Python library, then view individual training runs in an iOS/Android app. It takes less than a minute to get set up.
We're planning on expanding to model versioning in the next few weeks. It's interesting to see how others are thinking about it. If you have model versioning thoughts you don't feel like posting here, drop me a note at [email protected]
rpedela | 8 years ago
If the former, you could try a single experiment branch and use tags to denote different experiments. Add a tag when you finish an experiment, then overwrite with your changes for the next experiment, and repeat. This would keep all the changes while not leaving a huge number of dead branches, and the branch could be merged when necessary.
If the latter, why not check in an experiment log that has a similar form to a changelog? Or maybe create an issue and a branch for each experiment, then update the issue with the results and delete the branch?
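For the checked-in log, an entry could be as simple as this (format purely illustrative):

    ## 2017-06-14: try conv layers + cross entropy
    - model: 3x conv(64) -> dense(128), batch norm
    - hyperparams: lr=1e-3, batch_size=64, epochs=50
    - result: val accuracy 0.87 (dense baseline: 0.89)
    - verdict: dead end, keep the dense baseline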
andbberger | 8 years ago
I've converged to a workflow where I maintain a library with the main project pipeline and reusable tools, and do all scripting in Jupyter (with all notebooks version controlled).
I've found that machine learning projects can be parametrized pretty effectively with config dicts for the data, training, and model. Each type of config gets its own pipelined method that does all of the library calls: pipeline_batch_gen, pipeline_train, pipeline_build_model.
Example of a poorly organized config from a project:
    model_config = {
        'optimizer': optimizer,
        'clip_grad': clip_grads,
        'name': model_name,
        'residual': residual,
        'n_conv_filters': n_conv_filters,
        'n_output_hus': n_output_hus,
        'activation': activation,
        'batch_norm': batch_norm,
        'output_bn': output_bn,
        'generation': generation,
        'data_spec': {
            'uniform_frac': uniform_frac,
            'include_augment': True,
            'batch_size': batch_size,
            'bulk_chunk_size': bulk_chunk_size,
            'max_bulk_chunk_size': max_bulk_chunk_size,
            'loss_weighter': loss_weight
        },
        'train_spec': {
            'early_stopping_patience': early_stopping_patience,
            'lr_plateau_patience': lr_plateau_patience,
            'learning_rate': init_lr,
            'clip_grads': clip_grads,
            'partial_weight': partial_weight
        }
    }
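And a minimal sketch of how the pipelined methods consume it (build_conv_stack and fit are hypothetical stand-ins for the real library calls):

    def pipeline_build_model(model_config):
        # The model is fully described by the config dict, so logging the
        # config records the whole experiment.
        return build_conv_stack(          # hypothetical library call
            n_filters=model_config['n_conv_filters'],
            activation=model_config['activation'],
            batch_norm=model_config['batch_norm'],
            residual=model_config['residual'],
        )

    def pipeline_train(model, model_config):
        train_spec = model_config['train_spec']
        return fit(                       # hypothetical library call
            model,
            learning_rate=train_spec['learning_rate'],
            early_stopping_patience=train_spec['early_stopping_patience'],
            clip_grads=train_spec['clip_grads'],
        )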
I've wanted to give Sacred a try: https://github.com/IDSIA/sacred - it looks promising, but I haven't tried it yet so I can't comment.
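From the docs, the basic pattern looks roughly like this (I haven't run it, so take it as a sketch):

    from sacred import Experiment

    ex = Experiment('conv_experiment')

    @ex.config
    def config():
        # Everything defined here is captured as the run's configuration.
        lr = 1e-3
        batch_size = 64
        n_conv_filters = 32

    @ex.automain
    def main(lr, batch_size, n_conv_filters):
        # Sacred injects config values by name and records the run.
        ...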
I still tend to keep track of model performance by hand, though. But I always have the notebooks to go back to for reference. This is something Sacred could help a lot with.
Another very non-trivial aspect of this kind of work is the compute/storage infrastructure you need to scale beyond a single workstation.
We have a nice system here where $HOME lives on NFS and gets mounted when you log into any machine on the network - I can hardcode paths in my code and count on every worker having the same filesystem. I can't imagine how we would do distributed jobs without NFS. That's not a very realistic solution for homegamers though - you need a very fast network and expensive commodity hardware. And sys admins.
Does anyone have a solution for that half of the problem? I've seen a number of Merkle-tree-based data version control solutions recently...