Thanks a lot, and that's another great suggestion for improvement. I also found that the common advice is "tweak hyperparameters until you find the right combination". That can definitely help. But issues usually hide in different "corners": the problem space and its formulation, the algorithm itself (e.g., different random seeds alone can cause big variance in performance), and more.

As you mentioned, in real applications of DRL things tend to go wrong more often than right: "it doesn't work just yet" [1]. And my short tutorial definitely falls short on troubleshooting, tuning, and "productionisation". If I carve out time for an expansion, this will likely be at the top of the list. Thanks again.
[1] https://www.alexirpan.com/2018/02/14/rl-hard.html
ubj|1 year ago
[2]: https://bostondynamics.com/blog/starting-on-the-right-foot-w...
[3]: https://www.incontrolpodcast.com/1632769/13775734-ep15-david...
alessiodm|1 year ago
You bring up a very good point though: more recent advancements and assessments should be linked or mentioned in the repo (e.g., in the resources or in an appendix). I will try to do that sometime.