dcrimp | 11 months ago

A mate of mine built a works scheduler using RL + MCTS. It was interesting to watch the scheduler get smarter as they added reward terms for real-life constraints. For example, certain types of work couldn't happen on a Tuesday: they added that to the reward calculation, retrained, and it then avoided Tuesdays. As they built up the reward calculation from the available data, it got to be super capable at making a workable schedule. It was also orders of magnitude faster than linear solvers (albeit without any guarantee of "optimality").
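
As a rough illustration of the reward idea (not the actual scheduler's code), here is a minimal Python sketch of a reward calculation that penalises a "no Tuesdays" rule. Everything in it is an assumption made up for the example: the names `Assignment`, `FORBIDDEN_DAYS`, and `schedule_reward`, the `crane_lift` job type, the day numbering, and the penalty size. The RL + MCTS search itself is left out; this only shows the kind of score such a search might be trained against.

```python
from dataclasses import dataclass

# Hypothetical day numbering: 0 = Monday, 1 = Tuesday, ... 6 = Sunday.
TUESDAY = 1


@dataclass(frozen=True)
class Assignment:
    """One job placed on one day (illustrative structure only)."""
    job_type: str
    day: int


# Hypothetical rule table: job type -> days on which it must not run.
FORBIDDEN_DAYS = {"crane_lift": {TUESDAY}}


def schedule_reward(schedule, forbidden=FORBIDDEN_DAYS, penalty=10.0):
    """Score a schedule: +1 per placed job, minus a penalty per violation."""
    reward = 0.0
    for a in schedule:
        reward += 1.0  # base reward for getting the job scheduled at all
        if a.day in forbidden.get(a.job_type, set()):
            reward -= penalty  # constraint broken, e.g. crane lift on a Tuesday
    return reward


good = [Assignment("crane_lift", 0), Assignment("inspection", 1)]
bad = [Assignment("crane_lift", 1), Assignment("inspection", 1)]
print(schedule_reward(good), schedule_reward(bad))
```

Adding another real-life constraint would then mean adding an entry to the rule table (or another penalty term) and retraining, which matches the "add it to the reward, retrain" loop described above. Because constraints are soft penalties rather than hard limits, nothing here guarantees a violation-free or optimal schedule.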
