I'd also like to mention the classic book "Reinforcement Learning" by Sutton & Barto, which covers some of the relevant mathematics for choosing the "best" among a set of options. The full PDF is available for free on their website [1]. Chapter 2, "Multi-Armed Bandits", is the place to start.

[1] http://incompleteideas.net/book/the-book-2nd.html