top | item 47042424

gwern | 13 days ago

My immediate thought is that OP is reinventing dynamic programming/RL from first principles. The final visualization looks exactly like a standard value estimate heatmap. Golf is an MDP over all the physical points on the course, with stochastic transitions to each one based on golfer skill and physical randomness. Strokes are the cost to be minimized, the colors are the value estimate at each state, and his difficulties with the different maps arise because a value function is defined as the expected value of being in that state assuming you will follow a particular policy thereafter (i.e. be a golfer of a particular skill level, playing optimally for that skill). This lets you formalize the 'strategicness' of a golf course: it is how much the value estimates differ on average across the full range of golf skills; a non-strategic course looks identical for the beginner and pro, while an incredibly strategic course might have completely different values for every point for every bracket of skill. (You could probably automate creation of pathological golf courses this way, where even a slight increase in skill makes the new strategy different.)
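To make the MDP framing concrete, here is a minimal sketch on a toy one-dimensional course. Everything specific in it is an assumption for illustration, not OP's model: the course string, the uniform aiming error (`spread`) standing in for skill, the `max_range` limit, the sand and water rules, and the choice of variance across skill levels as the 'strategicness' score.

```python
from statistics import pvariance

SAND, WATER = "S", "W"  # any other character plays as grass; the last cell is the hole

def landing_distribution(n_cells, target, spread):
    """Exact landing probabilities when aiming at `target` with a uniform
    error of +/- `spread` cells; shots past either end are clamped."""
    probs = {}
    p = 1.0 / (2 * spread + 1)
    for err in range(-spread, spread + 1):
        cell = min(max(target + err, 0), n_cells - 1)
        probs[cell] = probs.get(cell, 0.0) + p
    return probs

def expected_strokes(course, spread, max_range=4, tol=1e-10, max_iters=10_000):
    """Value iteration: V[s] is the expected number of strokes to hole out
    from cell s for a golfer with this error spread who aims optimally."""
    n = len(course)
    hole = n - 1
    V = [0.0] * n
    for _ in range(max_iters):
        new_V = [0.0] * n  # the hole itself always costs 0
        for s in range(hole):
            # Assumed rule: shots out of sand only reach half as far.
            reach = max_range // 2 if course[s] == SAND else max_range
            best = float("inf")
            for target in range(max(0, s - reach), min(hole, s + reach) + 1):
                cost = 0.0
                for cell, p in landing_distribution(n, target, spread).items():
                    if course[cell] == WATER:
                        # Assumed rule: penalty stroke, replay from the same spot.
                        cost += p * (2.0 + V[s])
                    else:
                        cost += p * (1.0 + V[cell])
                best = min(best, cost)
            new_V[s] = best
        delta = max(abs(a - b) for a, b in zip(V, new_V))
        V = new_V
        if delta < tol:
            break
    return V

def strategicness(course, spreads, **kwargs):
    """One possible score: variance of V across skill levels, averaged
    over every cell a ball can rest on (water is excluded)."""
    tables = [expected_strokes(course, sp, **kwargs) for sp in spreads]
    playable = [s for s, terrain in enumerate(course) if terrain != WATER]
    return sum(pvariance([V[s] for V in tables]) for s in playable) / len(playable)

COURSE = "GGGSGGWWGGGH"  # hypothetical layout: grass, one bunker, a pond, hole at the end
for spread in (0, 1, 2):  # 0 = perfectly accurate, 2 = wildest
    print(spread, [round(v, 2) for v in expected_strokes(COURSE, spread)])
print("strategicness:", strategicness(COURSE, [0, 1, 2]))
```

Each printed row is one "heatmap" in miniature: the same course, valued differently depending on which golfer is assumed to be playing it.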

scoofy|13 days ago

So, yes, OP here and you're effectively right on the money. Mark Broadie's strokes gained approach is literally dynamic programming applied to golf. He even discusses a bit of the history of dynamic programming in Every Shot Counts.
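For readers who haven't seen it, strokes gained is usually described as the drop in expected strokes-to-hole-out that a shot produces, minus the stroke it cost. A minimal sketch of that arithmetic follows; the function name and the baseline numbers are hypothetical, not taken from the book.

```python
def strokes_gained(expected_before, expected_after, strokes_taken=1):
    """Improvement in expected strokes-to-hole-out, net of the strokes spent."""
    return expected_before - expected_after - strokes_taken

# Hypothetical baseline: 4.0 expected strokes from the tee, 2.8 from where the drive finished.
print(round(strokes_gained(4.0, 2.8), 2))
```

A positive result means the shot did better than the baseline expected; a negative one means it lost ground.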

The point of what I'm doing here is pointing that strokes gained approach at the golf course instead of at the player. Ideally, I'd like to continue working on it to build something that can help clubs make minimal, inexpensive changes while maximally improving the strategic interest of the way the course plays.

gwern|9 days ago

> Ideally, I'd like to continue working on it to build something that can help clubs make minimal, inexpensive changes while maximally improving the strategic interest of the way the course plays.

Yes, that's why I mentioned that this gives you at least one possible measure of 'strategicness' which can be computed automatically. Measure the variance across policies (skill level, which maybe you could define physically as some sort of 'mean absolute circular error' in strokes?), and now you can do quite simple optimization routines to modify courses. Like take a course, randomly flip some squares to sand/water/grass/etc, compute the new strategicness, and keep it if it's higher. Random search, simulated annealing, CMA-ES, novelty search: there are lots of easy possibilities that I bet even an LLM could implement for you these days. Any of them would let you take an existing golf course and search for new modified layouts with higher strategicness, to inspire a human expert in modifying a course.
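The flip-and-keep loop could look something like the sketch below, again on a toy one-dimensional course. All of the specifics are assumptions for illustration: the terrain rules, the uniform aiming error used as the skill parameter, the `cap` that rejects layouts nobody can finish, and variance across skill levels as the score. It is plain hill-climbing random search, not simulated annealing or CMA-ES.

```python
import random
from statistics import pvariance

TERRAINS = "GSW"  # grass, sand, water; the final "H" cell is the hole

def expected_strokes(course, spread, max_range=4, cap=30.0, tol=1e-9, max_sweeps=2000):
    """Value iteration for expected strokes to hole out from each cell.
    Returns None if the layout looks unplayable (some value exceeds `cap`)."""
    n, hole = len(course), len(course) - 1
    p = 1.0 / (2 * spread + 1)
    # Landing cells (clamped to the course) for every possible aim point.
    lands = [[min(max(t + e, 0), hole) for e in range(-spread, spread + 1)] for t in range(n)]
    V = [0.0] * n
    for _ in range(max_sweeps):
        new_V = [0.0] * n
        for s in range(hole):
            reach = max_range // 2 if course[s] == "S" else max_range  # sand halves range
            new_V[s] = min(
                # Water: penalty stroke and replay from s; otherwise play on from c.
                sum(p * (2.0 + V[s] if course[c] == "W" else 1.0 + V[c]) for c in lands[t])
                for t in range(max(0, s - reach), min(hole, s + reach) + 1)
            )
        if max(new_V) > cap:  # values only grow from 0, so this layout is hopeless
            return None
        if max(abs(a - b) for a, b in zip(V, new_V)) < tol:
            return new_V
        V = new_V
    return None

def strategicness(course, spreads):
    """Variance of the value estimates across skill levels, averaged over dry cells."""
    tables = [expected_strokes(course, sp) for sp in spreads]
    if any(t is None for t in tables):
        return None
    dry = [s for s, terrain in enumerate(course) if terrain != "W"]
    return sum(pvariance([V[s] for V in tables]) for s in dry) / len(dry)

def random_search(course, spreads, steps=25, seed=0):
    """Flip one random cell at a time and keep the change only if the score rises."""
    rng = random.Random(seed)
    best, best_score = course, strategicness(course, spreads)
    if best_score is None:
        raise ValueError("starting course is unplayable under this toy model")
    for _ in range(steps):
        i = rng.randrange(len(course) - 1)  # never overwrite the hole
        candidate = best[:i] + rng.choice(TERRAINS) + best[i + 1:]
        score = strategicness(candidate, spreads)
        if score is not None and score > best_score:
            best, best_score = candidate, score
    return best, best_score

START = "GGGSGGWWGGGH"  # hypothetical starting layout
SKILLS = [0, 1, 2]      # aiming error in cells: 0 = pro, 2 = beginner
print("before:", START, strategicness(START, SKILLS))
print("after: ", *random_search(START, SKILLS))
```

Left unconstrained, a search like this will happily drift toward layouts that simply punish the weaker golfer, so a real version would want a budget on how many squares may change and some notion of what a club would actually accept.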