Very well written. Unfortunately I already tried all these kinds of tricks in some competitions but others were still able to beat my code. Are there additional resources for improving it even further?
Cache + avoid dynamic allocations like the plague. For example, high-perf search algos implement reversible transitions: instead of creating and pushing new states around all the time, they modify just one state, in place, then apply the same transformation in reverse when backtracking.
If you design your data structures well – to reflect the required transitions and query operations, rather than what the problem looks like to a human when drawn on a piece of paper – the forward/backward transition is nearly a no-op. Just some binary bit fiddling over data that's already in a CPU cache. And there's NO DYNAMIC ALLOCATIONS at all. Your search will fly!
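A minimal sketch of what such a reversible, allocation-free transition can look like — the puzzle, its `u64` encoding, and the move semantics here are all invented for illustration:

```rust
// Sketch of a reversible transition on a hypothetical 8x8 puzzle state.
// The state is a single u64 bitboard; a "move" relocates one piece.
// Applying the move XORs the from/to bits in place; undoing it applies
// the exact same XOR again -- no allocation, no copying of states.

struct State {
    occupied: u64, // one bit per cell of an 8x8 board
}

impl State {
    // Move a piece from cell `from` to cell `to` (0..64), in place.
    fn apply(&mut self, from: u8, to: u8) {
        self.occupied ^= (1u64 << from) | (1u64 << to);
    }

    // The inverse transition is the same XOR, so undo == apply.
    fn undo(&mut self, from: u8, to: u8) {
        self.apply(from, to);
    }
}

fn main() {
    let mut s = State { occupied: 1u64 }; // piece on cell 0
    let before = s.occupied;
    s.apply(0, 9); // push: piece is now on cell 9
    assert_eq!(s.occupied, 1u64 << 9);
    s.undo(0, 9); // backtrack: bitwise-identical state again
    assert_eq!(s.occupied, before);
}
```

Note that apply and undo touch a single machine word that stays hot in cache for the whole search, which is exactly the "nearly a no-op" transition described above.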
The OP also mentions another great "caching" technique, under "Entropy reduction". This is really hard, but the idea is to find symmetries in the search space which allow you to prune away entire subspaces a priori, without searching at all. Often it'll be something like "rotating by 90 degrees leads to the same position", mirror positions, invariance to color, time… the symmetry types and their runtime benefits are problem-specific, so you need to sit and think hard about what makes a solution unique.
In the limit, you may be lucky enough to prune away so much of the solution space that there's only a single state left. Congratulations: you've solved the problem analytically :)
Depending on the type of puzzle, it might be possible to work backwards. For example here, if you can compute reverse transitions you could run one or two steps of those, and put every state from which the goal can be reached in two moves into a HashMap. Then you run the forward search and stop as soon as you hit anything in that HashMap. In theory, this reduces the runtime to find an n-move solution from O(b^n) to O(b^{n/2}) for branching factor b, at the cost of more memory use. This can also be combined with A*.
Low-level optimization can be worth it:
* You can try to pack the game state into integers and use bitwise operations. An 8×8 board can be stored as a 64-bit vector, i.e. a `u64`. If you know the edges of the board are never occupied, then moving around can be as simple as a bit shift (probably not for this game).
* A smaller state representation also means that HashMap lookups will be faster.
* Instead of using a pair of integers to represent a position, use a single integer and save a multiplication for every lookup into a grid.
* Add a ring of impassable cells around the board, instead of checking for the edges of the board each time.
This is a great idea, and the low-level optimization tips are all excellent ones I have used in the past. I want to talk a little bit more about using bidirectional A* though, because I think it's very interesting. It's a great strategy in general, but this may be a case where it doesn't do as well.
Working backwards for this particular puzzle is very difficult because on each turn an actor may or may not move. This effectively increases the branching factor from 4 (one for each direction) to 4 * 2^n (for each of the four directions, each actor may or may not have moved). In practice it would be lower than that upper bound, but it could still be significantly higher than the forward branching factor. A nice way to visualize this is to think of your start and end states as points in space, and your A* searches as cones emanating from one point and growing toward the other. The angle of each cone roughly corresponds to your branching factor, and when your cones meet each other or a point, the search is done. If your branching factor is the same forwards and backwards, you can travel through much less space by searching forwards and backwards simultaneously. However, if your backwards branching factor is higher, then the cone from the end state will be much broader, and could sweep through more space than just doing a forward search.
This kind of behavior is very evocative of one-way functions, and makes me think it might be related to NP-hardness in some way. I'm really not qualified to prove these kinds of statements, though. Maybe someone else can offer a more rigorous mathematical perspective?
Haven't looked at this particular game closely, but I had a similar experience to OP with an Othello AI competition back in college. There were four killer features that allowed me to win (and apparently to keep winning for years after I left, according to my prof):
1. Having a really good and creative heuristic. The one I used ended up taking 6 different ideas I had for heuristics and combining them in a weighted average based on their performance in a randomized trial I conducted between the 6. My vague recollection is that slightly over-valuing 4-corners positions performs unexpectedly well in Othello, but there was a lot more to it than that. The actual effectiveness of various heuristics changes over time as the game goes on, though I never modeled or attempted to exploit this.
2. Knowing the exact memory and execution-time bounds on my prof's machine and setting things up so that I could terminate exactly when the time was ~5ms away from running out. We were limited to exactly 1 second per turn.
3. Caching. This was especially important in my case since I was technically using 6 different heuristics. I actually pre-generated a cache of the 100 most popular gamestates I encountered during my randomized trials, and this vastly increased the average depth I was able to explore in the allotted calculation time for one turn (1 second), especially during early game.
4. This is a continuation of 3, but if you have a turn-based game with execution time limits, it's super important not to throw away your work between turns. If you can modify your search so that it is pausable / resumable (which you can do with some rather simple multi-threading), and then define a simple routine that resumes a previous search by quickly modifying the tree instead of starting an entirely new one, you are going to explore much, much more. Even with a crappy heuristic, this optimization is going to win 99% of the time against opponents who don't use it.
One thing I didn't explore but wish I had was trying to predict which heuristic in my library of heuristics is closest to that of my opponent, and then opting for a strategy that is most likely to beat that heuristic. This would look something like you calculate each turn what the most likely opponent heuristic is based on their moves so far, and then have a pre-computed table of each heuristic's "foil". Maybe this would only kick in after several turns. An even better version of this would probably be to just use the probabilities for each heuristic as the weighted importance of each respective foil, and use all the foils together in a weighted average.
Fun fact: this was all in Java at the time. I can only imagine what havoc one could wreak with this sort of approach in Rust.
Caltech, right? Were you the author of Flippy? There's a new generation of AlphaZero-style ML bots that's managed to finally dethrone it :)
I've been working on and off on a Rust Othello bot aiming to combine AlphaZero in the midgame with a fast endgame solver [1]. Probably the coolest feature that's currently finished is that valid moves are generated and executed with SIMD instructions, so searching a new position only takes a few clocks on a modern CPU.
Java is generally not too bad for competitive programming. Of course C++ will easily beat it, but generally only by a factor of 2x or so. Unlike Python, which can easily be 5-10x worse…
In my experience, the only way to make meaningful progress on performance from here on out is to:
- Squeeze out more entropy (for example, rotating states for symmetric boards)
- Make the heuristic function smarter (for example, by calculating the assignment bottleneck)
I wrote a Carcassonne solver once and found many little optimization opportunities, for example by detecting fail states early. Avoiding dead ends saves a massive amount of time.
[1] https://github.com/maxwells-daemons/reason