Game playing programs
This blog explains my approaches to a game contest hosted by HackerEarth. The contest asked programmers to put forward their bots to play games of chain reaction. You can find the source code here.
In case you haven't played chain reaction, this post should be a treat for you.
Let me warn you that the following will involve some heavy technical words. But all of them are easier to learn than they sound. In my limited experience of developing game playing AI, I have found the following approaches to be most useful.
- Min-Max tree generation
- Heuristic Improvements
- Iterative deepening
- Alpha-Beta Pruning
- Program Optimization
Min Max tree generation
A Min Max tree mimics natural human thinking. When playing tic tac toe, our thinking goes like this:
What you see above is exactly what a min max strategy is. It considers your move, your opponent's possible responses, and your subsequent responses. The loop goes on till we find a terminal game state, in which case we return the terminal node's value.
Every possible board position maps to a value. The value is directly proportional to how much X favors the position. If X wins (The green position), the position has +∞ value. For a loss (Circled with red), it evaluates to -∞.
Each layer in the tree contains all possible positions for that move number. If X was playing first, we know that the second layer is O's turn. The third is X's, and so on. When a terminal value reaches the parent, it tries to maximize (if X) or minimize (if O) the score from all its available branches.
Generating the entire min-max tree from move 1 for all moves is infeasible. We need a method of guessing how good a position is without looking at every possibility from it.
Heuristic Evaluation
If you have played any sport, you have had moments when your instinct wakes up. Danger, opportunity, momentum... all these feelings rise from an acute understanding of game positions.
So, play a few games in the competition. Try to get a sense of a given position. If a position feels good, then ask yourself “why”. What are the elements on the board which are key factors to this assessment? Is your king unsafe, or has your opponent kept her pawns hanging? You can think of this mathematically as:
instinct (game_position, player_to_move) = feeling
No wonder heuristics are the most powerful and fun stuff to work on. A heuristic is an instinct function for a computer. As generating the entire min max tree from move 1 is infeasible, a cutoff on search depth is set. If there is no result up to that point, we treat it as a terminal state and return the heuristic value of that position.
f(game_position, player_to_move) = evaluation
Note down the factors you use for assessing a position and assign a weight to each factor. Then design a heuristic using those factors. Be careful not to make things too fancy though. A complicated heuristic makes us reluctant to change it. I had the following function for chain reaction:
f(position, player) = (player ==1 ? 1: -1) * (cell_diff() + explosive_diff())
It performed better than the other complicated stuff I kept trying throughout the competition.
Iterative deepening
Now the problem with choosing a cutoff depth is that it is too rigid. The first few positions have fat min-max trees because the number of possible moves is large. The positions later have thin trees when the game is almost decided. We need a flexible way to go as deep as possible without running out of time.
Iterative deepening is creating min max trees of increasing depth in a loop. We start with a tree of depth 2-3 and increase depth by 1 for each iteration. Because a tree of height D is much larger than a tree of height D - 1, we safely assume that running multiple searches for smaller depths are feasible. The best result of the last iteration before timeout is then returned.
Then the algorithm adapts per the situation. If many moves are possible from a given position, it goes for a small depth. Otherwise, it goes as deep as it can. I cannot emphasize how important this technique is to make a competitive game program.
Alpha Beta Pruning
It isn't α - β so much as a pruning strategy that sets the toppers apart, but I suspect nearly all of them use this.
At the core of every competition is move prediction. We can assume that our opponent will always play to maximize his gains while minimizing ours. A min max tree consists of a lot of wasteful processing in this regard. Some nodes can be mathematically proved to be sub optimal once you have traversed through a few branches of that node.
α and β are two bounds set on every node of the min max tree. α represents the minimum points X will take here, and β is the maximum points O will give. Hence, X tries to maximize α, while O tries to minimize β.
A good explanation of implementing α - β can be found here.
Remember that for every extra move you can predict, your program gets much stronger. Putting things in perspective, if your program could evaluate 1000 nodes with a branch factor of 4, the maximum depth it could go to is 5. With Alpha Beta, you can easily evaluate up to depth = 6 even if you just have a reasonable heuristic.
Program Optimization
We now come to the most ingenious and complicated parts of developing game playing AI. No matter what you do, there is someone working hard to beat you. To win, you should optimize. But please, DO NOT optimize prematurely. Optimize only when
- Your program is completely bug-free
- You have applied techniques such as saving information in objects to avoid repeated calculations.
- You have test cases for different positions, and the reason your program fails some of them is because it needs a little more time to find the best move.
After that, we might need to get into higher levels of optimization. All the concepts here are advanced. I have attached a link for each concept and the table below describes their complexity and usefulness.
Technique | Effectiveness | Implementation | |
1. | Position Cache | Moderate | Moderate |
2. | Killer Move Heuristic | Moderate | Easy - Moderate |
3. | Symmetry analysis | Low-Moderate | Moderate - Difficult |
4. | Concise representations | High | Very Difficult |
5. | Object Pooling and Reuse | High | Difficult |
6. | Parallelization | Low-Moderate | Difficult - Very Difficult |
7. | Heuristic Improvement | High | Moderate - Difficult |
A good way to generate a lot of positions is to make your program play against itself, with each side’s heuristic function using different weights. We can then store positions from these games which are deciders. Also, this method can be used to improve the heuristic. Choose the weights per the one winning more games.
Making static objects for objects like “Move”' is a good idea. Also, if a function like “undo” can be implemented in O(1), then avoid creating multiple boards. If undo is impossible, create a store of board objects to avoid initializing too many boards.
An excellent paper I found for game development is on Othello. I suggest you go through it when in a crunch competition. Well, that’s it for now.
Participate here in HackerEarth's AI challenge, Battle of the Bots.