An evaluation function, also known as a heuristic evaluation function or static evaluation function, is a function used by game-playing programs to estimate the value or goodness of a position in minimax and related algorithms. The evaluation function is typically designed to prioritize speed over accuracy; it looks only at the current position and does not explore possible moves (hence "static", as opposed to algorithms that plan ahead). In this project, I've designed multiple agents for the classic version of Pacman, ghosts included. Let's see how well they perform!
- A capable reflex agent will have to consider both food locations and ghost locations to perform well.
- The reciprocals of important values (such as distance to food) are used as features rather than the values themselves.
- The evaluation function evaluates state-action pairs only.
- Default ghosts used are random.
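The reciprocal trick above can be sketched as a tiny scoring function. The helper names (`manhattan`, `evaluate`) and the position/food/ghost representation are illustrative assumptions, not the project's real API:

```python
# Sketch of a reflex-style evaluation, assuming hypothetical inputs:
# pacman_pos is an (x, y) tuple, food is a list of dot positions,
# ghosts is a list of ghost positions.

def manhattan(a, b):
    # Grid distance between two (x, y) points.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def evaluate(pacman_pos, food, ghosts, base_score=0.0):
    """Reward closeness to food via a reciprocal; heavily penalize
    standing next to (or on) a ghost."""
    score = base_score
    if food:
        # Reciprocal of the distance to the nearest dot: nearby food
        # contributes a large bonus, distant food a small one.
        score += 1.0 / min(manhattan(pacman_pos, f) for f in food)
    for g in ghosts:
        if manhattan(pacman_pos, g) <= 1:  # adjacent ghost: near-certain death
            score -= 1000.0
    return score
```

Using the reciprocal means "food 1 step away" scores 1.0 while "food 10 steps away" scores 0.1, so the agent is pulled toward the nearest dot without any search.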
- Implemented minimax tree with multiple min layers (one for each ghost) for every max layer.
- A single search ply is considered to be one Pacman move followed by all of the ghosts' responses, so a depth-2 search involves Pacman and each ghost moving twice.
- The evaluation function now evaluates *states* rather than the actions we evaluated for the reflex agent. Look-ahead agents evaluate future states, whereas reflex agents evaluate actions from the current state.
- On larger boards, Pacman with naive minimax is good at not dying, but quite bad at winning. He'll often thrash around without making progress. He might even thrash around right next to a dot without eating it because he doesn't know where he'd go after eating that dot. This behavior is expected: our evaluation function is still poor. We will return to this later.
- When Pacman believes that his death is unavoidable, he will try to end the game as soon as possible because of the constant penalty for living. Sometimes, this is the wrong thing to do with random ghosts, but minimax agents always assume the worst.
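The minimax recursion with one min layer per ghost can be sketched as below. The game interface (`successors`, `is_terminal`, `evaluate`) is a stand-in for the project's GameState API, not the real one; agent 0 is Pacman and agents 1..N-1 are ghosts:

```python
# Minimal minimax sketch: one max layer (Pacman) followed by one min
# layer per ghost. Depth counts plies: it only decreases once the last
# ghost has moved, matching "one ply = Pacman plus all ghost responses".

def minimax(state, depth, agent, num_agents, successors, is_terminal, evaluate):
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    next_agent = (agent + 1) % num_agents
    # Decrement depth only when control returns to Pacman (agent 0).
    next_depth = depth - 1 if next_agent == 0 else depth
    values = [minimax(s, next_depth, next_agent, num_agents,
                      successors, is_terminal, evaluate)
              for s in successors(state, agent)]
    # Pacman maximizes; every ghost minimizes.
    return max(values) if agent == 0 else min(values)
```

On a toy two-agent tree where Pacman's options lead to ghost replies worth {3, 5} and {2, 9}, the ghost minimizes each branch to 3 and 2, and Pacman picks the branch worth 3.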
- This agent uses alpha-beta pruning to more efficiently explore the minimax tree.
- The minimax visual above runs minimax at tree depth 2; the visual on the right runs at depth 3, with a speed improvement of a few seconds per move.
- The alpha-beta agent has minimax values identical to those of the minimax agent, although the actions it selects can vary because of different tie-breaking behavior.
- Pseudocode for alpha-beta pruning located on the right.
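A sketch of the pruning logic, reusing the same illustrative game interface as before (not the project's real API). Pruning uses strict inequalities so the root's minimax value is identical to plain minimax, as noted above:

```python
# Alpha-beta sketch: alpha is the best value the max player (Pacman)
# can guarantee so far; beta is the best the min players (ghosts) can.
# Branches that cannot affect the final decision are skipped.

def alphabeta(state, depth, agent, num_agents, successors, is_terminal,
              evaluate, alpha=float('-inf'), beta=float('inf')):
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    next_agent = (agent + 1) % num_agents
    next_depth = depth - 1 if next_agent == 0 else depth
    if agent == 0:                       # max node (Pacman)
        value = float('-inf')
        for s in successors(state, agent):
            value = max(value, alphabeta(s, next_depth, next_agent, num_agents,
                                         successors, is_terminal, evaluate,
                                         alpha, beta))
            if value > beta:             # a min ancestor would never allow this
                return value
            alpha = max(alpha, value)
        return value
    else:                                # min node (a ghost)
        value = float('inf')
        for s in successors(state, agent):
            value = min(value, alphabeta(s, next_depth, next_agent, num_agents,
                                         successors, is_terminal, evaluate,
                                         alpha, beta))
            if value < alpha:            # a max ancestor would never allow this
                return value
            beta = min(beta, value)
        return value
```

On the same toy tree as before, the second branch is cut off as soon as the ghost finds a reply worth 2 (below alpha = 3), yet the root value stays 3.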
Notes:
- Minimax and alpha-beta work well, but they both assume an adversary who makes optimal decisions. As with games such as tic-tac-toe, this is not always the best approach. Expectimax is useful for modeling the probabilistic behavior of agents who may make suboptimal choices. Random ghosts are of course not optimal minimax agents, so modeling them with minimax search may not be appropriate.
- This expectimax Pacman will no longer take the min over all ghost actions, but the expectation according to a perceived model of how the ghosts act.
- As demonstrated on the right, Pacman has a more cavalier approach in close quarters with ghosts than with just minimax (shown above). In particular, if Pacman perceives that he could be trapped but might escape to grab a few more pieces of food, he'll at least try.
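The only change from minimax is at the ghost nodes: instead of taking the min, take the expected value under a model of the ghosts. A minimal sketch, assuming the same illustrative game interface and the project's uniform-random ghost model:

```python
# Expectimax sketch: Pacman (agent 0) still maximizes, but ghost nodes
# average over their moves instead of minimizing, modeling ghosts that
# choose uniformly at random.

def expectimax(state, depth, agent, num_agents, successors, is_terminal, evaluate):
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    next_agent = (agent + 1) % num_agents
    next_depth = depth - 1 if next_agent == 0 else depth
    values = [expectimax(s, next_depth, next_agent, num_agents,
                         successors, is_terminal, evaluate)
              for s in successors(state, agent)]
    if agent == 0:
        return max(values)
    # Uniform-random ghost: each move is equally likely, so the node's
    # value is the plain average of its children.
    return sum(values) / len(values)
```

On the earlier toy tree, the branch minimax rejected (replies worth {2, 9}) now averages to 5.5, beating the other branch's 4.0, which is exactly the "cavalier" risk-taking described above.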
So! Is there some way to put everything together and make the best of what we have? The answer is yes! With expectimax and a stronger evaluation function we can definitely do much better. But who wants to stop at expectimax when we can get into machine learning? To be continued...