Sean Stern ‘09

SPE ‘06

Advisor: Umar Syed

 

Artificial Texas Hold’em Player

Links

Background

Texas Hold’em

For those unfamiliar with Texas hold’em, Wikipedia offers an excellent introduction to the game, including rules, strategy, and terminology.

 

 

Poker and Artificial Intelligence

Today, poker represents one of the great challenges of AI gaming research. At present, work is underway to develop new and more sophisticated artificial poker players, some of which can be challenged on the internet.

 

Nevertheless, AI gaming researchers have yet to achieve the same success in poker as they have in other areas like chess, checkers, and Othello. Poker represents such a great difficulty for researchers because of a number of different factors.

 

First and foremost, poker, in stark contrast to the games just listed, is not a game of perfect information. That is, a player does not have access to all the information about his opponents’ current states. Because the correct course of action at any point in the game is determined in large part by the opponents’ current states (specifically, their cards), it is extremely difficult to select the correct action without access to this information.

 

Moreover, the means by which poker programs can estimate an opponent’s cards is relatively limited. While human players can analyze both the physical appearance and the betting practices of an opponent to look for tells, artificial players can only use the betting practices of an opponent to estimate his cards. Though this is a rich source of information about an opponent, it is in not necessarily linked to his actual hand. As Tim Harford writes, the fundamental problem is that “there is no straightforward connection between what a player bets and what hand he holds.”

 

Poker is further complicated by the fact that the correct course of action depends not only on the current state of the game, but also on the future state of the game. Bets and calls should only be made if the chances of winning are commensurate with the pot odds. However, this requires information about all future bets. Like the opponents’ current sates, this can be estimated, but finding a reliable means to do so has proven difficult.

 

Finally, methods that have aided gaming researchers in the past have proven woefully inadequate for poker. Specifically, game theory appears to provide little or no hope of “solving” poker. Optimal solutions (series of actions to maximize one’s chances of winning) cannot be found by even the fastest computers. In addition, game theory rests on the principle that players will always choose the action that optimizes their returns. Sadly, though, most human players are not mathematicians, and, as such, this is assumption rarely, if ever, holds true in real games.

 

 

 

 

Objective

The goal of my project was to create a Java program that could play no-limit Texas hold’em against an arbitrary number of opponents in real time.

 

 

Project Overview

My Strategy

Change My Objective

After a few weeks of work, I quickly discovered that writing an artificial no-limit hold’em player would be too difficult to achieve with my limited time and knowledge.

 

No-limit hold’em presented a problem for two reasons. Because no-limit allows arbitrarily large bets, bets made are less likely to correspond to the actual quality of an opponent’s cards (because he is bluffing or is a poor player). Thus, it is difficult to gauge one’s chances of winning. Moreover, because bets can be any size, it is very difficult to establish a reliable estimate of the final pot size (necessary for computing pot odds).

 

With this in mind, I changed my project from a no-limit Texas hold’em player to a fixed-limit or limit Texas hold’em player. The important difference between the two (for my project) is that the structure of limit hold’em increases the likelihood that an opponent’s bets are strongly tied to the quality of his cards (that is, reduces the likelihood of successful bluffs) and allows more accurate estimation of each opponent’s final bet and, thus, the final pot size.

 

 

My Algorithm

In the end, the algorithm I chose for my artificial player was to estimate the pot odds and the odds of winning. If the odds of winning are greater than the pot odds, the artificial player should make the appropriate bet available to it. If not, it should fold.

 

Though this seems fairly simple, the difficulty lies in determining reasonable estimates for the odds of winning and the pot odds.

 

To determine pot odds, my program incorporated the following strategy:

 

 

Let bi be the average total bet per round for player i. Let ci be the current amount in the pot from player i.

Let mi = max (bi, ci)

 

When it is a given player’s turn to bet estimate that the total money in the pot at the end of the round will be:

 

The sum of the ci’s from people who have already folded, plus

k * (maxi mi), where k is the number of people who have not yet folded, and i ranges over just the people who have not yet folded.

 

In other words, the program assumes that everyone currently in the hand will not fold (which implies they will all bet the same total amount), and that they will bet as much as the opponent who bets most aggressively and has yet to fold.

 

The reason for selecting this strategy of estimating the pot is that it appears to be a relatively conservative one. That is, the program rests on the assumption that it cannot force people out of the pot, and must have a hand that has a chance at beating everyone currently in the pot.

To determine the odds of winning, my program incorporated a slightly more difficult algorithm.

 

A straightforward approach would be simply to sample random games. That is, for every unknown card on the table (including opponents cards and future community cards to be dealt) have the program select a card at random from the possible cards remaining in the deck, and assume that the unknown card is the selected card. Once all unknown cards have been filled in, simply have the program determine if it wins or not. If the program repeats this process of filling in unknown cards and recording wins, the artificial player can get a relatively accurate idea of it’s chances of winning.

 

However, this method does not return the realistic chances of winning because in poker, it are not likely to have completely random cards. That is, this method assumes that opponents are just as likely to continue playing good cards as bad cards.

 

A more realistic approach assumes that opponents who are left have good cards because bad cards would have been folded earlier in the game. Thus, for an artificial player to determine its chances of winning in a realistic fashion, it must estimate the sets of cards an opponent is likely to have.

 

To do this, my artificial player models opponents using the random sampling method. That is, when an opponent bets, the program estimates the opponents pot odds after the bet (using the method described above). Then, it randomly selects a set of cards for the opponent. Keeping this specific opponent’s set of cards fixed, it then samples random games. That is, the artificial player puts a random card in place of every unknown card from the perspective of the opponent, including the artificial player’s cards. It then records if the opponent won with this fixed set of cards. Then using the same fixed set of cards for the same opponent, it repeats this process of filling in unknown cards and recording wins a number of times. From this data, it can determine the opponent’s chances of winning with a given set of cards. If the opponent’s chances of winning exceed the opponent’s pot odds, then the artificial player remembers that this is a possible set of cards for the opponent. Then, it selects another fixed set of cards for the same opponent and repeats the random sampling method for that set, remembering it only if the opponent’s chances of winning exceed the opponent’s pot odds. Repeating this process a number of times then can generate a set of possible sets of cards that an opponent may have based solely on his bets.

 

 

Thus, when it is the artificial player’s turn to determine his chances of winning, it use the same sampling method to determine its odds of winning except for one crucial difference. Instead of filling in each opponent’s unknown cards with random cards, the program fills the unknown cards  in with cards it remembered when it modeled the opponent. Thus, the artificial player’s chances of winning are adjusted because it is no longer playing against opponents with completely random cards.

Determining Pot Odds

Determining Odds of Winning

My Results

At present, I have not finished my project. Though I have many of the methods in place for the artificial player, in order to test it, I need something that it can play on and people that it can play against. As such, I am now working on creating a program where people can play limit hold’em over a network through a text based interface. When this is completed, I will then be able to bring the methods for my artificial player together into one program that will play over the network.

 

 

Play

Learn

Watch

Thanks to:

My advisor, Umar Syed, for all the help and ideas

 

Kevin Wayne, for organizing the SPE and getting the CS department to actually pay me to work on this for 6 weeks, not to mention the ice cream