Computer program wins at no-limit Texas Hold 'Em by trusting its gut
Like human players, program honed instincts through repetition, playing 10 million poker hands against itself
A computer program has learned to win at one of the most complex poker games by copying a very human impulse — trusting its gut.
"I think there's a lot of similarities to real human intuition," said Michael Bowling of the University of Alberta's Computer Poker Research Group.
Teaching poker to computers has been a popular tool in the artificial intelligence community for years.
Unlike chess, where every piece is in plain view, poker is a game in which no player knows what cards the others hold. Having to deal with incomplete information makes poker programs useful in everything from improving public security to helping doctors treat patients with diabetes.
Bowling's lab has long worked on poker and attracted worldwide attention in 2015 for developing Cepheus, a program that was unbeatable in two-handed, fixed-bet Texas Hold 'Em.
The lab's latest achievement, revealed Thursday in the journal Science, tackles the much more complex version of Texas Hold 'Em in which there is no limit on bets.
Cepheus worked by allowing the computer to learn from mistakes. After billions of hands, it developed a 10-terabyte table of probabilities that made it unbeatable.
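For readers curious what "learning from mistakes" can look like in code, here is a minimal sketch. It is not Cepheus: that program was built with a variant of counterfactual regret minimization, while this toy uses plain regret matching on rock-paper-scissors, and every name in it (PAYOFF, strategy_from_regrets, self_play) is made up for illustration. Each player keeps a running total of how much better each move would have done than the mix it actually played, and leans toward the moves it regrets not playing.

```python
# Toy self-play with regret matching on rock-paper-scissors.
# Illustrative only; this is not the Cepheus code or algorithm.

# PAYOFF[a][b]: payoff to a player choosing move a against move b
# (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each move in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=20000):
    """Return each player's average strategy after repeated self-play."""
    # Assumption: player 0 starts with a bias toward rock so that there
    # is a mistake to learn from; player 1 starts with no preference.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            mine, theirs = strategies[player], strategies[1 - player]
            # Expected payoff of each pure move against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * theirs[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(mine[a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better move a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += mine[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play()
print(average_strategies)
```

Over many rounds the average strategies should drift toward playing each move about a third of the time, the unexploitable way to play this toy game. The sketch keeps a running total for every decision, and in a game the size of Hold 'Em that bookkeeping is what grows into an enormous table.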
But that wasn't going to work for no-limit Hold 'Em.
The fixed-bet version of the game has about 10 to the power of 14 decision points, a 1 followed by 14 zeros. The no-limit equivalent is a 1 followed by 160 zeros.
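A rough back-of-envelope calculation shows the size of that gap. It takes the two counts as roughly 10^14 and 10^160 and assumes, purely for illustration, that a Cepheus-style table would grow in step with the number of decision points. The constant and function names are hypothetical.

```python
# Back-of-envelope scale comparison; an illustration, not a measurement.
LIMIT_DECISION_POINTS = 10 ** 14      # fixed-bet game, order of magnitude
NO_LIMIT_DECISION_POINTS = 10 ** 160  # no-limit game, order of magnitude
CEPHEUS_TABLE_BYTES = 10 * 10 ** 12   # 10 terabytes (decimal)


def order_of_magnitude(n: int) -> int:
    """Floor of log10(n) for a positive integer, via its digit count."""
    return len(str(n)) - 1


# How many times more decision points the no-limit game has.
growth_factor = NO_LIMIT_DECISION_POINTS // LIMIT_DECISION_POINTS

# Assumption: table size scales linearly with decision points.
naive_table_bytes = CEPHEUS_TABLE_BYTES * growth_factor

print(f"growth factor: 10^{order_of_magnitude(growth_factor)}")
print(f"naively scaled table: about 10^{order_of_magnitude(naive_table_bytes)} bytes")
```

No storage system comes anywhere near that, which is why a lookup table is off the table for a game with 10^160 decision points.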
"That's more than there are atoms in the universe," said Bowling — way too many to simply crunch through probabilities.
Bowling's team found the answer with a program called Deep Stack.
"Deep Stack doesn't compute the whole strategy beforehand," Bowling said.
Deep Stack strategizes like human player
"It's going to compute how it's going to play online, as it's playing. It's going to only worry about the decision points it reaches while it plays, and figure out how to play those on the fly, in the middle of the game, much more like a human player."
The key is in developing what Bowling calls intuition. Like a human player, Deep Stack trains its instincts through repetition — in this case, 10 million poker hands played against itself.
"Deep Stack will play against itself over and over again until it figures out, 'I think this is how much it's worth, being in this poker situation,'" Bowling said.
That data gets fed into a program that recognizes patterns, which in turn allows Deep Stack to deal with new situations.
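The idea of learning values for some situations and then estimating others can be shown with a toy example. This is a sketch under heavy assumptions, not Deep Stack's method: the "game" is a made-up one-card showdown with 13 ranks, and a straight-line fit stands in for the far more capable pattern-recognition program the researchers use. All function names are hypothetical.

```python
# Toy illustration of generalizing from seen situations to unseen ones.
# Assumptions: a made-up one-card showdown game, and a least-squares
# line standing in for a real pattern-recognition model.


def showdown_value(rank, num_ranks=13):
    """Exact expected result (+1 win, -1 loss) of holding `rank`
    against one uniformly random card of a different rank."""
    lower = rank - 1           # opposing cards we beat
    higher = num_ranks - rank  # opposing cards that beat us
    return (lower - higher) / (num_ranks - 1)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# "Experience": values worked out only for the odd-numbered ranks.
train_ranks = [1, 3, 5, 7, 9, 11, 13]
train_values = [showdown_value(r) for r in train_ranks]
slope, intercept = fit_line(train_ranks, train_values)


def estimate_value(rank):
    """Estimate the value of a situation, including ones never seen."""
    return slope * rank + intercept


unseen_rank = 8  # not among the training situations
print(estimate_value(unseen_rank), showdown_value(unseen_rank))
```

The toy works neatly because the true values here happen to fall on a straight line. Real poker situations do not, which is where a more flexible pattern recognizer comes in.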
"If it does a good job, what I should be able to do is feed it a poker situation that's not any of the situations it's seen before, and it should still give me a good answer for how valuable that situation is.
"It's going to generalize from all its past experience."
Remarkably, Deep Stack doesn't take that much horsepower. While Cepheus required vast amounts of computing ability, Deep Stack can be run on an off-the-shelf gaming laptop.
And it seems to work.
In December, Deep Stack took on 30 poker professionals from around the world in a tournament in which each was asked to play 3,000 hands against the program. It beat all 11 professionals who completed all 3,000 hands.
It was the first time a computer beat humans in a two-handed, no-limit game.
"There were some players that, even when you showed them that they were losing, they were convinced they were not," Bowling recalled. "This is the nature of poker. You can always convince yourself that you just got unlucky."