How Libratus Managed to Defeat the Pros
The geniuses behind the Libratus poker robot, the first poker program to defeat top professional players at heads-up no-limit Texas hold’em, have revealed further information on how they were able to perform such an impressive feat. Details shared by Tuomas Sandholm and Noam Brown of Carnegie Mellon University whilst speaking with the American Association for the Advancement of Science have shed some interesting light on exactly how the Libratus bot managed to perform so well in its heads-up no-limit hold’em contest this year.
Turning Point in AI
At the start of 2017, Libratus played against four hold’em experts in a competition dubbed “Brains vs AI”, with the 120,000-hand match spread out across 20 days at Rivers Casino in Pittsburgh. The players in question were Dong Kim, Jason Les, Jimmy Chou, and Daniel McAulay, and by the end of it all the pros were down by a collective 1,776,250 chips, or around 14 big blinds per 100 hands, marking a decisive win for the AI. Moreover, every human player lost against the AI, representing a major turning point since Claudico, the predecessor of Libratus, lost against a group of other pros back in 2015.
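The win rate quoted above can be checked with a few lines of arithmetic. The sketch below assumes a big blind of 100 chips, a figure that is not stated in this article, and the function name is purely illustrative.

```python
# Figures taken from the article.
TOTAL_CHIPS_WON = 1_776_250
HANDS_PLAYED = 120_000

# Assumption: the big blind was 100 chips (not stated in the article).
BIG_BLIND_CHIPS = 100


def big_blinds_per_100(chips_won, hands, big_blind):
    """Convert a raw chip total into big blinds won per 100 hands."""
    return chips_won / big_blind / hands * 100


win_rate = big_blinds_per_100(TOTAL_CHIPS_WON, HANDS_PLAYED, BIG_BLIND_CHIPS)
print(f"{win_rate:.1f} big blinds per 100 hands")
```

Under that assumption the result comes out a little under 15 big blinds per 100 hands, in the same range as the rounded figure given above.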
Complex Programming
It’s difficult to create an AI program that can beat a pro because of how complex poker can be, with so many decision points making it almost impossible to pre-compute a strategy for every hand. Elaborating further, Sandholm and Brown wrote:
“An iterative algorithm was used to near-optimally solve heads-up limit Texas hold’em, a relatively simple version of poker, which has about 10^13 unique decision points. In contrast, HUNL has 10^161 decision points, so traversing the entire game tree even once is impossible. Pre-computing a strategy for every decision point is infeasible for such a large game.”
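The quote does not name the iterative algorithm, so the following is only a toy illustration of the general idea rather than the researchers' method. It runs plain regret matching in self-play on rock-paper-scissors, a game with a handful of decision points instead of 10^13. The starting bias and all names are assumptions made for the example.

```python
# Payoff to the player choosing the row action against the column action.
# Order of actions: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
NUM_ACTIONS = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations):
    """Return each player's average strategy after repeated self-play."""
    # Hypothetical starting bias so the players do not begin at equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * NUM_ACTIONS, [0.0] * NUM_ACTIONS]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            # The game is symmetric, so both players can use the same table.
            values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(NUM_ACTIONS))
                for a in range(NUM_ACTIONS)
            ]
            expected = sum(
                strategies[player][a] * values[a] for a in range(NUM_ACTIONS)
            )
            for a in range(NUM_ACTIONS):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play(20_000)
print(average_strategies)
```

The average strategies drift towards playing each action about a third of the time, which is the equilibrium for this tiny game. The point of the quote is that this kind of repeated pass over every decision point stops being practical once the game tree is as large as that of HUNL.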
However, Libratus has three main modules that allow it to regard every hand of poker separately, thus allowing it to create strategies in real time. Even though there are a multitude of decision points, the computer scientists said that they were careful not to over-simplify things to such an extent that humans could exploit Libratus easily.
Differentiating Individual Hand Strategies
There is often little practical difference between a king-high and a queen-high flush, and so treating such hands as identical helps make the game less complex and easier to compute. Nevertheless, the two hands are not truly equivalent, and that gap can ultimately make the difference between a winning and a losing hand.
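The idea of treating similar hands as identical can be sketched as a simple bucketing function. This is a minimal illustration only: the hand labels, the `band_size` parameter and the grouping rule are all assumptions, and the actual abstraction used by Libratus is far more sophisticated.

```python
from collections import defaultdict

RANKS = "23456789TJQKA"


def bucket(hand_category, high_card, band_size=2):
    """Map a made hand to a coarse bucket so nearby high cards are merged.

    band_size is a hypothetical knob: larger values merge more hands.
    """
    rank_index = RANKS.index(high_card)
    return (hand_category, rank_index // band_size)


# Illustrative hands, described by category and highest card.
hands = [("flush", "K"), ("flush", "Q"), ("flush", "7"), ("straight", "K")]

buckets = defaultdict(list)
for category, high_card in hands:
    buckets[bucket(category, high_card)].append((category, high_card))

for key, members in buckets.items():
    print(key, members)
```

Here the king-high and queen-high flushes land in the same bucket and would share one strategy, while the seven-high flush and the straight are kept apart. Making the bands wider shrinks the game further, at the cost of blurring differences that can matter at showdown.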
However, analysts seem to point more towards the river strategy of Libratus as the main reason that it succeeded against its human opponents. According to observers, Libratus had a well-balanced and powerful river over-bet approach that used both bluffs and value bets to confuse its human opponents.
Using this strategy, Libratus was then able to use “blockers” during poker hands to gain the advantage. This is because there are times when you don’t have a very good hand by the river, but the cards you hold make it less likely that your opponent has one either. This then gives you the chance to make a huge bet as a bluff in order to take down the pot.
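The effect of a blocker can be shown by counting the hands an opponent could hold. The sketch below uses a made-up river board with three spades and counts the two-card holdings that make the ace-high spade flush, first when the bluffer holds no relevant card and then when the bluffer holds the ace of spades. The board, the hole cards and the function name are all illustrative assumptions.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "shdc"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def nut_flush_combos(board, hero):
    """Count opponent holdings of the ace of spades plus another spade.

    Only meaningful for boards like the example, where an ace-high
    spade flush is the best possible hand.
    """
    seen = set(board) | set(hero)
    unseen = [card for card in DECK if card not in seen]
    return sum(
        1
        for first, second in combinations(unseen, 2)
        if "As" in (first, second) and first[1] == "s" and second[1] == "s"
    )


board = ["Ks", "9s", "4s", "7d", "2c"]

without_blocker = nut_flush_combos(board, ["Qh", "Jd"])
with_blocker = nut_flush_combos(board, ["As", "Jd"])

print("Nut flush combos without the blocker:", without_blocker)
print("Nut flush combos with the ace of spades:", with_blocker)
```

Holding the ace of spades removes every nut-flush combination from the opponent's possible hands, even though the bluffer's own hand is weak. That is what makes such a hand a natural candidate for a large river bluff.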
Adapting and Learning
Libratus is different from other poker AIs in that it is programmed to alter its wagers based upon the bets humans make, rather than being stuck making a pre-computed bet on the river. Sandholm and Brown subsequently wrote that their algorithm was run on an abstraction that was detailed for the first two rounds, but coarse in the final two rounds. As they then explained:
“.. Libratus never plays according to the abstraction solution in the final two rounds. Rather, it uses the abstract blueprint strategy in those rounds only to estimate what reward a player should expect to receive with a particular hand in a subgame. This estimate is used to determine a more precise strategy during actual play.”
What Next?
So what comes next for Libratus? One challenge on the horizon could involve conquering six-handed poker. At present, however, the scientists have said that such a feat is beyond the abilities of Libratus, although research in the area is ongoing, with Noam Brown stating his belief that it may be accomplished within just two years.