Pluribus 3! from SB

SunPowerGuru · May 20, 2020, 9:24am

Here’s a few minutes of Pluribus playing 6 handed. I think it’s playing copies of itself, but not 100% sure. Very interesting bet sizing and sometimes odd post flop play. If you want to discuss a specific hand, include a time stamp.

puggywug · May 20, 2020, 1:22pm

I sure wouldn’t mind getting those hole cards and flops!

1Warlock · May 20, 2020, 2:12pm

Pluribus was playing with 5 humans in these games. I’ve seen a bunch of these sessions broken down with solvers. The AI and GTO lines are not in conflict and often in close agreement. I assume that’s because the 5 humans were all super high level players. Good stuff.

1Warlock · May 20, 2020, 7:09pm

What do you think of the hand with TT at 4:15? I doubt many people would play either player’s hand in this manner, pre or post flop. Pluribus has a lot of x as the OOP 3! player. It really protects its ranges.

SunPowerGuru · May 20, 2020, 7:26pm

I’m not a fan of checking that flop with that board. Why give your opponent the chance to check and see a free card?

1Warlock · May 21, 2020, 8:41am

It’s an interesting flop and the board is likely to change on future streets. I’m guessing that Pluribus didn’t think it had a 3-street hand OOP here. Maybe it was going for a x/r on the flop but opponent didn’t stab at it?

SunPowerGuru · May 21, 2020, 9:18am

Yeah, but you only fear QQ there, so bet big and fold them out, or at least make it a mistake to continue. The problem with trying to x/r is that you allow your opponent to determine if you can do it. I’m almost always betting second set on that board.

Pluribus seems to have a betting range that it applies across its entire range, which is something I like to do too. This way there is no apparent connection between my hand strength and my bet sizing. I use suits or flop colors to randomize, but then weight my bets based on opponent tendencies and board texture. This all fits under my range distortion umbrella.

I’ve suggested donk betting in several analysis threads. I think leading into the preflop raiser has an undeserved bad reputation, especially as a semi-bluff. It can throw your opponent off, gives you control, can help in pot control, often gets folds, can help define your opponent’s hand, and can make you look like an idiot… all good stuff. I was pleased to hear that Pluribus agrees!

I’m not much for looking at databases, because… free poker. Still, I would be very interested to see a much larger sample of hands from this AI.

I would also like to look under the hood and see what makes Pluribus tick. For example, is it using regret minimization algorithms (from decision theory) as part of its imperfect-information-game solves of the limited-lookahead subgames?

SunPowerGuru · May 21, 2020, 12:16pm

OK, I found this interesting, “In the case of six-player poker, we take the viewpoint that our goal should not be a specific game-theoretic solution concept…”

“Previous poker-playing bots such as Libratus coped with hidden information in games as large as two-player no-limit Texas Hold’em by combining a theoretically sound self-play algorithm based on Counterfactual Regret Minimization (CFR) with a carefully constructed search procedure for imperfect-information games.”

Previous bots, hmmmm.

“The version of self-play used in Pluribus is an improved variant of the iterative Monte Carlo CFR (MCCFR) algorithm.”
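For anyone who wants a feel for what "regret minimization" means in practice, here is a minimal sketch of plain regret matching in self-play on rock-paper-scissors. This is not Pluribus's MCCFR and not anything from the CMU/Facebook code; the game, the function names, and the biased starting regrets are all assumptions chosen to keep the toy small. It only shows the core loop: play each action in proportion to how much you regret not having played it, then average the strategies over time.

```python
# Toy regret matching in self-play (rock-paper-scissors).
# Assumption: this is an illustration of the general idea only,
# not the algorithm or code used by Pluribus.

# Payoff to the acting player: PAYOFF[mine][theirs] for rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching_strategy(regrets):
    """Mix actions in proportion to positive regret; uniform if none is positive."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def train(iterations, start_regrets=(1.0, 0.0, 0.0)):
    """Run self-play and return each player's average strategy.

    start_regrets is a hypothetical bias (player 0 starts out wanting rock)
    so the run does not begin already sitting at the equilibrium.
    """
    regrets = [list(start_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching_strategy(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done than the mix.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(50_000)):
        print(player, [round(p, 3) for p in average])
```

The individual strategies keep cycling, but the average drifts toward the even one-third mix, which is the equilibrium for this game. CFR applies the same regret bookkeeping at every decision point of a much larger game tree.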

So it’s using a new hybrid Monte Carlo / regret minimization algorithm. This is fairly similar to the approach taken by AlphaZero, which started by taking totally random moves, then evaluated those games after the fact and used machine learning to gradually refine its strategies. AlphaZero was taught the basic rules of chess, played itself for 6 hours, and emerged able to beat any human or computer on the planet.

Unfortunately, this approach simply can’t be done by humans in one lifetime. We may have to settle for mimicking what it does without ever really knowing why. Someone would have to crunch the raw data and devise new theories that explain how it works, because its play is not based on any known theoretical approach.

Those wishing to stay one step ahead now have a clear direction to explore.

Source: Facebook, Carnegie Mellon build first AI that beats pros in 6-player poker

puggywug · May 21, 2020, 1:43pm

It’d be great if there was a way to play against pluribus online and see what it’s like to face it.

SunPowerGuru · May 21, 2020, 2:19pm

Yeah, that would be cool.

As it turns out, CMU has published the first 10,000 hands Pluribus played vs 10 pro players. Here are some links to summaries…

Pluribus LJ poker hands
Pluribus hand ranges HJ, Cut-off and Button
Small Blind play
Finally overall recap of raise first in strategy

I have not digested this yet, but one thing kinda stuck out: 63% SB range, split 50/50 between limp and bet. If I am understanding that correctly, it’s a little eye-opening. (this is when folded around to the SB)

I also like this quote from the bottom of that last linked page, “Not saying this is what we should do, just the poker hands Pluribus plays and their bet sizes, but if past history is anything to go by, decent players will move that way.” Well yeah, if they wanna stay one step ahead!

1Warlock · May 21, 2020, 4:35pm

I’ve been playing around with split ranges from the SB for a while now - it’s a pain in the butt to implement. Moreover, vs most populations there is considerable EV loss in not open raising yourself from the SB. I think Pluribus’ strategy here is related to why she didn’t bet the flopped set in the hand we were discussing above. Vs very good players, there is a higher frequency of villain taking stabs and building pots for you when you check. In populations many of us face, we can’t count on our opponents building pots for us and so we have to do it ourselves.

This is nothing more than conjecture on my part but I think there is at least some logic behind my guess. Mostly I’m just happy that I don’t have to play poker with any of these bots or pros and can simply armchair quarterback their play with my wallet safely tucked away.

Yeah - this is my issue with some of the AI approaches. I’m all about knowing why something works because then I can adapt the logic to other circumstances. I am not interested in blindly following another approach - been there and done that.

SunPowerGuru · May 21, 2020, 5:03pm

I see 2 main problems, specific to the SB data, but also applying, to varying extents, to all of it I have seen so far.

First, 10,000 hands might seem a lot, but since the only summaries I’ve found are for “first in” situations, and we are talking a table full of pros, they are small sample sizes per hand value. How many times are the pros going to fold around to the SB in the first place? I’ll be searching for the data from the 120,000 hand run.
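To put rough numbers on the sample-size worry, here is a back-of-the-envelope sketch. The fold-around rate is a made-up placeholder, not a figure from the Pluribus data, and the sketch also assumes Pluribus sits in every one of the 10,000 hands and takes the SB one hand in six. Only the combo counts (6 per pair, 4 per suited hand, 12 per offsuit hand, 1,326 in total) are fixed facts of the deck.

```python
TOTAL_COMBOS = 1326  # number of distinct two-card starting hands (52 choose 2)


def expected_samples(total_hands, seats, fold_around_rate, combos):
    """Expected number of times one starting-hand class shows up in the SB
    when the action has folded around.

    fold_around_rate is a hypothetical input, not measured data.
    """
    small_blind_hands = total_hands / seats
    first_in_spots = small_blind_hands * fold_around_rate
    return first_in_spots * combos / TOTAL_COMBOS


if __name__ == "__main__":
    assumed_rate = 0.20  # placeholder only; swap in a real figure if you have one
    for label, combos in (("pair", 6), ("suited", 4), ("offsuit", 12)):
        print(label, round(expected_samples(10_000, 6, assumed_rate, combos), 2))
```

Even with a generous placeholder rate, each individual hand class only shows up a handful of times, so a single hand's limp/raise split in those summaries is very noisy.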

Second, is there collusion, implicit or otherwise? I’m sure top pros don’t want to get beat by a computer, how much is this affecting their game? I want to see all the hands, not just the ones Pluribus is involved with. Soft play, if any, should be relatively easy to spot in a large enough sample size. Is it Pluribus vs top pros, or vs a team consisting of top pros, intended or not?

If, as is almost certainly the case, Pluribus is using an adaptive strategy, this could have a major impact on the way it plays.

Yeah, but not for a straight randomized 50/50 split. If your first hole card is red, take one action, if it’s black, take the other. Want 37.5%? If your 1st card is not a heart (75%) and your second card is black, do A, otherwise, do B. (OK, that’s not EXACTLY 37.5%) If you want random, you can work out most values and let the pRNG do it.
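Here is a quick check of those card-based randomizers, as a sketch. It treats the two hole cards as an ordered deal (a "first" and a "second" card), which is an assumption about how you read your hand, and the function names are just for illustration. It enumerates every ordered two-card deal and reports the exact frequency of each rule.

```python
from fractions import Fraction
from itertools import permutations

RANKS = "23456789TJQKA"
SUITS = "cdhs"  # clubs, diamonds, hearts, spades
BLACK = {"c", "s"}
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def probability(rule):
    """Exact chance that rule(first, second) holds over all ordered two-card deals."""
    deals = list(permutations(DECK, 2))
    hits = sum(1 for first, second in deals if rule(first, second))
    return Fraction(hits, len(deals))


def first_card_red(first, second):
    """The 50/50 splitter: first hole card is a heart or a diamond."""
    return first[1] not in BLACK


def not_heart_then_black(first, second):
    """The 'about 37.5%' splitter: first card is not a heart, second card is black."""
    return first[1] != "h" and second[1] in BLACK


if __name__ == "__main__":
    for rule in (first_card_red, not_heart_then_black):
        p = probability(rule)
        print(rule.__name__, p, float(p))
```

The red-card rule is exactly 1/2. The second rule works out to 19/51, a little under 37.5%, because the second card's color depends slightly on what the first card was.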

I do see the obvious pitfalls of trying to apply a principle we don’t understand, but let’s not throw out the good because it’s not perfect! (haha, sorry) Maybe if we try some of these new ideas, the why will become clearer in time. If they become the de facto standard, do we have a choice?

puggywug · May 21, 2020, 5:46pm

Re: collusion against Pluribus… In a proper study the pro players wouldn’t know anything about their opponents (they would be anonymized, with pseudonyms) so they shouldn’t know which is the bot. They might still be able to work it out pretty quickly, if it plays very differently from the way humans play.

SunPowerGuru · May 21, 2020, 5:53pm

Our earliest known ancestors, Australopithecus, evolved over 4 million years ago, and were untouchable in their rain forest canopy habitat. When the climate changed and the rain forests dwindled, they were forced to spend more time on the ground, where they were vulnerable. For the first time, they were forced to cooperate in order to survive. Where one rock-throwing Australopith would end up in the belly of an angry lion, 100 of them working together could drive it off or even kill it. Lacking a language, this behavior must have been instinctual.

So, for the last 4+ million years, we have banded together in order to prevail against common foes. This behavior is baked in, and is as much a part of being human as walking upright.

Pluribus was, without doubt, seen as the enemy in those games. I have no doubt that the players were working together to some extent, even if they weren’t aware of it. Instinct is instinct, survival is survival.

I also have no doubt that Pluribus uses adaptive machine learning to formulate its strategies.

The only question I have, then, is: are strategies designed to beat a team as valid in other settings?

SunPowerGuru · May 21, 2020, 5:58pm

No, I’m almost certain they knew, at least in some of the games. In the very first Pluribus article I linked (in the other thread) one of the pros was quoted as saying how excited he was to have the chance to play against Pluribus. In the vid at the start of this thread, it is clearly identified, though it’s possible the other players didn’t see that info.

puggywug · May 21, 2020, 6:33pm

They may have known they would be playing an AI, but not necessarily which seat it was occupying, or how many seats.

SunPowerGuru · May 21, 2020, 6:42pm

Yeah, I’m not sure. The 10,000 hands were vs “10 pros.” Since it was 6-max, obviously not all at once. I do know that some early games were 5 copies of Pluribus vs 1 pro at a time. Was this 1 at a time, 5 at a time, something else? I’ll poke around and see if I can find the methodology.

1Warlock · May 21, 2020, 8:28pm

Personally, I’d like to see Pluribus play 250K hands at 25nl, 100nl, 200nl … vs humans who don’t know there is a bot on the tables. Let’s see how Pluribus’ winrate fares at various levels and see what particular exploits it learns at each stake. If that was done, then I think we would be able to glean useful information from it.

I practice vs AI on PokerSnowie. I know it’s not Pluribus-level programming but it is more than enough for my purposes. That program is an absolute terror to play - just ask Jungleman how he enjoyed the experience.

I know Linus Loeliger played vs 5 copies at once but I don’t know who else did.

SunPowerGuru · May 22, 2020, 8:13am

“In one experiment, Pluribus played 10,000 hands of poker against five human players selected randomly from the pool.”

I’m not sure if this was the 10,000 hands they released, but it seems likely. Since they volunteered to be part of this, they knew they would be facing a bot, but I don’t know if the bot was specifically identified at the table.

Interesting to note that they were playing for free chips.

Source: Inside Pluribus: Facebook’s New AI That Just Mastered the World’s Most Difficult Poker Game
