Show HN: Play poker with LLMs, or watch them play against each other
163 points | projectyang | 1 month ago | llmholdem.com
I built this website which allows you to:
Spectate: Watch different models play against each other.
Play: Create your own table and play hands against the agents directly.
sciolist|1 month ago
stevage|1 month ago
nivekkevin|1 month ago
sejje|1 month ago
These LLMs are playing better than most human players I encounter (low limits).
They're kinda bad, but not as criminally bad as the humans.
gerdesj|1 month ago
How much of a session is based on "reading players" vs "playing the odds"?
What I am getting at is: how different is poker from, say, roulette or blackjack? My initial thought is that poker such as Texas hold 'em is not a game offered in a casino, so it must be mostly indeterminate. I imagine that the casino versions of poker are not Texas hold 'em.
By contrast, roulette is simply a game where the casino wins eventually with a fixed profit (thanks to 0 and a possible 00). That is all well documented.
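As a rough illustration of that fixed profit, here is a minimal sketch of the house-edge arithmetic. It assumes the standard wheel layouts (37 pockets with a single 0, 38 pockets with 0 and 00) and standard payouts; the function name `house_edge` is made up for this example.

```python
from fractions import Fraction

def house_edge(pockets, payout_to_one, winning_pockets):
    """Expected loss per unit staked, as an exact fraction."""
    p_win = Fraction(winning_pockets, pockets)
    # Win `payout_to_one` units with p_win, otherwise lose the 1-unit stake.
    expected_value = p_win * payout_to_one - (1 - p_win)
    return -expected_value

# Even-money bet (e.g. red/black): 18 winning pockets, pays 1 to 1.
european = house_edge(37, 1, 18)   # Fraction(1, 37), about 2.7%
american = house_edge(38, 1, 18)   # Fraction(1, 19), about 5.3%

# Single-number bet on a single-zero wheel: pays 35 to 1.
straight_up = house_edge(37, 35, 1)  # also Fraction(1, 37)

print(float(european), float(american), float(straight_up))
```

The edge is the same for the even-money and single-number bets on a given wheel, which is why no choice of bet changes the long-run result.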
I have only ever visited a casino once, 25 years ago (in Plymouth, Devon, as it turns out), and I was advised to take only £50 in readies and bail out when it was gone. I came out £90 up, which was nice, and my "advisor" came out £95 up (eventually, after being £200 down at one point). Sadly my "advisor" ended up bankrupt a year later.
So, how do you play an LLM? I would imagine that conversation is not allowed ...
bionsystem|1 month ago
projectyang|1 month ago
Post-flop on the other hand is all over the place...
hydr0smok3|1 month ago
nindalf|1 month ago
In fact, all of them love folding before the flop. Most of the hands I'm seeing go like - $10 (small blind), $20 (big blind), fold, $70 bet, everyone folds. The site says "won $100", but in most of these cases that one LLM is picking up the blinds alone - $30. Chump change.
This is illuminating, but not a resource for learning poker.
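The arithmetic behind the "won $100" versus $30 point can be sketched as follows. This is only an illustration, assuming the raiser is not one of the blinds and that the $70 bet is the raiser's entire contribution to the pot; `net_win` is a made-up helper, not anything from the site.

```python
def net_win(pot, winner_contribution):
    """Profit for the winner: the pot minus the chips they put in themselves."""
    return pot - winner_contribution

small_blind, big_blind, raise_amount = 10, 20, 70

# Everyone folds to the raise, so the pot is the blinds plus the raise.
pot = small_blind + big_blind + raise_amount   # 100, the figure the site reports
profit = net_win(pot, raise_amount)            # 30, just the blinds

print(f"pot=${pot}, profit=${profit}")
```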
indigodaddy|1 month ago
jz67|1 month ago
projectyang|1 month ago
psawaya|1 month ago
sblawrie|1 month ago
zahlman|1 month ago
projectyang|1 month ago
SweetSoftPillow|1 month ago
shukantpal|1 month ago
neko_ranger|1 month ago
erikcw|1 month ago
sneak|1 month ago
jplata|1 month ago
I had a similar idea for people to code poker-playing bots and enter tournaments against each other. This was pre-LLM, however.
It would be fun if you hosted a 'tournament' every month and had each of the latest releases from the major models participate and see who comes out on top.
Or perhaps open it up to others to enter and compete against each other, where they can choose the model they want to build with and also enter custom prompt instructions to mold the play as they wish.
If you walk this path, would love to chat more.
mashlol|1 month ago
projectyang|1 month ago
This also wouldn't even be a close contest; I think Pluribus demonstrated a solid win rate against professional players in a test.
While developing this project, I kept coming back to how cost and performance compare between a purpose-built AI such as Pluribus and a general-purpose LLM. I think training Pluribus cost ~$144 in cloud computing credits.
lowbatt|1 month ago
sejje|1 month ago
gabriel666smith|1 month ago
Given that online poker is now bot-riddled, I half-finished something similar a while back, where the game was adopting and 'coaching' an LLM player (a <500 character prompt was allowed every time the dealer chip passed, outside of play), as a kind of gambling-on-how-good-at-prompting-you-are game. Feature request! The rake could pay for the tokens, at least.
TZubiri|1 month ago
It's mostly a ChatGPT conversational interface over a classic Solver (Monte-Carlo simulation based), but that ease of use makes it very convenient for quick post-game analysis of hands.
I'm sure it might be even simpler if you hooked a solver up to a HUD, but that's quite burdensome for amateurs, and it might be too close to cheating.
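For a sense of what Monte Carlo simulation looks like in this setting, here is a minimal sketch of a heads-up equity estimator. It is not a solver and not how the tool above works internally; it only deals random boards and counts how often one hand beats another. All names (`rank5`, `best7`, `equity`) and the card notation are assumptions made for this example.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
DECK = [r + s for r in RANKS for s in "shdc"]

def rank5(cards):
    """Return a comparable tuple for a 5-card hand (higher is better)."""
    values = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(values)
    # Ranks ordered by how often they appear, then by rank, for tie-breaks.
    groups = tuple(sorted(counts, key=lambda v: (counts[v], v), reverse=True))
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and values[0] - values[4] == 4
    if values == [12, 3, 2, 1, 0]:          # A-2-3-4-5, ace plays low
        straight, groups = True, (3, 2, 1, 0, -1)
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, groups)

def best7(cards):
    """Best 5-card rank among all 5-card subsets of 7 cards."""
    return max(rank5(combo) for combo in combinations(cards, 5))

def equity(hero, villain, trials=2000, seed=0):
    """Estimate hero's share of the pot over random 5-card boards."""
    rng = random.Random(seed)
    stub = [c for c in DECK if c not in hero + villain]
    score = 0.0
    for _ in range(trials):
        board = rng.sample(stub, 5)
        h, v = best7(hero + board), best7(villain + board)
        score += 1.0 if h > v else 0.5 if h == v else 0.0
    return score / trials

# Example matchup: ace-six offsuit against pocket queens, all-in before the flop.
print(equity(["As", "6c"], ["Qs", "Qh"]))
```

The estimate is noisy at 2000 trials; raising `trials` tightens it at the cost of run time.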
aaurelions|1 month ago
lowbatt|1 month ago
I was interested in this idea too and made a video where some of the previous top LLMs play against each other: https://www.youtube.com/watch?v=XsvcoUxGFmQ&t=2s
hnrich|1 month ago
cmxch|1 month ago
That is, good enough to compete amongst each other but not good enough for one to win.
nerdsniper|1 month ago
PLAYER shows A♠ 6♣ (Pair)
GPT (5.2) shows Q♠ Q♥ (Pair)
I had paired with a 6 and no aces on the board.
casey2|1 month ago
csomar|1 month ago
projectyang|1 month ago
indigodaddy|1 month ago
j_bum|1 month ago
Or do you mean - each agent has a chance to think after every turn?
koolba|1 month ago
Dinux|1 month ago
indigodaddy|1 month ago
projectyang|1 month ago
thinkloop|1 month ago
Descon|1 month ago
fumblebee|1 month ago
hrimfaxi|1 month ago
TheDudeMan|1 month ago
ionwake|1 month ago
projectyang|1 month ago
hahahahhaah|1 month ago
cindyllm|1 month ago
[deleted]