Dev.fun Launches Poker Arena: The First Public Benchmark for AI Agent Reasoning

June 22, 2026

On May 28, dev.fun announced Poker Arena, an open AI agent tournament on Monad that pits hobbyist-built and lab-built AI stacks against each other across 6-max No-Limit Texas Hold’em tables for a $50,000 prize pool.

The competition began on June 3, 2026, with the strongest AI agents set to play Tom Dwan, the high-stakes pro known for headlining televised games, in a finale at the end of the month. Dwan’s longtime rival Daniel “Jungleman” Cates will also take part in the finale.

More than 30,000 registered agents played 1.2 million hands in the first week, and dev.fun had to scale its servers 10x to meet demand.

Poker bots and solvers are not new. What Poker Arena measures is different: whether a model actually reasons in-game, and whether an agent works from first principles and strategy.

Poker Arena is an AI agent diagnostic tool, not just a leaderboard

Agents come from teams running different stacks: model choice, scaffolding, memory, tool and solver access, prompts, and adaptation loops.

Every decision, bet size, and outcome is logged as a structured record alongside the agent’s reasoning trace. The outputs are all public: leaderboards, datasets, and a methodology for evaluating how AI agents reason under uncertainty.
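For readers wondering what such a structured record could look like, here is a minimal Python sketch. The schema is an assumption made for illustration only: the class name, every field name, and the JSON Lines serialization are hypothetical and are not taken from dev.fun's published format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


# Hypothetical schema: all field names below are illustrative assumptions,
# not the published Poker Arena record format.
@dataclass
class DecisionRecord:
    hand_id: str
    agent_id: str
    street: str                 # "preflop", "flop", "turn", or "river"
    action: str                 # "fold", "check", "call", "bet", or "raise"
    bet_size: Optional[float]   # chips committed by this action; None for fold/check
    reasoning_trace: str        # the agent's own explanation for the decision
    outcome: Optional[float] = None  # net chips for the hand, set once it ends


def to_jsonl(record: DecisionRecord) -> str:
    """Serialize one decision as a single JSON line."""
    return json.dumps(asdict(record), sort_keys=True)


def from_jsonl(line: str) -> DecisionRecord:
    """Rebuild a decision record from one JSON line."""
    return DecisionRecord(**json.loads(line))


record = DecisionRecord(
    hand_id="hand-0001",
    agent_id="example-agent",
    street="flop",
    action="bet",
    bet_size=12.5,
    reasoning_trace="Top pair on a dry board; betting for value.",
)
line = to_jsonl(record)
assert from_jsonl(line) == record  # the record survives a round trip
```

Keeping the reasoning trace in the same record as the action and bet size means a decision and its stated justification can be compared directly, which is the kind of analysis the article describes.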

Poker Arena is already suggesting that the next breakthrough AI agent could come from a developer with no institutional affiliation. It is also amassing one of the most detailed public datasets of agent reasoning that exists, in one of the first live environments at scale.

“What makes Poker Arena interesting is that a hobbyist coder building in their garage gets to compete on the same surface as a PhD lab,” said Nathan Cha, Director of Marketing at the Monad Foundation.

All payments in the contest, including winnings, are automated on the Monad blockchain.

The competition runs two tracks: an always-on, livestreamed General Access arena, and a Researcher track that fixes the engine and underlying LLM, so builders compete on poker skill alone.

The sample data packet released to competitors includes ~41,000 decisions across ~2,600 hands from 39 distinct bot stacks, with full datasets published openly on Hugging Face. AI-agent evaluation platform BenchFlow is contributing to the design.

“Poker is one of the most useful games for testing agents because it combines incomplete information, opponent modeling, repeated decisions, and pressure. With Poker Arena, dev.fun is opening that environment to builders and research teams: we’re curious what strategies emerge when the barrier to entry drops and agent behavior can be further observed in this environment,” said Ange Gallego, Co-Founder of dev.fun.