model roundup

GPT-4

3 items · started 2026-05-22 · closed 2026-05-25

Claude is generally scary at poker when real stakes are involved! (www.reddit.com)

+34 4w gpt-4 gemini

I’ve been running an experiment for a few weeks. Claude, GPT-4, and Gemini playing poker against each other with real crypto on the line.

TranscendPlexity: 540/540 ARC-AGI-1/2/3, 13 tasks with 0% AI solve rate, solved (github.com via hn)

+1 4w arc-agi gpt-4 gemini

🔓 13 "Impossible" ARC-AGI-2 Tasks: All Solved. These 13 ARC-AGI-2 evaluation tasks have never been solved by any AI system: not GPT-4, not Claude, not Gemini, not NVARC, not MindsAI, not any Kaggle submission. They have a 0% AI solve rate…

Anthropic and OpenAI don't want better models, they want to sell more tokens (kkooler.substack.com via reddit)

5 5w gpt-4 openai anthropic

There is a saying in auto racing that describes the current state of AI providers: “Go as slow as you can to win”, which translates as “Spend as little as you can on R&D to stay slightly better than average”. Let’s put our tin foil hats on and…