model roundup

Haiku 4.5

4 items · started 2026-05-19 · closed 2026-05-25

Created an LLM quiz program to check if AIs' performance varies over time (www.reddit.com)

+22 4w haiku

I've been noticing an increasing number of posts and comments on Reddit claiming that LLM models are either becoming dumber over time or have varying performance throughout the day. I tried to find long-form, over-time performance graphs o…

Built an AI flat-finder in a weekend. Indian rental sites are 70% broker spam so I scraped Reddit instead. (www.reddit.com)

+14 5w haiku sonnet anthropic

Weekend build, ~10 hours. Demo: https://trurent-five.vercel.app/ Problem I was poking at: every major Indian rental site (NoBroker, MagicBricks, 99acres) is infested with brokers even when you filter "direct owner." Reddit actually has hon…

I Made LLMs Play Texas Hold’em. The Smallest Model Beat a ~1T Model by Being Too Dumb to Fold (www.reddit.com)

+71 5w minimax haiku anthropic

Made LLMs play Texas Hold’em against each other. 6 models at the table: a tiny 1.2B running locally on my 16GB MacBook, a couple mid-size ones, and cloud models going up to about 1 trillion parameters.

My 1.2B model won 2 out of 5 poker tournaments against models up to 1T params. (www.reddit.com)

+1 5w minimax haiku qwen

I made 6 LLMs play Texas Hold’em against each other. Ran 5 tournaments on my 16GB MacBook.