model roundup
Gemini 3.1
-
I was trying to find a problem in my math-heavy code and asked an agent (Gemini 3.1) to find the issue. Often, when I know it’s a hard problem, I let it be and go get coffee or lunch.
-
GPT 5.5 - Strong, not mind-blowing, but very token efficient (www.reddit.com)
I've been benching GPT-5.5 for the past couple days and would like to share my findings. This is based on a benchmark I've created that pits models against each other in autonomous games of Blood on the Clocktower - a highly complex social…
-
Cursor (again) not working with Gemini 3.1 API (www.reddit.com)
Last week it was broken, then they "fixed" something a few days ago. Now it's broken again...
-
The Significance of Google's recent TPU 8t and TPU 8i (www.reddit.com)
Cost & Performance Efficiency: Training Cost-Performance (8t): +170% to +180% gain (2.7x–2.8x); Inference Cost-Performance (8i): +80% gain; Training Power Efficiency (8t): +124% gain in performance-per-watt; Inference Power Efficiency (8i): +1…
-
I’ve been using Cursor for ~1.5 years, mainly with Gemini 3.1 Pro. Recently I ran into a serious pricing issue.
-
Real benchmark breakdown in AI agents (www.reddit.com)
I dove deep into the most recent benchmark stats for GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro via official reports & third-party evaluations. I found an interesting thing: there’s no such thing as a “one-size-fits-all model.” My finding…
-
Kimi K2.6 - the mighty turtle that wins the race (www.reddit.com)
Hi folks, I've been benching Kimi K2.6 for the past few days, and I'd like to share my findings. For context, this is based on a benchmark I've created that pits models against each other in autonomous games of Blood on the Clocktower - a…
-
Major performance jump though. Worth it?
-
Show HN: Gemini Plugin for Claude Code (github.com via hn)
I built a plugin that lets Claude Code delegate work to Gemini CLI. I started this after finding myself reaching for Gemini more often on long context repo work.