model roundup

DeepSeek V3.2

3 items · started 2026-04-25 · closed 2026-04-29

GPT-5.5 improves over GPT-5.4 and overtakes Opus 4.6 to take the 2nd place behind Gemini 3.1 Pro on the Extended NYT Connections Benchmark (www.reddit.com)

+5210 6w gpt-5 deepseek qwen+2

GPT-5.5: xhigh 94.0→97.5; high 93.6→96.9; medium 92.0→95.0; no reasoning 32.8→37.5. Kimi K2.6 improves over Kimi K2.5 (78.3→91.4) and becomes the #1 open-weights model. DeepSeek V4 Pro improves over DeepSeek V3.2 (50.2→75.7).

DeepSeek V3.2 looping bug: what settings / harness tweaks are actually reducing it in production? (www.reddit.com)

+11 6w tool-use deepseek agentic

I’m trying to isolate the looping / repetition issue some people have been reporting with DeepSeek V3.2 around April 2026, especially in agentic or tool-use setups on hosted providers like OpenRouter and SiliconFlow. Public model pages des…

Decreased Intelligence Density in DeepSeek V4 Pro (www.reddit.com)

+4233 6w gpt-5 deepseek gemini

In the V3.2 paper, they mentioned: Second, token efficiency remains a challenge; DeepSeek-V3.2 typically requires longer generation trajectories (i.e., more tokens) to match the output quality of models like Gemini 3.0-Pro. Future work wil…