model roundup

DeepSeek 3.2

3 items · started 2026-05-04 · closed 2026-05-08

Has anyone tried Zyphra 1 - 8B MoE? (www.reddit.com)

+61 7w moe gpt-5 deepseek

https://x.com/ZyphraAI/status/2052103618145501459?s=20 Today we're releasing ZAYA1-8B, a reasoning MoE trained on u/AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size o…

As MTP prepares to land in llama.cpp, Models that support MTP (www.reddit.com)

+1516 7w llama

DeepSeekv3 OG, DeepSeekv3.2/4, Qwen3.5, GLM4.5+, MiniMax2.5+, Step3.5Flash, Mimo v2+. Until we get MTP weights, you need to download the HF weights and convert them to GGUF. I think I'm going to try either qwen3.5-122b or glm4.5-air first.

Built a tiny router so Cursor stops showing "usage limit reached" at 3pm. Sonnet auto-falls to Haiku, you keep working (www.reddit.com)

1 7w haiku deepseek sonnet+3

Cursor's custom-OpenAI URL feature is what makes this work. Pointed it at a router I built.
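
The post does not show the router itself, so the following is only a rough sketch of the fallback idea it describes: try the preferred model first and, if that backend reports a usage limit, retry on the next one. Everything named here (`UsageLimitError`, `route_with_fallback`, `fake_send`, and the plain "sonnet"/"haiku" labels) is hypothetical and is not taken from the author's code or from any Cursor or provider API.

```python
from __future__ import annotations

from typing import Callable, Sequence


class UsageLimitError(Exception):
    """Hypothetical error a backend raises when its quota is exhausted."""


def route_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order and return (model_used, reply)."""
    last_error: Exception | None = None
    for model in models:
        try:
            return model, send(model, prompt)
        except UsageLimitError as err:
            # Quota hit on this model: remember why, then try the next one.
            last_error = err
    raise RuntimeError("all models are rate-limited") from last_error


def fake_send(model: str, prompt: str) -> str:
    # Stand-in backend (assumption): pretend the "sonnet" quota is used up.
    if model == "sonnet":
        raise UsageLimitError("usage limit reached")
    return f"[{model}] {prompt}"


used, reply = route_with_fallback("hello", ["sonnet", "haiku"], fake_send)
print(used, reply)
```

In a real router, `send` would be replaced by whatever function forwards the request to the upstream provider, and the usage-limit check would depend on how that provider actually signals an exhausted quota.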