model roundup

DeepSeek 3.2

3 items · started 2026-05-04 · closed 2026-05-08

  1. https://x.com/ZyphraAI/status/2052103618145501459?s=20 Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size o…

  2. DeepSeekv3 OG, DeepSeekv3.2/4, Qwen3.5, GLM4.5+, MiniMax2.5+, Step3.5Flash, Mimo v2+. Until we get MTP weights, you need to download the HF weights and convert them to GGUF. I think I'm going to try either qwen3.5-122b or glm4.5-air first.

  3. Cursor's custom-OpenAI URL feature is what makes this work. Pointed it at a router I built.
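
A rough sketch of what a router like the one in item 3 might do, assuming it picks an OpenAI-compatible upstream by model name. Nothing here comes from the original post: the route table, the localhost URLs, and the names `ROUTES`, `DEFAULT_UPSTREAM`, `pick_upstream`, and `route` are all hypothetical, and a real router would also forward the HTTP request, headers, and streaming responses.

```python
# Hypothetical routing table: model-name prefix -> OpenAI-compatible base URL.
# These prefixes and ports are illustrative assumptions, not from the thread.
ROUTES = {
    "qwen": "http://localhost:8001/v1",
    "glm": "http://localhost:8002/v1",
}
DEFAULT_UPSTREAM = "http://localhost:8000/v1"


def pick_upstream(model: str) -> str:
    """Return the base URL of the first route whose prefix matches the model name."""
    name = model.lower()
    for prefix, base_url in ROUTES.items():
        if name.startswith(prefix):
            return base_url
    return DEFAULT_UPSTREAM


def route(request_body: dict, path: str = "/chat/completions") -> tuple[str, dict]:
    """Map an OpenAI-style request body to (target URL, unchanged body)."""
    base = pick_upstream(request_body.get("model", ""))
    return base.rstrip("/") + path, request_body


if __name__ == "__main__":
    url, _ = route({"model": "glm4.5-air", "messages": []})
    print(url)
```

In this sketch the client would set its custom OpenAI base URL to wherever the router listens, and the router would decide which backend serves each request based on the `model` field.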
