model roundup

Mistral 3.5

4 items · started 2026-04-30 · closed 2026-05-04

[Help] Running big dense models faster (www.reddit.com)

+26 5w mistral vllm qwen+1

I have been trying Mistral 3.5 on my 4x RTX 3090 rig with llama.cpp. Inference is slow (about 11 t/s) even without anything being offloaded to the CPU.

Unsloth solved bug in Mistral Medium 3.5 implementation (www.reddit.com)

+436 5w mistral llama

https://unsloth.ai/docs/models/mistral-3.5 "May 1, 2026 Update: We worked with Mistral to fix Mistral Medium 3.5 inference affecting some implementations, and released updated GGUFs with the fix (NOT related to Unsloth or our quants). The…

Terminal Bench score for Mistral 3.5 Medium (www.reddit.com)

+610 5w mistral agentic

So... there were a couple promising benchmark scores reported by mistralai in the model card for Mistral 3.5 Medium, BUT there wasn't the one that I usually care about the most, which is TerminalBench 2.0.

Is Mistral-3.5-Medium-128B broken in Llama CPP? (www.reddit.com)

+14 5w swe-bench mistral vllm+1

Trying some of Bartowski's Q4 quants. Using Vulkan with the latest main branch as of a few hours ago.