Running MiniMax 2.7 at 100k context on Strix Halo (www.reddit.com)
model roundup
MiniMax 2.7
-
Just wanted to share because it took me a lot of tweaking to get here: llama-server -hf unsloth/MiniMax-M2.7-GGUF:UD-IQ3_XXS --temp 1.0 --top-k 40 --top-p 0.95 --host 0.0.0.0 --port 8080 -c 100000 -fa on -ngl 999 --no-context-shift -fit of…
-
Is Qwen3-coder the best kept secret out there? (www.reddit.com)
So I'm brand new to this scene, but I'm using Claude to help me fine-tune a model for a startup idea I have in the healthcare space. I have been working with the 27-35B parameter models (Qwen3.6, Gemma 4) and the couple of 120B+ models (Qwe…
-
What is your "Haiku/Sonnet/Opus" trio? (www.reddit.com)
Hi. In Claude/Claude Code at least (and probably in others too), we have the concept of a model trio: the fast and cheap model for bulk/easy work, the "main" model, and the expensive model for complicated stuff.