(llama.cpp) Possible to disable reasoning for some requests (while leaving reasoning on by default)?

reddit-localllama · www.reddit.com ·11 pts·10 replies ↗ ·1d

I am running unsloth/gemma-4-26B-A4B-it-GGUF/gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf with llama-server (with reasoning enabled). Is it possible to disable reasoning for some requests only?

llamagemma

open →

← back to top