How much faster is Gemma 4 26B-A4B during inference vs 31B?

reddit-localllama · www.reddit.com

I want to download one of them. I usually run inference on the CPU because my GPU is old, so I'm concerned about speed. From one source on the web (I posted the link before and that post was removed): Multiple users are reporting that Gemma 4's MoE model (26B-A4B) runs sign…