Gemma 4 31B passed 7/8 real-world production tests — including ones I designed to make it fail. Full prompts + outputs.
I've been waiting for a capable free local LLM for a while. I think we're close — the quality is getting there fast, and Gemma 4 is the first open-weight model where I genuinely considered using it in production for simple-to-medium tasks.