model

gemma-4-31B-it

2640636 downloads·1903 likes·image-text-to-text·transformers

from the model card

Hugging Face | GitHub | Launch Blog | Documentation License: Apache 2.0 | Authors: Google DeepMind Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages. Featuring both Dense and Mixture-of-Experts (MoE) architectures, Gemma 4 is well-suited for tasks like text generation, coding, and reasoning. The models are available in four distinct sizes: E2B, E4B, 26B A4B, and 31B. Their diverse sizes make them deployable in environments ranging from high-end phones to laptops and servers, democratizing access to state-of-the-art AI. Gemma 4 introduces key capability and architectural advancements: Reasoning – All models in the family are designed as highly capable reasoners, with configurable thinking modes. Extended Multimodalities – Processes Text, Image with variable aspect ratio and resolution support (all models), Video, and Audio (featured natively on the E2B and E4B models). Diverse & Efficient Architectures – Offers Dense and Mixture-of-Experts (MoE) variants of different sizes for scalable deployment. Optimized for On-Device – Smaller models are …

discussions

Gemma 4 63 ongoing since 2026-04-12

recent items

Fine-tuning and deploying Gemma 4 is not that easy (ghost.oxen.ai via hn) 4 pts· 4h

Writing a fine-tuning and deployment pipeline isn't as easy as it looks (Gemma 4 Version) Fine-tune and deploy Gemma 4 on Oxen.ai Google's Gemma 4 dropped in April 2026 with multimodal support (text, image, video, audio), a novel hybrid KV…

↯ Gemma 4 fine-tuning gemma
Gemma 4-written, small cc0 encyclopedia of some core science content (stateofutopia.com via hn) 1 pts·1 replies· 7h

Published: April 16, 2026 This is an encyclopedia of some core content from Biology and Health Sciences, Physical Sciences, and Technology. It contains 2,259 small entries of about a paragraph each.

↯ Gemma 4 gemma
why gemma 4 31b so bad in long context? (www.reddit.com via reddit) 7 pts·17 replies· 10h

question, I'm using it for text translations and on each large prompt (20K+) it stops with a remark 'now I'm going to put that to the file' or some other operation I have asked in the prompt for but it did nothing, just stopped. I'm runnin…

↯ Gemma 4 gemma
gemma-4-31B-it thinking? (www.reddit.com via reddit) 2 replies· 8h

I can't get my model to think. According to the documentation, thinking should be triggered by starting the system prompt with a '<|think|>' string.

↯ Gemma 4 vllm gemma
LiteRT LM Framework with Rockchip NPU (RKNN 3588) (www.reddit.com via reddit) 1 pts· 12h

Im searching for build version of LiteRT LM framework can use and utilize the NPU of the RKNN 3588. It would be great since I can run gemma 4 e2b model using this framework on the machine, because I wont have to migrate my codebase from li…

↯ Gemma 4 llama gemma
Need suggestions for local AI Machine (www.reddit.com via reddit) 11 replies· 12h

I’ve been running various AI harnesses like OpenClaw, ForgeCode, ClaudeCode, etc. Most of these are running via OpenRouter or Minimax (credits/subscription model).

↯ Gemma 4 minimax openclaw qwen+1
gemma4 e2b ore4b on rtx 5070 ti laptop 12GB not running on vLLM (www.reddit.com via reddit) 3 replies· 12h

I cant get gemma 4 e2b or gemma 4 e4b to run on my laptop. I am runnning it via docker as per vllm website and i get the error : Free memory on device cuda:0 (9.71/11.5 GiB) on startup is less than desired GPU memory utilization (0.9, 10.3…

↯ Gemma 4 vllm gemma
gemma4 e4b on rtx 5070 ti laptop 12GB running slow 5t/s llama.cpp (www.reddit.com via reddit) 9 replies· 13h

I hope sincerely someonecan help me because i have tried everything i can and i get this speed using ollama.cpp and opencode. I have put as detail i can my setup and how i am running it.

↯ Gemma 4 ollama llama gemma
Llama.cpp vs LM Studio on gaming PC (www.reddit.com via reddit) 6 pts·6 replies· 20h

Here is my experience, I've been using LM Studio with RTX 5080 and 64GB RAM using Windows 11. I'm very happy with LM Studio except the speed.

↯ Gemma 4 qwen llama gemma
Gemma 4 Jailbreak System Prompt (www.reddit.com via reddit) 446 pts·111 replies· 1d

Use the following system prompt to allow Gemma (and most open source models) to talk about anything you wish. Add or remove from the list of allowed content as needed.

↯ Gemma 4 jailbreak security gemma
Google Gemma 4 Runs Natively on iPhone with Full Offline AI Inference (www.gizmoweek.com via hn) 248 pts·160 replies· 1d

Google Gemma 4 Runs Natively on iPhone With Full Offline AI Inference - GizmoWeek GizmoWeek Read the News News Reviews Apple How to Phones Products Subscribe Subscribe to newsletter [x] I've read and accept the Privacy Policy. Follow us Fa…

↯ Gemma 4 gemma
5090 for 285k on amazon india? (amzn.in via reddit) 6 replies· 18h

How is it possible the seller also has no record just wanted to run gemma 4 31B q4 with 150k ctx

↯ Gemma 4 gemma
How many move your favorite LLM model before it's cheat then brain-dead in chess game ? (www.reddit.com via reddit) 6 replies· 19h

I try with Gemma 4 E4B via llama-sever to play chess at https://www.chess.com/play/computer (any platform or site you convenient), result quite unexpected for me. Result: 9 moves before it make cheating move (like try to move a pawn take a…

↯ Gemma 4 llama gemma
(llama.cpp) Possible to disable reasoning for some requests (while leaving reasoning on by default)? (www.reddit.com via reddit) 11 pts·10 replies· 1d

I am running unsloth/gemma-4-26B-A4B-it-GGUF/gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf with llama-server (with reasoning enabled). Is it possible to disable reasoning for some requests only?

↯ Gemma 4 llama gemma
Issues with Gemma 4 tool calling - abrupt gen ending despite the model telling me it wants to do X. (www.reddit.com via reddit) 3 pts·9 replies· 1d

Hello, I have noticed an annoying issue with Gemma 4 26b a4b. It seems like it cannot do multiple think->tool call->think->tool call turns.

↯ Gemma 4 gemma
Gemma 4 on iOS: Anyone else stuck on CPU because of the “Buffer(31) Metal Crash? (www.reddit.com via reddit) 22h

Gemma 4 on iOS: Anyone else stuck on CPU because of the "Buffer(31)" Metal crash? Hey everyone, I’m hitting a massive performance wall building an on-device AI app for the iPhone 17 Pro.

↯ Gemma 4 gemma
Ive automated my email/sms/phone (www.reddit.com via reddit) 8 pts·15 replies· 1d

we got it good boys! how many of you are doing this??

↯ Gemma 4 gemma agentic
Experience with medium sized LLMs (www.reddit.com via reddit) 1 pts·6 replies· 1d

I have tried to use several models on my 8gb ram MacBook and concluded that 4b parameters models are just “stupid” for my tasks (i.e. summarisation of pdfs, language learning, etc.).

↯ Gemma 4
Ollama Cloud - Pro (www.reddit.com via reddit) 2 pts·1 replies· 1d

Hi. I've been looking at ollama cloud's Pro offering ($20), which says "Run 3 cloud models at a time".

↯ Gemma 4 minimax ollama openclaw+1
Gemma 4 running locally on an iPhone 13 Pro (www.reddit.com via reddit) 4 pts·9 replies· 1d

I’ve been experimenting with running LLMs fully on-device, and managed to get Gemma 4 running locally on an iPhone 13 Pro. This is built on top of a lightweight Swift wrapper I open-sourced: https://github.com/mylovelycodes/LiteRTLM-Swift…

↯ Gemma 4 gemma
For those running an OpenClaw instance, how do you manage sandboxing and prevention of unwanted behavior? (www.reddit.com via reddit) 5 replies· 1d

Right now, I'm working on a small app to help eliminate my own doomscrolling by automatically crawling sites and summarizing news articles. However, I don't like the idea of giving OpenClaw free reign of my system, nor giving it any sort o…

↯ Gemma 4 prompt-injection security openclaw
Gemma 4 is good or bad at real word (www.reddit.com via reddit) 6 replies· 1d

Based on real-world usage by the community, roughly which version of which model is Gemma 4 comparable to? It would be great if you could also mention the hardware requirements for running it (like VRAM or GPU needs)

↯ Gemma 4 gemma
Show HN: Running Gemma 4 on an iPhone 13 Pro (github.com via hn) 1 pts· 1d

I just open-sourced how https://github.com/mylovelycodes/LiteRTLM-Swift LiteRTLM-Swift lets you run LLMs locally with a clean Swift API. - On-device inference - No cloud required - Built for iOS

↯ Gemma 4 gemma
Offload settings for unsloth/Gemma-4 on Apple Silicon? (www.reddit.com via reddit) 1 replies· 1d

Can default settings be optimized, or is it the best it is going to get? M1 Max Is it best in llama.cpp, LM Studio, or ?

↯ Gemma 4 llama gemma
What's the better way to install llama.cpp on Android? (www.reddit.com via reddit) 1 pts·2 replies· 1d

I own an Oppo Find X3 Pro (Snapdragon 888, 12/256 GB, Android 14.0) unused because of 3 green vertical lines on the screen and poor battery. I tried Google AI Edge Gallery with Gemma-4-E2B-it and it performs well so I thinked: "why don't t…

↯ Gemma 4 llama gemma
Is Gemma 4 26B MoE or 31B good as an MCP agent for coding with Xcode? (www.reddit.com via reddit) 1 replies· 1d

Thanks

↯ Gemma 4 moe gemma mcp
What are your opinions on the SuperGemma finetune? (www.reddit.com via reddit) 6 replies· 1d

So, I'm relatively new to the scene and I kind of want to do a sanity check. I've been using gemma-4-26B.

↯ Gemma 4 gemma
Local models capabilities (www.reddit.com via reddit) 1 pts·4 replies· 1d

Claude CLI, Codex CLI and Gemini CLI, all have agentic capabilities that it is capable of editing files or folders in my local machine directly or the apps that I have integrated using MCPs when working on my request like coding task or re…

↯ Gemma 4 gemini agentic codex+1
Fixed: IPEX-LLM + modern Ollama models (qwen3, gemma4) on Intel Arc 140V Lunar Lake Windows 11 — undocumented solution (www.reddit.com via reddit) 2 pts·4 replies· 2d

Been trying to run local LLMs on my new Dell XPS 13 with Intel Arc 140V (Lunar Lake, 16GB) and hit a wall — Intel's official docs point to a portable zip frozen at Ollama v0.5.4 which can't pull any modern model. Spent a while debugging it…

↯ Gemma 4 ollama
Local Agent Hermes setup with Gemma 4 and llama.cpp (www.youtube.com via reddit) 2d

About Press Copyright Contact us Creators Advertise Developers Terms Privacy Policy & Safety How YouTube works Test new features NFL Sunday Ticket © 2026 Google LLC

↯ Gemma 4 llama gemma

← all models