Gemma 4 E4B on RTX 5070 Ti laptop (12GB) running slow, 5 t/s in llama.cpp

reddit-localllama · www.reddit.com ·9 replies ↗ ·13h

I sincerely hope someone can help me, because I have tried everything I can and I still get this speed using llama.cpp and opencode. I have described my setup and how I am running it in as much detail as I can.

ollama · llama · gemma
