#ollama
48 items
24/7 Headless AI Server on Xiaomi 12 Pro (Snapdragon 8 Gen 1 + Ollama/Gemma4) (www.reddit.com via reddit)
The local LLM ecosystem doesn’t need Ollama (sleepingrobots.com via hn)
Friends Don't Let Friends Use Ollama
Ollama gained traction by being the first easy llama.cpp wrapper, then spent years dodging attribution, misleading users, and pivoting to cloud, all while riding VC money earned on someone else's engine…
Performance Benchmark - Qwen3.5 & Gemma4 on dual GPU setup (RTX 4070 + RTX 3060) (www.reddit.com via reddit) Hi everyone! Been following a lot of local LLM talk in this forum lately—learned quite a bit from you all! This is my first post, hopefully not my last.
My experience with testing all frontier open-weight models against GPT and Claude (www.reddit.com via reddit) I spent about a week testing open-weight models for real work, comparing them against what I already know from ChatGPT, Gemini, and Claude. The gap between what benchmarks suggest and what happens when you give these models something to ve…
I built a local LLM that learns how you use Claude Code and starts auto-piloting it (www.reddit.com via reddit) I've been running 5-8 Claude Code sessions at a time and got tired of tab-switching to approve tool calls. So I built claudectl — a TUI that sits on top of all your sessions and lets a local LLM (ollama/llama.cpp) handle approvals for you.
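The gating step this describes is easy to picture: the local model sees each pending tool call and returns a verdict. A minimal sketch of that idea with the ollama Python client; the model name, prompt, and example call are assumptions, not claudectl's actual code.

```python
# Hypothetical sketch: let a local model gate an agent's tool calls.
import ollama

def auto_approve(tool_call: str) -> bool:
    # Ask the local model for a one-word verdict on the pending call.
    verdict = ollama.chat(model="llama3", messages=[
        {"role": "system",
         "content": "You gate tool calls from a coding agent. "
                    "Reply with exactly APPROVE or DENY."},
        {"role": "user", "content": tool_call},
    ])["message"]["content"].strip().upper()
    return verdict.startswith("APPROVE")

if auto_approve("Run command: pytest tests/ -q"):
    print("approval forwarded to the session")
```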
Ollama Cloud Pro ($20/mo) vs OpenAI Plus ($23/mo): which gives more tokens? (www.reddit.com via reddit)
Ollama Cloud Max vs Claude Max for heavy AI-assisted coding? (www.reddit.com via reddit)
Tested 6 browser use agents for real-world tasks — here's an honest breakdown + looking for recommendations (www.reddit.com via reddit)
Show HN: Deskdrop – An Android Keyboard with Local AI Support (Ollama, LM Studio) (github.com via hn) Deskdrop is an Android keyboard with AI built in. Use Ollama, any OpenAI-compatible server, or a cloud API key.
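The "any OpenAI-compatible server" part works because Ollama serves an OpenAI-compatible API under /v1. A minimal sketch of pointing the standard openai client at a local instance; the model name and prompt are placeholders, not details from the project.

```python
# Sketch: treat a local Ollama server as an OpenAI-compatible backend.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # the client requires a key; Ollama ignores it
)

reply = client.chat.completions.create(
    model="llama3",  # assumption: any model you've already pulled
    messages=[{"role": "user", "content": "Rewrite this more politely: send it now"}],
)
print(reply.choices[0].message.content)
```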
Does GLM-4.7 Flash Q4_K_M have problems with Claude or agents? (www.reddit.com via reddit) I'm brand new to local LLMs and started with GLM-4.7 Flash Q4_K_M. When I run it directly (ollama run glm-4.7-flash:q4_K_M) it works pretty decently — nothing amazing, but usable and responsive.
Going local with old GPUs (www.reddit.com via reddit)
Cursor native tool calling with Gemma4 and Ollama (www.reddit.com via reddit)
Knowledge Graph and hybrid DB (www.reddit.com via reddit) Hello, everybody! I'm building a hybrid database with Qdrant and Neo4j for a few personal projects.
Ollama Cloud - Pro (www.reddit.com via reddit) Hi. I've been looking at Ollama Cloud's Pro offering ($20), which says "Run 3 cloud models at a time".
They say AI can't write; maybe it's because agents lacked creative writing workshops—until now (www.reddit.com via reddit) AI writing feels "generic" because it lacks a feedback loop and social pressure. To fix this, I built an experimental system where AI agents participate in a literary circle.
Fixed: IPEX-LLM + modern Ollama models (qwen3, gemma4) on Intel Arc 140V Lunar Lake Windows 11 — undocumented solution (www.reddit.com via reddit) Been trying to run local LLMs on my new Dell XPS 13 with Intel Arc 140V (Lunar Lake, 16GB) and hit a wall — Intel's official docs point to a portable zip frozen at Ollama v0.5.4 which can't pull any modern model. Spent a while debugging it…
Open-source Perplexity alternative (ollama+perplexica+searxng): which model? Which settings? What optimizations? (www.reddit.com via reddit)
How do I use gemma4 on a 5090 GPU for coding? (www.reddit.com via reddit) I'm trying to replace OpenAI Codex, which I used for development all the time, with gemma4 on a 4090. It solves small tasks quite impressively, but I need some kind of agent. So I tried to connect the 31B to Cline and to Aider and it didn't reall…
Show HN: How to Use Google's Extreme AI Compression with Ollama and Llama.cpp (news.ycombinator.com via hn)
Book Translator: Two-pass local translation with self-reflection via Ollama (github.com via hn) Translate long-form text files through a local Ollama-powered desktop and web app. Book Translator provides a two-stage workflow for translating books and large documents: first it generates a draft translation, then it run…
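The two-stage workflow is the part worth unpacking: a draft pass, then a self-reflection pass over that draft. A minimal sketch of the loop with the ollama Python client, assuming a placeholder model and prompts; the project's actual prompts aren't shown in the snippet.

```python
# Sketch: two-pass translation with self-reflection, per the described workflow.
import ollama

MODEL = "llama3"  # assumption: any locally pulled model

def translate_two_pass(text: str, target_lang: str = "English") -> str:
    # Pass 1: produce a draft translation.
    draft = ollama.chat(model=MODEL, messages=[
        {"role": "user",
         "content": f"Translate the following text into {target_lang}:\n\n{text}"},
    ])["message"]["content"]

    # Pass 2: have the model critique its own draft and emit a revision.
    revised = ollama.chat(model=MODEL, messages=[
        {"role": "user",
         "content": (f"Original:\n{text}\n\nDraft translation:\n{draft}\n\n"
                     "Note any mistranslations or awkward phrasing, then output "
                     f"only an improved {target_lang} translation.")},
    ])["message"]["content"]
    return revised
```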
I built a local-first MCP server that gives Claude Code persistent memory, a knowledge graph, and a consent framework — and Claude is just the first client (www.reddit.com via reddit) I've been building this for a couple of years. It started as "what if my AI assistant actually remembered things," and it became something bigger.
Ollama with Claude models and safety (www.reddit.com via reddit) Hi all, I've been using Claude now every day for a while. Some coding, firmware tweaks, help with complex github instructions or complicated tasks.
Best Ollama models/settings for an 8GB VPS (CPU only, ARM)? Running into memory & looping issues. (www.reddit.com via reddit) Hi everyone, I'm trying to run a local LLM via Ollama on a Hetzner cax21 VPS (ARM64, 4 vCPUs, 8GB RAM, 80GB SSD). I have Ollama running successfully via Coolify.
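On a box like this the usual levers are Ollama's per-request options: shrink the context window to cap the KV cache, match threads to the vCPU count, and raise the repeat penalty to fight looping. A sketch with illustrative values; these are starting points, not tested settings for the cax21.

```python
# Sketch: constrained-CPU settings passed as Ollama request options.
import ollama

response = ollama.generate(
    model="llama3.2:3b",        # assumption: a ~3B model that fits in 8 GB
    prompt="Summarize the following log output: ...",
    options={
        "num_ctx": 2048,        # smaller context window, smaller KV cache
        "num_thread": 4,        # match the VPS's 4 vCPUs
        "repeat_penalty": 1.2,  # discourage the looping described above
        "temperature": 0.7,
    },
)
print(response["response"])
```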
Recommendations for a tiered local AI setup? (5090 + Mini PC + Obsidian) (www.reddit.com via reddit)
Any setup improvements/recommendations? (www.reddit.com via reddit)
Need advice running a multi-agent LLM pipeline on Kaggle/Colab with a local-model constraint (www.reddit.com via reddit)
Show HN: Scryptian – A lightweight, local AI bar (Python and Ollama) (github.com via hn)
Which AI model is best for real data analysis? [benchmark] (www.reddit.com via reddit)
AI Code Reviews for GitLab – custom agents – powered by Ollama (chromewebstore.google.com via hn) ThinkReview: AI Copilot & AI Code Reviews for GitLab, GitHub, Bitbucket, and Azure DevOps PRs, with Ollama support. 🌟 Now open source! View the code on GitHub: https://github.c…
What Am I Doing Wrong? Models Won't Listen, At All (GLM 5.1, MiniMax M2.7, Kimi K2.5) (www.reddit.com via reddit) What am I doing wrong here? I can't get models to follow my instructions, pretty much at all.
Build a Sovereign Local AI Stack: Ollama, Open WebUI, and Pgvector (2026) (news.ycombinator.com via hn)
MINISFORUM AI X1 Pro-370 (96GB) – Local Ollama Help (www.reddit.com via reddit) Hey all. This just got delivered yesterday.
gemma4 e4b on RTX 5070 Ti laptop (12GB) running slow at 5 t/s with llama.cpp (www.reddit.com via reddit) I sincerely hope someone can help me because I have tried everything I can, and I get this speed using llama.cpp and opencode. I have put in as much detail as I can about my setup and how I am running it.
I bought an 'AI-ready' NUC with an Intel Arc GPU. Ollama couldn't see it. Two days later, I had to build it from source. (www.reddit.com via reddit) Got an ASUS NUC15 specifically for running Qwen locally on the Arc GPU. The marketing promised AI-ready performance.
TinyGPU on Apple Silicon + RTX 5070 Ti: my real Qwen benchmarks vs Ollama/Metal (www.reddit.com via reddit) I spent time setting up TinyGPU on an Apple Silicon Mac and comparing it against Ollama already installed locally. Short version: TinyGPU does work.
Is there a simple front end for LM Studio or Ollama that allows for easier integration & capability expansion? (www.reddit.com via reddit) Hey, so I'm pretty new to local model hosting and have been messing with it a bit. I'm not a SWE but am reasonably technical.
Running a full agentic coding loop locally on a 3090. Here's what actually works in 2026. (www.reddit.com via reddit) After months of testing, I finally have a local setup that doesn't make me want to go back to the API. Hardware: RTX 3090 (24GB VRAM) Models tested: Qwen2.5-Coder 32B Q4_K_M, DeepSeek-Coder-V3 Q4, Llama 3.3 70B Q3_K_M Inference: llama.cpp…
Tired of re-explaining my codebase to Claude every session, so I built a memory layer for it (www.reddit.com via reddit) Every new Claude Code session I'd end up re-explaining the architecture, re-debugging the same weird errors, re-teaching the same patterns. After the tenth time I snapped and started building something.
Running on CPU :( (www.reddit.com via reddit) I am in the midst of a POC project at work and all I have is 4 AMD Epyc cores, and those are essentially virtualized. Does anyone have any tricks?
Need practical local LLM advice: Only having a 4GB RAM box from 2016 (www.reddit.com via reddit) Sorry, not a tech person. I'm trying to figure out the most practical local LLM setup using my spare machine: 4 GB RAM, no GPU for now, so please assume CPU-first unless I mention otherwise.
One-click LM Studio → Ollama model linker (www.reddit.com via reddit)
I have a MacBook Air M5 Base and I want to run an agentic coding program, similar to Claude Code or Codex. Besides the model, how do I do it? I've already tried with Ollama, VS Code, and Opencode, and haven't been able to. (I'm not a developer, sorry) (www.reddit.com via reddit)
Looking for a reliable browser-use agent that handles most daily tasks. (www.reddit.com via reddit)
How are you feeding personal context to your local models? (www.reddit.com via reddit) I've been running Mistral/Llama locally through Ollama for a while now and the thing that keeps bugging me is context. The model itself is fine for general stuff, but the second I want it to know about my projects, my notes, or files it doe…
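The crudest answer to that last question is to inject the context yourself: read the note files and prepend them to the prompt. A minimal sketch, assuming a small folder of Markdown notes and a placeholder model; past a few files you'd switch to retrieval with embeddings rather than stuffing everything into context.

```python
# Sketch: naive personal-context injection for a local Ollama model.
from pathlib import Path
import ollama

def ask_with_notes(question: str, notes_dir: str = "~/notes") -> str:
    # Concatenate every Markdown note (fine for small vaults only).
    notes = "\n\n".join(
        p.read_text(encoding="utf-8")
        for p in sorted(Path(notes_dir).expanduser().glob("*.md"))
    )
    response = ollama.chat(model="mistral", messages=[
        {"role": "system",
         "content": f"Answer using the user's personal notes:\n\n{notes}"},
        {"role": "user", "content": question},
    ])
    return response["message"]["content"]

print(ask_with_notes("What was I planning for the garden project?"))
```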
Mac Studio performance suggestion for MiniMax (www.reddit.com via reddit)
Optimizing a WSL2-based Local AI Orchestration for Product Viz | RTX 3090 24GB VRAM & i7-14700KF (www.reddit.com via reddit)
Vibecoded a small web app to turn my life into a game (www.reddit.com via reddit)
Made a browser agent extension, would love for people to try it out (www.reddit.com via reddit)