model roundup
Qwen 2.5
-
I have been coming to this subreddit to understand what the optimal config is to run a model on a given hardware setup. I referred to specific benchmarks, but they are too generic and do not consider the underlying hardware.
-
Hermes w/cloud LLM and w/local LLM does it work? (www.reddit.com)
I’ve tried openclaw locally for about a month. Hardware: M5 Pro w/48 gb ram.
-
Wanted to share a workflow I tested on a real flight, in case anyone else is trying to set up offline Claude Code. The core idea: using ollama to pull the needed model of what you need, and then use it to run claude code The setup, in orde…
-
Show HN: Charm – on-device spelling, grammar, and prediction for macOS (www.theodorehq.com via hn)
I've spent the last year building Charm, a native macOS menu bar app that corrects spelling, fixes grammar, and predicts your next word. Three features: - Spells: NSSpellChecker plus a local LLM for context-aware corrections (catches "defi…
-
Hey r/ClaudeAI, If you are using Claude Code or building terminal agents, you know the exact moment the context window starts degrading during long-running tasks. I wanted to build a persistent runtime layer to offload those heavy, multi-s…
-
Hey r/LocalLLaMA, Production ML compiler stack is brutal: TVM is 500K+ lines of C++. PyTorch piles Dynamo, Inductor, and Triton on top of each other.
-
spent today auditing my own model catalog and noticed 39 of my own pages confidently reference "qwen 3 72b" with apache 2.0 licensing, a 2025-09-15 release date, and a 131k context window. seemed normal — qwen 2.5 had a 72b, why wouldn't q…