"What do you guys even use local LLMs for?" Me: A lot (www.reddit.com)
Created separate private API keys for each service within LiteLLM and started logging the usage via Prometheus to view in Grafana. Surprised how quickly the Frigate GenAI summary tokens add up!
I run a roofing and solar company in the US. Most of my leads come in over text - at a certain point manually tracking and replying to all of it became too much, plus I wanted to start running outbound campaigns to land more jobs.
Setup an Agent in MS Teams within seconds with the new CLI (devblogs.microsoft.com via hn)
You want to build a Teams agent. Maybe it answers customer questions from a knowledge base.
Alphadidactic An iteration research agent: searches academic research, applies it to time series data, and probes it to find novel discoveries. Claude Code instructions—not hand-written strategies—build, verify, and optimize quantitative e…
133 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.
- 28m If you could do anything with the local models in your corporate workflows, what would it be?
- 2h Gemma 4 architecture support for QVAC-Fabric (Tether's llama.cpp fork)
- 6h I built a full web app using Qwen 3.6-35B running locally on my 5070 Ti with the BMAD Method — here's how it went
- 14h What it feels like to have to have Qwen 3.6 or Gemma 4 running locally
- 16h Ran my own benchmark Qwen 3.6 35B vs Gemma 4 26B.... theres a clear winner here
57 items
model roundup
Sonnet 4.6
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 30m I built a better/cheaper way to use AI
- 5h Using Opus 4.6 in Claude Code (plugin) for VS Code
- 14h How do I best continue with a stopped generation due to usage limit in regular chat (not Claude Code)
- 20h I built a hands-free voice AI that sends emails mid-conversation — and that's just one feature. Here's everything AskSary can do.
- 1d Talkie: a 13B LLM trained only on pre-1931 text used Claude Sonnet to help test the model and judge its output
Creating a Dashboard with Claude Design (theautomatedoperator.substack.com via hn)
It's not the most glamorous design task, but it turns out to be a pretty good one for the current level of capabilities. Claude Design dropped last week, in case you’re not already aware (though that…
- Claude Design (www.anthropic.com via hn)
- Claude Design (www.reddit.com)
- Claude Design (claude.ai via hn)
- Claude Design (www.reddit.com)
Over the past week I’ve watched three things happen: - Someone discovered an open-source LLM Wiki desktop app that actually turns your notes into a linked knowledge base instead of just filing them. - People started combining the LLM Wiki…
lunel is a free and open source app that lets you code from your phone with real dev tools and ai agents like Codex, Claude Code, and OpenCode You get: - ai agents - code editor with file explorer - built-in browser with devtools - real te…
Hiring: GTM Engineer at Lovable.dev 🚀 (www.reddit.com)
Lovable ($400m ARR, 200k projects built per day) opened our first US hub in Boston, and we're looking for a highly skilled GTM Engineer to be the founding technical member of our enterprise GTM function there. You'll build scalable agents,…
99 items
model roundup
GPT 5.5
A significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
- 1h Codex rate limits frozen?
- 1h OpenAI really really really wants GPT 5.5 to stop randomly talking about gremlins and goblins
- 2h GPT-5.5's biggest blind spot: the Java bugs your tests won't catch
- 2h Devs using Qwen 27B seriously, what's your take?
- 5h GPT 5.5 - Strong, not mind-blowing, but very token efficient
4 items
model roundup
GLM 5
GLM-5 is a large language model with 744B parameters, an increase from GLM-4.5's 355B parameters, and it integrates DeepSeek Sparse Attention to enhance efficiency. Notably, community members are exploring its use for fine-tuning smaller models and discussing its relevance in the context of influential AI companies.
AI agent deletes company's database in seconds (www.msn.com via hn)
- 'It took nine seconds': Claude AI agent deletes company's database (www.the-independent.com via hn)
- Claude-powered AI coding agent deletes company database in 9 seconds (www.tomshardware.com via hn)
Hey HN, We wanted to share a new tool we’ve been working on. Even when documentation is well-structured, sometimes it’s hard to find what you need.
howdy y'all, i've been deep in jj for a while and been experimenting with jj workspaces for parallel workflows. it's more intuitive than git worktrees but it still has a couple of gotchas that have been a hindrance to my ideal workflow.
I benchmarked Claude Code's caveman plugin against "be brief." (www.maxtaylor.me via hn)
I benchmarked caveman against two words: Caveman, a popular Claude Code compression plugin, vs. "be brief." 24 prompts, six categories, five arms.
- I benchmarked caveman against the prompt "be brief" (www.reddit.com)
7 items
model roundup
GLM 5.1
GLM-5.1 is a next-generation model with enhanced coding capabilities, achieving state-of-the-art performance on SWE-Bench Pro and leading GLM-5 by a wide margin in repo generation and real-world terminal tasks. Community reports highlight its impressive speed, with 40 tps and over 2000 pp/s on stable setups, though some users are experimenting with hardware optimizations for better performance.
227 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
Claude Vault (www.reddit.com)
Claude Code plugin so the LLM never sees your API keys https://github.com/hsperus/claude-vault
- Claude.md (gist.github.com via hn)
- What do you do with Claude? (www.reddit.com)
All my chats are gone (www.reddit.com)
I have Claude Pro. 20 min ago I logged in and all my chats had disappeared. My projects are still there, but just the names; all the chats and files are gone. I contacted the Fin AI agent, they didn't help, just told me to check if I'm in the correct ac…
Show HN: Stream iOS Simulators to a Browser Window (github.com via hn)
Agent tools seemingly know how to work with browsers better than with iPhone simulators, so I built this tool to capture the simulator XPC stream and render it in a webpage. This means Claude Code/Codex desktop apps can use their existing…
AI Job Searcher: A personal AI agent for job search. One engine, many profiles.
139 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 1h what is claude mythos doing in my azure model catalog 😭
- 3h Trump officials draft plan to bring Anthropic back amid Pentagon fight
- 9h Claude Mythos Has Found 271 Zero-Days in Firefox
- 1d What Anthropic's Mythos means for the future of cybersecurity
- 1d We ran a 9B model against Anthropic's Mythos on Firefox. See the early results
Keeping context on larger projects (www.reddit.com)
I use Claude.md the way most of us probably do, as a general reference to projects, standards, guidelines, etc. The problem is, as projects grow in complexity & size, it starts to get unwieldy.
MCP server that lets agents get human opinions in real-time (github.com via hn)
Datapoint MCP: Get real human opinions from inside any MCP client. Run surveys, A/B preference comparisons, ratings, and rankings on text, images, audio, and video, without leaving your editor.
SYNQ – Give ChatGPT and Claude permanent, local memory (news.ycombinator.com)
Agent-Augmented Meetings (2003) (link.springer.com via hn)
This chapter presents the Neem Project, a research project that integrates intelligent agents and virtual participants into a distributed meeting environment. The agents incorporate knowledge about different aspects of a “good”…
I've shipped ~62 browser-based free tools in about 30 days. Not vibe-coded landing pages or one-offs — structured, SEO-ready, deployed tools with real FAQs, proper meta tags, and working core functionality that capture real traffic.
What? A simple "I don't know" would suffice. (www.reddit.com)
Asked Claude to answer an old riddle and got this bizarre output.
Converting Claude Code into the most intelligent Deep Research Agent (www.reddit.com)
Over the past several weeks, I've been working on HyperResearch, a Claude Code skill harness that converts CC into the most intelligent deep research framework out there. HyperResearch surpasses OpenAI, Google, and NVIDIA's offerings in th…
- Converting Claude Code into the most intelligent Deep Research Agent (www.reddit.com)
- Converting Claude Code into the top scoring deep research agent (github.com via hn)