So I've been going down a rabbit hole lately and I can't find many people actually talking about this specific use case. Everyone here runs local LLMs for coding, chat, maybe some creative writing.
AionDB: PostgreSQL-compatible SQL, graph, and vector database in Rust (aiondb.xyz via hn)
PostgreSQL wire / ORM-compatible / SQL + graph + vector AionDB PostgreSQL tooling for applications that need relational records, graph relationships, and vector search in one Rust engine. MATCH (u:User {tenant_id: 100})-[:WROTE]->(d:Docume…
When I joined the Codex engineering team in September 2025, Codex for Windows didn’t have a sandbox implementation, meaning that Windows users were forced to choose between two subpar options when using OpenAI's coding agents: Approving nea…
Claude subscriptions no longer include Agent SDK and Claude -p usage (www.xda-developers.com via hn)
Summary
- Paid Claude plans get a dedicated monthly programmatic credit for Agent SDK, claude -p, and Claude Code GitHub Actions.
- Subscription interactive credits no longer cover programmatic use; programmatic calls draw from the new pool.
Best TTS in 2026: Blind Benchmark (techstackups.com via hn)
In 2026, the text-to-speech (TTS) market is saturated. Every provider is offering a new groundbreaking model trained on a zillion hours of natural human speech in 200 different languages.
Show HN: I built an open source dictation tool based on benchmarks (codictate.app via hn)
I've built an AI dictation tool (like many others), but this time I have taken the time to benchmark all of the 34 models that we provide, so users can actually make a qualified choice on what model should be the daily driver. The best is…
Economic Futures – Anthropic (www.anthropic.com via hn)
The Anthropic Economic Futures program aims to support research and policy development for addressing the economic impacts of AI. It provides research grants, forums for policy discussion, and evidence on real-world AI use.
Every time I hit an unfamiliar LLM term while building, I'd look it up and get either a textbook definition or a paper. Useful for understanding what something is, not useful for knowing what to do with it.
356 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
- 3m Claude Opus 4.7 just revealed its system prompt, without being asked for it
- 2h I tested GPT-5.5 Codex against Opus 4.7 Claude Code, and it's about time Anthropic bros take pricing seriously.
- 6h Is this math right? Agent SDK on Opus 4.7 vs the new monthly credit
- 11h Is Cowork a token burner?
- 12h I tested GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on financial-control
235 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
- 19m Max 20x Users Will Get $200 per Month in Claude Code -p Use
- 37m Privacy concerns - Claude on separate "clean" User profile?
- 3h Claude CoWork, AuDHD, Executive dysfunction, and my rage at the lack of a Linux Desktop Client
- 6h Claude Agent SDK billing changes June 15. What it means for marketing teams and what I am doing
- 14h BAA - HIPAA enablement
To enable real A2A, your agent's actions are your responsibility. (www.reddit.com)
Hi everyone! We're building a multiagent social app.
State media control shapes LLM behaviour by influencing training data (www.nature.com via hn)
How do I access the 1-million context models in Claude (Pro) (www.reddit.com)
Hi there! I'm new to all this, and I'm having some trouble accessing the 1-million context models in Claude (the VS Code extension).
Chinese text on claude.ai is displayed using Japanese fonts, causing characters like 门, 兴, 认 to show incorrect glyph variants (Japanese shinjitai instead of simplified Chinese). Root cause: The CSS variable --font-anthropic-serif puts Japa…
Can an AI agent run approval workflows without constant prompting? (www.reddit.com)
Our approvals live in Slack threads and people forget to respond. Procurement, hiring, and content all need sign-off, but tracking is manual.
Show HN: AGEF, an open evidence format for AI agent sessions (github.com via hn)
AGEF (Agent Governance Evidence Format) is an open specification for portable, tamper-evident AI-agent session evidence. It defines how a session can be represented as content-addressed objects plus merkle-linked events so evidence ca…
Is there any place that collects open source Claude projects? (www.reddit.com)
Recently I came across a post where people were sharing things they built with Claude, and honestly some of them were really cool. Small productivity apps, HTML tools, automations, work helpers, stuff like that.
Over reliance on Claude Code in business? (www.reddit.com)
I'm working on a project with another person who suggests we should use Claude for literally everything. Don't get me wrong - I see a ton of value, but at the same time, why not use Python and scripts that can do the same thing without burn…
- Claude code (www.reddit.com)
423 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
180 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.
- 39m The "the future is fictional" problem of many local LLMs
- 16h On my RTX 4060 8GB laptop, I can run Gemma 4 E4B Q6 K XL with mmproj at only 6GB of VRAM usage despite sources recommending Q4 K M for my hardware. What is going on?
- 17h LLMs on flagship smartphones?
- 20h very slow tok/s with Gemma 4 31B on a 5090?!
- 1d Does THINKING MODE significantly improve translation?
Bring any agent into a meeting – talk back, collaborate, screen share, code live (www.youtube.com via hn)
Attention Once Is All You Need: Stateful Transformers (arxiv.org via hn)
Conventional transformer inference engines are request-driven, paying an O(n) prefill cost on every query. In streaming workloads, where data arrives continuously and queries probe an ever-growing context, this cost is prohibitive.
Gloop – A Self-Modifying AI Agent and TS Library (gloop.codes via hn)
An AI agent that adapts itself to you. Tell it to change its UI, add tools, or behave differently — it edits its own code and reruns.
Bridge Launches Computer Agent Beta (twitter.com via hn)
What features are missing in current AI agent frameworks? (news.ycombinator.com)
What’s currently missing in AI agent frameworks? Examples:
* better memory systems
* workflow debugging
* human-in-loop controls
* distributed execution
* lower latency orchestration
Interested in hearing what developers actually need.
Hey everyone! Recently, I released a blog post on how to set up a cluster of Mac Minis for distributed training and inference. Now it's time to do the same with Raspberry Pis!
Homemade and Minimalist Agent Composer (en.andros.dev via hn)
The concept of orchestrating agents is gradually spreading among developers. Claude has planned, or you may already be able to use (depending on when you're reading this), the ability to launch severa…
Fastest small LLM at 1 KB context is the slowest at 1 MB (blog.0xmmo.co via hn)
Claude and I made 2,000 API calls to nine small closed-weight models across three providers in a range of prompt sizes between 100 and 1M tokens. We ended up discovering some interesting things about how providers scale inference, or fail…
I feel useless (www.reddit.com)
Recently I had a university database project to create an HR database and connect it to a website, and I had ZERO experience with how to do this, so I went to Claude. It built the whole database and then it helped me do all the backend and t…
- Why most AI projects feel useless (news.ycombinator.com)
The Holistic Cloud OS for AI Swarms. Unify secure WebAssembly execution, Governed Graph Memory, and dynamic tool discovery.