For context, this was a medium-length conversation about a camping trip I'm planning for the summer; I was using Claude to discuss how to pack smart. I gave it a prompt about in-tent stoves, and while it gave me a helpful response, the…
LLMs are not a higher level of abstraction (www.lelanthran.com via hn)
"A picture is worth 10K words - but only those to describe the picture. Hardly any sets of 10K words can be adequately described with pictures." -- Alan Perlis. I am seeing the claim everywhere online that LLMs are a higher level of abstra…
The Mushroom That Makes People Have the Exact Same Hallucination (www.vice.com via hn)
Biologist Colin Domnauer is reopening an old case that Chinese health officials seem to have stopped caring about. Every summer, residents of the Yunnan province check into hospitals with complaints that they’re hallucinating tiny elflike…
Open CoDesign: Open-source, local-first alternative to Claude Design and v0 (firethering.com via hn)
Open CoDesign is weird in a good way. You write a prompt.
Claude plugins are genuinely a game changer (www.reddit.com)
We almost auto-renewed a six-figure SaaS contract we wanted to exit last quarter. 3-month notice window buried in clause 12.4.
I’ve been experimenting with agent skills and wanted to share something I built: This repo is focused on iOS development using AI agents (Claude Code, Codex, etc.), but with a different approach than typical prompt-based workflows. Most AI…
8v: one CLI for you and your AI agent. Up to 66% fewer tokens. (www.reddit.com)
Hi, I built 8v. One binary. You run it.
- 8v: One CLI for you and your AI agent. Up to 66% fewer tokens (github.com via hn)
The authentication in Microsoft's agent governance toolkit never runs (www.flyingpenguin.com via hn)
You install a security system. The app on your phone shows everything armed.
67 items
model roundup
GPT 5.5
On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
206 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
Show HN: AI agents that learn from their own failures each week (www.deployinfra.ai via hn)
An AI employee that gets smarter every week. Answers on chat, phone, and WhatsApp 24/7, then learns from every conversation, finds what it couldn't answer, and keeps improving. Live in 90 seconds.
Is there already an open-source app for centralized LLM chats? (www.reddit.com)
Hello! I’m a software developer thinking about how to keep all my LLM conversations in one app instead of having them scattered across ChatGPT, Claude, Gemini, etc.
Are Media Buyers Becoming Obsolete? (news.ycombinator.com)
AI agents can now handle research → creative → optimization. Are Media Buyers Becoming Obsolete?
Claude Code Uses GLM 4.7 (old.reddit.com via hn)
- Anthropic's Claude remote uses GLM-4.7 (www.reddit.com)
Agent Kombat (kau.sh via reddit)
Introducing "Agent Kombat" it takes one prompt or plan and turns it into a planning debate between Claude Code and Codex. Both agents produce independent plans first.
- what is an agent? (www.reddit.com)
Native Dialog popup failures (www.reddit.com)
I'm currently creating a couple of agentic workflows that include various cases of downloading files automatically on different UIs, but, since I'm using chrome MCP for navigation, whenever a "save as" dialog shows up, claude is unable to…
So i've been lurking here for a while reading all the "how do i monetize my agent" posts and figured i'd share what actually ended up working for me since i was in the exact same boat like 4 months ago. Background: I built a GPT wrapper…
I recently came across an open-source project called AnimoCerebro, and I thought it was worth discussing here because it’s trying to build something a bit different from the usual agent framework. The core idea is not just “LLM + tools + l…
45 items
model roundup
Sonnet 4.6
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 1h Using MCP to stop wasting tokens on WP translations
- 2h Does Claude have access to things pasted in the text box but not sent?
- 1d Opus 4.6 vs Sonnet 4.6
- 1d Claude's sonnet 4.6's clarifying questions...How to read?
- 1d Does effort tier change refusal behavior on agent-attack prompts? CVP run 4 with sonnet 4.6 high and max efforts.
121 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.
Agent Index Documenting Technical and Safety Features (arxiv.org via hn)
Agentic AI systems are increasingly capable of performing professional and personal tasks with limited human involvement. However, tracking these developments is difficult because the AI agent ecosystem is complex, rapidly evolving, and in…
Claude Code plugin for designing modular systems (github.com via hn)
Modularity Skills TL;DR: A Claude Code plugin for designing and analyzing modular software systems using the Balanced Coupling model. There's no shortage of AI tools that provide code-level feedback: best practices, edge cases, potential b…
Haiku has not caught up with the times (discuss.haiku-os.org via hn)
I’ve been spending some time improving the arm64 port of Haiku with the goal of some day running Haiku on my M1 MacBook Air. Here’s the current state of the port (in QEMU) as of hrev59575: The port is mostly stable and all of the usual…
Mar 31, 2024. After my latest post about how to build your own RAG and run it locally, today we’re taking it a step further by not only implementing the conversational abilities of large language models but also adding listen…
Study the 50 leaked LLM Interview Questions with my lil learning App (boguslavskyy.com via hn)
Gamified learning for AI/ML interviews. 5-step method: understand, quiz, vocabulary, write, speak.
There’s always a moment when I remember that the exchange is an equation (in a definitively physical sense) and the prompt has always equaled the response in the same way that the proportionality of mass in an atom has always expressed the…
The cost math behind routing Claude Code through Ollama (~90% cut) (github.com via hn)
Use Ollama to Enhance Claude: Two-Engine Setup. Pair Claude Desktop on Anthropic with Claude Code routed through Ollama in your terminal. Strategy stays on Pro.
Everything that went wrong with Claude (clawd.rip via hn)
Music Publishers Drag Claude Into Court: Universal, Concord, and ABKCO sued Anthropic, alleging Claude was trained on copyrighted lyrics and could reproduce lyrics from hundreds of songs. The 'constitutional AI' company got its first big co…
What would be the best OS to run LLMs? (www.reddit.com)
Hi there, I've ordered a mini PC with 128GB of RAM and the AMD AI Max 395. I intend to use it with Proxmox (like my actual machine), where I run Windows for some gaming and macOS for my music library server.