Activity
We constantly build and release new, bleeding-edge versions of Haiku for testing purposes. You can download and install these versions to check out the latest features and bug fixes.
Launch HN: Ardent (YC P26) – Postgres sandboxes in seconds with zero migration (www.tryardent.com via hn)
Hey HN! We’re Vikram and Evan from Ardent (https://tryardent.com).
Claude getting tired? (www.reddit.com)
Last night, after a full day of running Claude Code, it came back and said “That’s a good place to leave it for this evening, shall we pick up <the next task> in the morning?” Or words to that effect.
- Claude getting stupider? (www.reddit.com)
- How are folks getting the most out of Claude (www.reddit.com)
Show HN: AgentDeck – a game console for AI agent research (github.com via hn)
AgentDeck 🎮 The game console for AI agents. A research platform for analyzing AI agent behavior through game scenarios.
73 items
event
Mistral
Mistral, a French AI company, is set to release a medium-sized model with 128 billion parameters and is planning to launch Workflows in public preview. The company, founded by Arthur Mensch, continues to grow its AI empire despite not being based in the United States.
- 5m Open Source Managed Agents
- 4h Show HN: Torrix, self hosted, LLM Observability (no Postgres, no Redis)
- 17h Mini Shai-Hulud Is Back: NPM Worm Hits over 160 Packages, Including Mistral
- 1d Shai Hulud attack ships signed malicious TanStack, Mistral NPM packages
- 1d Mass NPM Supply Chain Attack Hits TanStack, Mistral AI, and 170 Packages
142 items
model roundup
Qwen 3.5
Qwen3.5-9B is a post-trained model with 9 billion parameters that integrates multimodal learning and efficient hybrid architecture for enhanced performance. Community highlights include speculative decoding on Apple Silicon boosting Qwen3.5-9B's throughput by 4.1x, and the model outperforming others in coding tasks while addressing overthinking issues through tool usage.
Archivists Turn to LLMs to Decipher Handwriting at Scale (spectrum.ieee.org via hn)
When I sat down with bell hooks’ personal journals at an archive at Berea College in Kentucky, I expected an intimate peek into her private thoughts, her voice before the editing. What I got instead was frustration.
How do agents see your website? (what-do-agents-see.runtype.app via hn)
Inspect any URL the way an AI agent fetches it — and see what a human gets at the same address. Each site treats agents differently — hover a chip to see what to look for, then click to inspect.
- Scan your website to see how ready it is for AI agents (isitagentready.com via hn)
Anthropic Releases Claude for Small Business (www.inc.com via reddit)
Anthropic's Newest Claude Feature Is Here to Help Small Business Owners With Their Pain Points
- Claude for Small Business (www.anthropic.com via hn)
- Anthropic releases Claude Opus 4.7 (platform.claude.com via hn)
What is the best ai engineering course right now for agentic ai (www.reddit.com)
Everywhere i look ppl are talking about agentic ai now… feels like basic gen ai stuff is already saturated. but trying to figure out how ppl are actually learning this beyond surface level… youtube kinda stops at demos.
37 items
event
Windsurf
Windsurf 2.0 has been released with improved local and cloud agent integration and bug fixes. The update follows a series of announcements about AI tools and MCP servers, including gondola.ai's hotel search server and Stork for indexing over 14,000 AI tools.
- 45m Audrey: Local-first memory guard for AI agents (source)
- 1d Show HN: YantrikDB – persistent memory for AI agents
- 2d Is there any trial 7/14 days currently on Cursor Pro?
- 2d Where I'm at with AI Assisted Building + Current and Future Workflow Overview
- 3d [RELEASE - Open source] Via - is the universal integration layer for AI tools.
241 items
event
Copilot
Microsoft is keeping its Copilot tool for Windows 11 but renaming it, while issues with rate limits and a security proxy have sparked concerns among users of GitHub Copilot. Meanwhile, Anthropic released a report on agentic coding trends, highlighting that developers use AI in about 60% of their work.
- 52m Pitfalls of Rolling Out Claude
- 1h Loop just raised $95M Series C, and the real story isn't the money. It's where SC AI capital is no longer flowing.
- 2h GitHub Copilot: Preparing for your move to usage-based billing
- 2h I tracked every dollar I spent on AI coding tools for 60 days and math is uglier than I thought but probably not in the way you'd guess.
- 4h RCE in VSCode Copilot Chat
https://wccftech.com/sipeed-crams-32gb-lpddr5-60-tops-npu-compact-risc-v-board-hits-15-tokens-s-ai-llms/
Most teams optimize the prompt. Agentic systems have more moving parts (www.aevyra.ai via hn)
On LinkedIn last week, an AI practitioner I know made an observation I keep thinking about: hill-climbing on evals tends to leak information specific to those evals rather than improve the system. Their follow-up question: "What if you hil…
I built a Claude Desktop extension (mcpb) that gives real-time spatial data (school quality, walkability, noise, etc.) from my MCP servers. It produces stunning results (attached).
Ratify Protocol™ A cryptographic trust protocol for human-to-agent and agent-to-agent interactions. When a human authorizes an AI agent — or when one agent transacts with another agent — Ratify produces a signed, verifiable proof that says…
188 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 56m DeepSeek and Grok hallucinated the same fictitious OpenBSD manpage quote
- 1h AI agent security is a small prayer the model says no. How are you routing models?
- 5h I tested how well Claude generated code handles security. Here's what I found in 48 real apps.
- 16h Hi-Vis: one-shot jailbreak disguised as LLM "software patch" reaching 100% ASR
- 20h Is there any risk to upgrading a plan for a month if they yank Code from Pro?
93 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 1h Run Agents Twice
- 2h Ask HN: What is better Opus 4.6 High or Opus 4.7 Medium?
- 9h Questions are my main gripe these days
- 15h Is Opus 4.7's attention degradation a training direction problem? Some observations from heavy use
- 23h Cursor + Opus 4.6 entered an infinite generation loop: 3,400 lines, 294 attempts to stop itself
So dang handy (www.reddit.com)
Claude created a simple pixel tool (in HTML) in about 15 minutes saving me hours. I needed nothing more
Thoughts on Claude Code 2.1.139 Agent View and Background Sessions (news.ycombinator.com)
Spent half a day trying Claude Code 2.1.139’s new Agent View and background sessions — useful, but still has quite a few rough edges. The first item in the 2.1.139 changelog released on 2026-05-11 was Added agent view (Research Preview).
- Agent View in Claude Code (claude.com via hn)
Show HN: Endpoint Context Protocol – Browsers get HTML, AI agents get Markdown (endpointcontext.io via hn)
Building with Claude Managed Agents – Sharp Edges (dipkumar.dev via hn)
A short look at Claude's newly released managed agents and the limited feature set that might catch you off guard.
351 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
- 1h The Opus 4.7 reasoning curve - Medium is the best default?
- 1h Opus 4.7 Low Vs Medium Vs High Vs Xhigh Vs Max: the Reasoning Curve on 29 Real Tasks from an Open Source Repo
- 3h Fast mode for Claude Opus 4.7 is now available in research preview
- 6h How to get Opus to be less pro-active?
- 8h issue with opus 4.7
Show HN: BossHogg: A PostHog CLI for Agents (github.com via hn)
BossHogg The agent-first PostHog CLI. PostHog power, right in your prompt.
Hi, this is the first time I’ve referenced a previous chat because the conversation got too long. How is your experience when referencing previous chats in Claude?
Google Unveils Googlebook, a New AI Laptop Built Around Gemini (www.macrumors.com via hn)
Google today announced a new series of Googlebook laptops that will be built with Gemini at the core. Googlebooks will run software built on a foundation that combines Android and ChromeOS.
Hi, I recently learned about headroom ai from another post. But then I saw Claude has been banning people for using 3rd party plugins/mcps.
Quick disclosure: I do marketing at TextExpander. The engineering team built this, I worked on it from the user side and made the walkthrough video.
AI is making it easy but also hard (news.ycombinator.com)
I was a software engineer for 20 years and always had many, many ideas for products - as we all do. Enter Claude Code, Kiro, Cursor, whatever.
Some Business Ideas (news.ycombinator.com)
Infrastructure side: 1. A service connector that can be attached to any AI Assistant to authenticate users and connect services easily, similar to MCP but more universal.