Our agent found a bug with WireGuard in Google Kubernetes Engine (lovable.dev via hn)
How Lovable's infrastructure team tracked down sporadic networking errors in Kubernetes, from crashing anetd pods to MTU mismatches, using AI-assisted debugging and deep packet inspection.
96.8% of MCP tool descriptions don't warn the agent about destructive behaviour (policylayer.com via hn)
The State of MCP Security: what 1,787 MCP servers can actually do to your systems. We classified every tool on every Model Context Protocol server we could enumerate from the public registries: 25,329 tools across 1,787 working servers.
- Grok 4.3 is out in the API (www.reddit.com)
- Grok hallucinations (www.reddit.com)
- Grok (www.reddit.com)
- Where is Grok-2 Mini and Grok-3 (mini)? (www.reddit.com)
- Grok 4.3 Beta (grok.com via hn)
Bouncy – A small Rust web scraper with built-in MCP support (github.com via hn)
Tiny Rust headless browser for scraping. bouncy is a web scraper.
Validating a startup idea: automatic agent harness optimisation (www.reddit.com)
I’m validating a startup idea around agent *harness* optimisation. The idea is to take a task plus the resources available to an agent, and automatically find the best surrounding setup (the *harness*) for that task.
Claude is hilariously petty (www.reddit.com)
Hey everyone, If you’ve been building with AI agents, you know that orchestrating text is one thing, but stepping into multimodal workflows (Text + Image + Vision) is incredibly messy. If you want an agent to act as a "Prompt Engineer," pa…
I've been experimenting with creating an AGENTS.md file for my React Native/Expo project. It's basically a structured document that tells Claude Code (and Cursor) about your project's: - Folder structure and file naming conventions - Theme…
61 items
model roundup
Sonnet 4.6
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 45m 1M context beta retired yesterday on Sonnet 4.5 / 4. Here's the actual fix if you missed it.
- 7h Claude Code usage spike from long-context cache writes?
- 17h A medicine student with no coding experience tried to create a studying agent: Felicity.
- 21h Can't replicate Reddit numbers with Qwen 27B on a 3090TI.
- 1d Why Claude is not consistent?
137 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
- 50m Cowork can't even get my Notion tasks - Can anyone help?
- 5h How would you build this?
- 7h I Gave Claude Cowork an Obsidian Second Brain. Here Is What It Remembered After 11 Sessions
- 9h Can I use claude interface with locally hosted models which support OpenAI API compatibility?
- 13h Update: Where do Skills Live? I'm so confused. (I'm no longer confused)
I wanted to share my experience using Claude as a coding partner/mentor for the past month. I'm not a complete beginner, but I'd never built a production app with payments, auth, and real users.
When Dawkins met Claude – Could this AI be conscious? (unherd.com via hn)
By Richard Dawkins, Apr 30 2026. The Turing Test is shorthand for a 1950 thought experiment that the great mathematician, logician, computer-pioneer, and cryptographer Alan Turing (1912-1954) called the “Imitation Game”. He pro…
Not Claude searching for its own system prompt leak (www.reddit.com)
OpenAI caught making sockpuppet accounts to attack its critics (www.reddit.com)
link to source - https://www.modelrepublic.org/articles/is-this-openai%E2%80%99s-anonymous-twitter-sockpuppet-account
What differentiates agents that ship real work from ones that don't (www.reddit.com)
Sharing some thoughts on AI agents. Right now, one axis differentiates them: are you inside the agentic loop or outside it? Inside works.
Hey everyone, I'm not a senior engineer. I'm just a guy who got obsessed with what you can actually do when you stop using one AI at a time and start running a small team of them.
What is OpenClaw, and why is it so enjoyable to use in real workflows? In this video, I break down what OpenClaw is, how it works, and why it stands out as a practical tool for people who want a fast, flexible, and hands-on way to build AI…
Help me choose between Claude, ChatGPT, Marketing AI (www.reddit.com)
I’ve been using an AI marketing tool (~$39/month) for social media posts, carousels, and website generation. The website output is solid, but the reels aren’t good enough to rely on.
68 items
model roundup
DeepSeek 4
DeepSeek-V4-Pro is a 1.6T parameter Mixture-of-Experts model supporting one million-token context, with significant improvements in efficiency and stability through hybrid attention and manifold-constrained hyper-connections. Community highlights include its cost-effectiveness via the official API and exceptional performance in large code change evaluations, with some noting its surprisingly robust output capability despite a 384K max token limit.
33 items
event
Hallucination
Claude Opus 4.6, Anthropic's flagship model, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, highlighting a significant regression in handling certain tasks. Meanwhile, biologists are revisiting cases of mushroom-induced hallucinations in China, suggesting ongoing research into natural causes of similar phenomena.
- 1h Grok 4.3 achieves higher overall intelligence over 4.20 with less of a cost, at the price of slightly higher hallucination rate.
- 22h Reasoning models hallucinate tool calls more, not less. There's a paper.
- 1d Improve claude code on Opus 4.7
- 2d Ran my own benchmark Qwen 3.6 35B vs Gemma 4 26B.... theres a clear winner here
- 2d Open Source Knowledge Graph With Versioning
Show HN: Git Shield – local hooks for secrets and PII (news.ycombinator.com)
I made this after worrying that AI coding sessions, copied logs, or quick test fixtures could leak real data into a repo. Git Shield installs pre-commit/pre-push hooks.
I rewrote my multi-agent AI system from TypeScript to Rust (www.reddit.com)
I’ve been building a small multi-agent AI system called TigrimOS. The basic idea is to let multiple AI agents work together in a workflow, instead of having one assistant do everything.
Claude primary desktop (www.reddit.com)
I run Claude Desktop on 2 Windows machines (laptops) but I want only one of those to be reachable via Dispatch. I had done Dispatch work earlier with the 'correct' laptop, but yesterday all of a sudden the mobile app said the desktop was o…
I created a full UX system built around proven UX laws and rules. It forces AI to think about all the things top apps like Apple, Stripe, Linear, Notion and Figma implement, and what makes them convert - like cognitive load, Hick’s Law, Fi…
- I created a UX / Design System for AI tools like Claude & Codex. (www.reddit.com)
Is AGI the End For Local LLMs? (www.reddit.com)
If leading AI companies are after AGI and the whole chatbot/agentic AI is just a phase for them to get to the end goal, then what does that mean for local LLMs? I would like to believe local LLMs are the future, but if AGI is achieved, do…
Claude Code Source Code Breakdown (kuber.studio via hn)
Earlier today (March 31st, 2026) - Chaofan Shou on X discovered something that Anthropic probably didn’t want the world to see: the entire source code of Claude Code, Anthropic’s official AI coding CLI, was sitting in plain sight on the np…
- Claude code (www.reddit.com)
Local query autocomplete with "classical" ML, no LLM needed (www.reddit.com)
Hey guys! I know this is not fully LLM related (it's still local though :D), mods feel free to delete this if you think it's off topic, but I just wanted to share something I experimented with, local autocomplete without the use of LLMs or f…
Lightweight OpenCode profile for routine dev work with focused agents (github.com via hn)
Supersimple is a lightweight OpenCode profile for routine software work. It keeps a small core agent set, uses orchestrator as the default entry point, and adds a focused set of local skills and commands for planning, implement…
Claude explains How Claudes Are Made (www.reddit.com)
src - u/anthrupad
Google DeepMind Researchers Map Out Ways Hackers Hijack AI Agents (sumsub.com via reddit)
Apr 03, 2026. Google DeepMind researchers have released a paper detailing how autonomous AI agents can be hijacked.