Agentic Code Must Be Human Auditable (dockyard.com via hn)
I have been AI-pilled for over a year at this point. It's pathetic, I rarely touch-code any more.
Would Claude Fable's shadownerfing make an anticompetitive class action case? (news.ycombinator.com)
Show HN: A 150M model that extracts verbatim evidence spans for RAG, no LLM call (huggingface.co via hn)
Verbatim-RAG Extractor Chill, I Ground! 🌶️ Model Name: verbatim-rag-modern-bert-v2 Organization: KRLabsOrg Github: https://github.com/KRLabsOrg/verbatim-rag Overview The Verbatim-RAG Extractor is a query-conditioned token classifier that h…
Anthropic's Fable 5 Is Opus on a Good Day (www.williamangel.net via hn)
Anthropic's Fable 5 is Opus on a Good Day Published 2026-06-10 So far a couple of hours with Anthropic's new Fable model in Claude Code feels like it consistently does what Opus does on a good day. Claude Code with Fable more consistently…
What a Regex Can't Do Catching a wasted tool call is a hash set's job. A governor that learns your agent, calibrates, and trades harm against cost in one currency is not — and that gap is the whole point.
Show HN: Private Wealth Tracker (apps.apple.com via hn)
Your data stays on your phone. No bank sync track, no AI advice.
Apache Burr: Build reliable AI agents and applications (burr.apache.org via hn)
New Anthropic privacy policy: age/identity verification for consumer accounts (www.anthropic.com via hn)
Privacy Policy This policy was published on June 8, 2026 with an effective date of July 8, 2026. Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems.
332 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
105 items
event
Mistral
Mistral, a French AI company, is set to release a medium-sized model with 128 billion parameters and is planning to launch Workflows in public preview. The company, founded by Arthur Mensch, continues to grow its AI empire despite not being based in the United States.
- 20m Built a minimalist coding agent optimized for memory footprint and speed
- 3d LLM delegation - probing task handoff efficiency and economics
- 4d Self-hosted LLMs
- 6d Show HN: Free AI agent audit for Shopify catalogs (1.2M open captures)
- 11d Mistral says Europe has two years to build its own AI infrastructure
Show HN: Learn while you wait for your agents to code (github.com via hn)
Hi HN, While waiting for Claude Code to finish running, it's very tempting to start another task or browse the internet. This is what happened to me so I built Foyer to try to learn about what the agents are working on instead of losing fo…
As autonomous systems evolve (we see what AI agents are doing now), we open-sourced TKeeper, which allows you to build guardrails around their actions using typed intents, policy checks, and cryptographic proofs. It allows you to restrict…
Built a macOS debugger for AI agents — visualize LLM call chains in real time (www.reddit.com via reddit)
Spent too much time parsing logs to understand where my agent failed. Built Tether: a local proxy that captures every LLM call and renders it as a live node graph.
MarkSentry – zero-trust document-to-Markdown for RAG pipelines (sunilgentyala.github.io via hn)
Path traversal jailing, SSRF blocking, VBA macro stripping, zip-bomb detection, multi-column PDF, and PII redaction. Everything MarkItDown skips.
Testing MiniMax M3 on refactoring, screenshot debugging, music recommendations (andlukyane.com via hn)
A hands-on look at MiniMax M3 through Claude Code — what its new MiniMax Sparse Attention (MSA) is and how it differs from the lightning-attention and full-attention designs of earlier MiniMax models, plus three real tasks: auditing and re…
Claude’s em dash addiction routinely breaks PowerShell scripts (www.reddit.com via reddit)
This is more of a vent, I know I can fix this with better prompts. But why Claude?
Killed by GPT (killedbygpt.com via hn)
KilledByGPT Which will likely die first? 8,609 votes Submit product Vote Leaderboard Mantis VC Venture capital fund from The Chainsmokers and partners investing in early-stage technology startups.
- GPT-4 (openai.com)
AI Voice Agent Architecture: How Real-Time Conversational Systems Work (www.faridfadaie.com via hn)
I built the same production voice agent three times. Same requirements, same telephony stack, same speech models available — three fundamentally different architectures.
73 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, including sizes up to 31B parameters and featuring Dense and Mixture-of-Experts architectures. Notable community highlights include the release of Gemma 4 12B as an encoder-free unified model for laptops, its availability via llama-server on an RTX 5070 Ti GPU, and detailed visual guides showcasing its capabilities.
- 55m Gemma-4-31B at 256K context on a $1,400 AMD GPU – measured, with patches
- 8h I installed: HONCHO local hosted no docker (TUTORIAL)
- 10h Anyone gotten Gemma 4 12B (unified audio) to actually attend to speech with a large system prompt?
- 14h I wired up Agentic Coding with Code Context Graphs, results are interesting
- 14h I'm brand new to running LLMs and the sheer number of tools is overwhelming
39 items
event
GPT-4
Recent developments in AI automation include a sales team entirely run by bots achieving $28k MRR, and new tools like Arc Gate blocking prompt injection before it reaches GPT-4. Meanwhile, users are exploring workflows to reduce cross-checking time and improve insights from large language models.
Global watchdog calls for tighter controls on agentic AI in finance (www.reuters.com via hn)
paywalled
Show HN: AI watched my screen for a year. Weather beat sleep (donethat.ai via hn)
Made an open source Claude desktop app but for any harness (www.reddit.com via reddit)
Repo: github.com/proliferate-ai/proliferate, can download at proliferate.com I kept bouncing between Codex, OpenCode and Claude Code, but got tired of switching harnesses every few days, so I built a desktop app that runs all of them in on…
Show HN: Papermill Press – An AI-friendly markup language for PDF generation (news.ycombinator.com)
AgentCarousel Write tests for your AI agent. Run them in CI.
Diffusion Gemma 26B MOE (www.reddit.com via reddit)
Pretty exciting, wonder what it will take from llama.cpp to get it working locally
Show HN: Extend UI – open-source UI kit for modern document apps (www.extend.ai via hn)
Anthropic support does not exist (mg0x7be.github.io via hn)
Ask HN: Should the term "cognitive surrender" apply to writers who publish slop? (news.ycombinator.com)
The Classifier flagged Anthropic's own documentation (www.reddit.com via reddit)
https://preview.redd.it/9gxb0u2xwg6h1.png?width=1833&format=png&auto=webp&s=0b1508dbc4b065e1f948c43b4457f581ab13b102 https://preview.redd.it/o9kdg697zg6h1.png?width=1029&format=png&auto=webp&s=f8ec410bab1d9689b12b36d8e66bbb602f09f695 lmao…