Anthropic just overtook OpenAI with $1T valuation (www.the-independent.com via hn)
Who owns the code Claude Code wrote? (legallayer.substack.com via hn)
AI-generated code copyright explained for builders.
Show HN: Photorealistic GPT Image 2 animal hybrids (www.emergentmind.com via hn)
Show HN: MindCheck – Analyze your AI coding logs for over-delegation (github.com via hn)
Hi HN, I built MindCheck after running into a problem in my own AI-assisted workflow. A couple months into using Codex heavily, I realized I had delegated too much of a data pipeline without really tracking the details.
I migrated 16 sites between Linode servers in 1 day with Claude Code (thekeesh.com via hn)
I’ve had an old Linode server hosting my websites since 2011, and was behind on upgrades and Ubuntu versions, making it harder and harder to upgrade over time. And I had 16+ sites hosted on it across static sites, WordPress sites, and…
LLM from pre-1930 derives quantum mechanics and relativity (michaelhla.com via hn)
Machina Mirabilis: An experiment to see if an LLM trained from scratch on text prior to 1900 can come up with quantum mechanics and relativity. While it fails at most physics-related tasks, the model shows glimpses of intuition.
-
208 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35-billion-parameter sparse MoE model with 3 billion active parameters, was released by Alibaba Qwen on April 16, 2026, as open source under the Apache 2.0 license. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
113 items
model roundup
Qwen 3.5
Qwen3.5-9B is a post-trained model with 9 billion parameters that integrates multimodal learning and an efficient hybrid architecture for enhanced performance. Community highlights include speculative decoding on Apple Silicon boosting Qwen3.5-9B's throughput by 4.1x, and the model outperforming others in coding tasks while addressing overthinking issues through tool usage.
- 1m Qwen3.6-27B IQ4_XS FULL VRAM with 110k context
- 3h AMD Radeon RX 6900 XT - ROCm vs Vulkan - Gemma 4 and Qwen 3.5 speed benchmarks
- 3h Qwen 35B-A3B as an always-on agentic loop on a 16GB Mac M4: disk became the bottleneck before RAM
- 5h RTX 5070 Ti (new) vs RTX 3090 / 3090 Ti (used) for LLM inference + clustering
- 14h Guys this is so fun!
Agent Amnesia and the Case of Henry Molaison (jumbocontext.com via hn)
The Problem: If you've worked with coding agents extensively, then you've probably noticed this pattern: at the start of each session, they act like you're meeting for the first time. Every session starts from scratch.
Why Your AI Agents Keep Breaking Your Workflows (keryxsolutions.substack.com via hn)
Your AI investment isn’t paying off the way you expected. You added agents to your workflows, and now your team spends more time debugging the AI than the AI saves them.
Reasoning model in voice agent? (www.reddit.com)
I’m building a voice agent on livekit and I’m ripping my hair out. The problem is that I either use a moderate sized LLM and it responds in real time or I use a big / reasoning model and there is a huge delay before it responds and it's su…
Do Agents Need Quickstart Guides? (techstackups.com via hn)
I integrated the same app with Infisical twice, once with a quickstart I'd written from experience and once without. The guided run cost half as much, took half as many context lines, and used the integration approach Infisical actually re…
Show HN: Implementing Patio11's "Dangerous Professional" as a Claude Code Plugin (playground.tetraresearch.io via hn)
Howdy HN! My recent dive into home ownership has brought me a whole new world to navigate w.r.t contractors, insurance claims, etc.
Web client for Hermes agent (www.reddit.com)
Hey everyone! A lot of us love the new Hermes Agent, but living entirely in the terminal isn't always ideal.
- Show HN: Web Client for Hermes Agent (github.com via hn)
-
69 items
event
Altman Attack
Sam Altman, CEO of OpenAI, has faced multiple attacks on his home in San Francisco, including firebombing and drive-by shootings, raising concerns for his safety. Additionally, a majority of over 100 people interviewed by Ronan Farrow described Altman as a "pathological liar."
92 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 32m Self-hosted red team workspace
- 2h I asked Agentic AI security tool to demonstrate its usefulness with use case examples
- 11h Watched my AI agent block a prompt injection that was hiding inside a webpage
- 18h Beware: FB links to fake Claude desktop downloads but Oauths to real Claude.ai
- 19h Show HN: RedSOC – 100% prompt injection success on AI SoC assistants
Show HN: iClaw is part OpenClaw, part Siri, powered by Apple Intelligence (barrasso.me via hn)
Hi HN, Last month at a SundAI hackathon, my team built a prototype for an app called iClaw. The goal was to develop an AI agent using Apple Intelligence.
Sage-Wiki: An LLM-compiled personal knowledge base (github.com via hn)
sage-wiki: An implementation of Andrej Karpathy's idea for an LLM-compiled personal knowledge base. Developed using Sage Framework.
OpenAI misses revenue, is the AI bubble bursting? (www.cnbc.com via hn)
Shares of companies tied to artificial intelligence infrastructure tumbled in early trading Tuesday after a report that OpenAI has fallen short of internal growth expectations, raising fresh questions about whether the pace of spending acr…
Agent Capsule: "Agents as Data" pattern for production AI agents (gist) (gist.github.com via hn)
Agent Capsule - A pattern for building production AI agents as document folders, with coding agents as runtimes: agents as documents, not agent code. This document presents the core concept, offering…
Yann LeCun: LLMs Are Nearing the End, but Better AI Is Coming (2025) (www.newsweek.com via hn)
Yann LeCun always reminds me of the very best of Bell Labs' scientists and engineers—a unique breed of individual, fiercely independent of thought and action, who thrive within company structures that typically value obedience and conforma…
Show HN: Knowerage – code coverage for LLM analysis (github.com via hn)
Hello HN! Like most developers, I have worked on migrating a large legacy codebase with no up-to-date documentation or unit tests, and found AI agents, like Claude LLM, to be very helpful for finding and describing specific functionalit…
-
126 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.
- 37m I ran Gemma 4 E2B with llama.cpp on a lot of different iPhones, here's the setup report
- 6h Most efficient way of running Gemma 4 E4B with multimodal capabilities on a laptop?
- 9h I'm done with using local LLMs for coding
- 22h AMD GPUs are faster at pre filling
- 22h How to run a local coding agent with Gemma 4 and Pi | Patrick Loeber
77 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 54m $38k AWS Bedrock bill caused by a simple prompt caching miss
- 7h Cursor & Claude deleted a company's entire database
- 14h How I get 100% accurate answers, and replaced Google with Claude
- 17h Found 48 Vulnerabilities in Open Source Projects During Live Testing with Claude Opus 4.6
- 18h Claude 4.6 Beats GPT-5.4, Grok & Gemini in a Strict Multi-Domain AI Test (2026)
A Primer on LLM Post-Training (pytorch.org via hn)
Large Language Models (LLMs) have revolutionized how we write and consume documents. In the past year or so, we have started to see them do a lot more than just rephrase docs: LLMs can now think before they act, they can plan, they can call…
Open Source AI Infrastructure (news.ycombinator.com)
Hey everyone — built Ombre, an open source AI infrastructure layer that works with any AI model. Eight agents run automatically: security, caching, memory, hallucination detection, tamper-proof audit trail.
Tell HN: Your ChatGPT account can be deactivated at any moment, losing your data (news.ycombinator.com)
I saw this viral thread on X going around about using Claude as a dietitian. Tried all 12 prompts directly.
Effective Context Engineering for AI Agents: A Developer's Guide (machinelearningmastery.com via hn)
In this article, you will learn what context engineering is and how to apply it systematically to keep AI agents reliable, cost-efficient, and accurate in production. Topics we will cover include: How to treat the context window as a const…
A maintenance agent: 412 fixed, 14 refused. The 14 are the point (adriacidre.com via hn)
I ran a maintenance agent for 10 days. The number that mattered was 14. I ran a maintenance agent loop against a real codebase for 10 days.