Hi all, RipStop is a Node package that implements a set of rules consumers can use to protect their repos from the wilder actions of LLM agents. A consumer needs only a few lines of code to configure the rules they wish to apply.
Now Is the Perfect Time to Change Sudo (news.ycombinator.com)
The pain points are already here: LLM agents and the NPM ecosystem. The timing looks perfect.
Morning Everyone! Big one today (104 changes!): Claude Code just went async.
I think the most interesting AI use cases right now aren’t the flashy demos; they’re the weird internal AI employees people quietly build for their businesses. For example, I saw a Reddit post from an ecommerce operator who built what was bas…
Gemma 4 (model roundup, 166 items)
Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.
- 2m Gemma 4 E4B is great for short transcriptions
- 5h Will unsloth release MLX versions of the MTP qwen3.6 and gemma 4 models?
- 15h Gemma 4 running fully offline on WebGPU with Transformers.js, controlling Reachy Mini over WebSerial.
- 17h Terrible Vulkan pp/tg on Arrow Lake iGPUs
- 1d ExLlamaV3 Major Updates!
Security (event, 177 items)
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 3m Agents need a local bouncer before they run tools
- 37m Mass NPM Supply Chain Attack Hits TanStack, Mistral AI, and 170 Packages
- 1h I made an AI concierge for my wedding guests. The second most popular thing they did with it was try to jailbreak it.
- 2h OpenAI Launches Daybreak for AI-Powered Vulnerability Detection and Patch Validation
- 11h $392M in AI agent security funding at RSAC 2026 - the market just validated what we've been building
Might be a noob question. Suppose I get Claude Design (CD) to mockup.
Hey all, we're a three-person AI consultancy that just passed initial review for the Claude Partner Network. This was quite unexpected, but we're excited about it.
Microsoft researchers find AI models and agents can't handle long-running tasks (www.theregister.com via hn)
MOST POPULAR EVENTS - Securing the Untrusted Agentic Development Layer Join us to learn how to architect a development environment where your builders and their agents can move fast and securely. - Toxic Flows: When Your AI Agent Skill Bec…
Warning about Claude on financial advice (www.reddit.com)
I lost hundreds of dollars following Claude’s health insurance recommendations. I explained my situation completely, but Claude never asked what healthcare services I actually use before confidently recommending I buy insurance.
Sonnet 4.5 (model roundup, 13 items)
On May 4, 2026, multiple automated status updates reported elevated errors for Claude Opus 4.5 and Sonnet 4.5 around the same time, with Anthropic introducing a feature called E-STEER that applies emotion intervention to these models.
- 34m I may have uncovered the real reason they're sunsetting Sonnet 4.5. They could barely contain its true power
- 7h PSA: How to preserve your account's access to Sonnet 4.5 beyond June 15th
- 10h Does the sudden removal of Sonnet 4.5 violate Claude's Constitution?
- 23h Claude Code using extra usage despite my Pro plan being at 0%.
- 1d Which model and version do you prefer for programming?
Hallucination (event, 55 items)
Claude Opus 4.6, Anthropic's flagship model, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, highlighting a significant regression in handling certain tasks. Meanwhile, biologists are revisiting cases of mushroom-induced hallucinations in China, suggesting ongoing research into natural causes of similar phenomena.
- 1h LLM Hallucinations in the Wild
- 10h Counterfactual samples synthesizing for mitigating hallucination in LLMs
- 13h OpenAI Cooked This Week!
- 19h Why "Consensus" Is Failing AI: My Research into the Hallucination Tax
- 21h I’ve built a tool with Claude that reduces AI model hallucinations and answer error rates, allowing you to get far more accurate results when asking AI models questions.
Build, edit, and analyze forms directly in Claude (www.jotform.com via hn)
Jotform Claude App: Build Jotform forms directly in Claude using simple prompts. This integration connects your forms to Claude, allowing you to generate, edit, and manage them conversationally.
I am currently looking to get into automation for the German Mittelstand. I am now talking to an SME that got an offer from a consulting firm for document processing automations, and I am trying to figure out if the pricing is normal or inflate…
Show HN: An implementation of Common Lisp in development, reached version 1.6 (savannah.nongnu.org via hn)
It has reached version 1.6 and now covers more than 80% of the standard. alisp ships with ASDF and is capable of loading many real-world systems; let me know if your favorite system succeeds!
Agent FM: Ambient radio for AI coding agents on macOS. Agent FM turns every Claude Code and Codex session into a live radio station.
- Show HN: OpenClawdex – Open-Source Orchestrator UI for Claude Code and Codex (github.com via hn)
- Show HN: Open-Source Harness for Claude Code, Codex and Cursor (github.com via hn)
Qwen 3.6 (model roundup, 392 items)
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
- 1h Estimate inference speed of local Qwen3.6-35B on Mac M5...
- 2h New Qwen3.6 35B finetune - 0GM-1.0-35B-A3B-0427
- 11h What are the best opensource coding models for 8x A6000 setup
- 14h Does anyone else have issues with Qwen-3.6-27B stability in the Codex harness?
- 15h Will there be any more Qwen3.6 series models?
Opus 4.6 (model roundup, 90 items)
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
Gave Claude a local LLM as assistant on my Mac (www.reddit.com)
Hi there! I was playing around with Ollama and LM Studio, testing local models, and had the idea of letting Claude evaluate a few models on their actual capabilities rather than doing it myself.
building a zine-making app (90s/y2k aesthetic, hot pink, chunky outlines, all that). the templates are real designed layouts (y2k chat bubbles, riot grrrl flyer collages, myspace-style pages).
A few months ago I was a traditional magazine editor with zero coding background. This year I somehow ended up building and launching my first iOS app using Claude and Claude Code.
been running agents in production for a while now and the failure handling question keeps coming up. in testing agents fail cleanly.
GPT 5.5 (model roundup, 133 items)
On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
- 1h GPT-5.5 was used to flag fatal errors in FrontierMath problems
- 4h Claude vs GPT for PhD academic writing — my experience so far, and curious about yours
- 5h The AI market moves so fast that your business idea can expire before launch
- 7h OpenAI launches Daybreak cybersecurity initiative using GPT-5.5
- 8h openai/gpt-5.5-pro API In=$30.00 Out=$180.00
Qwen 3 (model roundup, 3 items)
Qwen3-0.6B is a small dense language model in the Qwen3 series, which spans dense and mixture-of-experts architectures. The series emphasizes reasoning, instruction-following, and multilingual support, with seamless switching between thinking and non-thinking modes. Community feedback suggests it is favored for default chat and coding tasks over models like Llama 3, though specific benchmarks are not provided.
Fixing APK code (www.reddit.com)
Hello everyone, I'm working on an APK project and have run into some problems with notifications, etc. I'd appreciate it if someone could help me fix the code and get the APK working. I'm currently using Android Studio.
Show HN: World Cup History MCP – every FIFA tournament 1930–2026 (api.zafronix.com via hn)
Hi HN — my previous post got flagged for some reason so re-posting to spread the word as well as get some actionable feedback. When I was a kid and was playing soccer in my home town, my Dad had an idea - what if there was a correlation be…
Claude Team v Individual account experience (hint: its not good) (www.reddit.com)
Been using Claude Team with a client and the experience is not good - is it just me? The #1 reason is the inability to work in accept-all mode as you can with an individual account.
Should Brokers Disclose Profitable Recommendations? (www.reddit.com)
A seemingly insignificant question at the moment, but it might become important in the future: If an AI agent recommends a product, a tool, or a service, and its developer can earn revenue through click-through rates, registration numbers,…