MCP Steroid – Give AI the whole IDE, not just the files (mcp-steroid.jonnyzzz.com via hn)
MCP Steroid is a Model Context Protocol server for JetBrains IDEs. It exposes IntelliJ Platform APIs, visual state, and runtime environment to any MCP-compatible AI agent via Kotlin code execution and screenshot capture.
Anthropic moral dev said AI overcorrection could address historical injustices (www.foxnews.com via hn)
One of Anthropic’s artificial intelligence (AI) philosophy architects argued that intentional discrimination could be a way to combat stigmas on topics of race and gender. In a 2023 paper authored alongside a number of other AI researchers…
mcp-identity: Per-request cryptographic user attestation for MCP servers. MCP already has OAuth 2.1.
Telus Uses AI to Alter Call-Agent Accents (letsdatascience.com via hn)
According to reporting by iPhone in Canada and The Globe and Mail, Telus is using AI through its Telus Digital unit to modify call-centre agents' accents in real time. iPhone in Canada reports the speech-to-speech tool is built by…
What's new in CC 2.1.128 (+1406 tokens) (www.reddit.com)
NEW: Agent Prompt: Background job agent instructions — Replaces the background-job behavior system prompt with built-in background-agent instructions for progress narration, tool-result restatement, noisy-investigation delegation, and expl…
Not sure how to view or share your Claude Code sessions? Drop them right here (specious.github.io via hn)
Drop a Claude Code session export here or click to pick a file (exported with /export in Claude Code).
310 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with 3 billion active parameters, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
147 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 58m Bleeding Llama: Critical Unauthenticated Memory Leak in Ollama
- 3h Used Claude Opus 4.7 to do a 5-hour solo incident response on real healthcare malware (where it worked, where I had to override)
- 6h Agentic Malware Analysis: String Decryption, API Hashing and Unpacking [video]
- 7h When innocent tools form dangerous chains to jailbreak LLM agents
- 8h Built a security scanner for LangChain/LangGraph agents: it clones your agent into a sandbox and tries to break the clone
tl;dr (Claude caveman edition): MCP servers sit around doing nothing, eat 1.5 GB. Machine angry.
Google is building an AI agent that could be its answer to OpenClaw (www.businessinsider.com via hn)
- Google is working on an AI agent codenamed "Remy," according to an internal document. - Remy is described as a "24/7 personal agent" that can take actions on the user's behalf.
Skelm – Build AI agents in TypeScript without losing your mind (github.com via hn)
skelm: Build secure, agentic, long-running workflows in TypeScript. Run them anywhere Node runs.
Asked ChatGPT to make a screenshot of ChatGPT showing a generated Instagram DM screenshot with a Coca-Cola photo. Now I’m looking at a screenshot of an AI screenshot of a fake DM containing a fake product photo.
State of the art LLMs (www.reddit.com)
OpenAI delivers low-latency voice AI at scale (www.google.com via hn)
- How OpenAI delivers low-latency voice AI at scale (openai.com)
66 items
model roundup
Sonnet 4.6
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 1h Anyone ever notice eerily similar ChatGPT and Claude responses like this?
- 17h 6 months ago I posted about Claude prompt codes (L99, OODA, ARTIFACTS). Re-tested them this week. Some still work, one quietly faded, three newer ones earn their keep.
- 1d Built a tiny router so Cursor stops showing "usage limit reached" at 3pm. Sonnet auto-falls to Haiku, you keep working
- 1d Improve CC and plugin
- 2d Cheap Claude/Codex/Gemini Models - Pay just 25% of official rates
83 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 2h 12M Context Window and some sprinkle of lies?
- 20h DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper
- 1d Free Trial: Gemini 3.1 Pro & Opus 4.6 API Access via My Wrapper
- 1d Open source models are going to be the future on Cursor, OpenCode etc.
- 2d I have 30 Skills that work great in Opus v4.6 but not at all in v4.7. Am I cooked?
Rust library. Includes an example launcher.
This is from a post thread here about 8 months ago, and I learned a lot from that discussion! Today, I ship it - Memoir - Git for AI Memory!
Telus using AI to alter the accents of customer service agents (www.theglobeandmail.com via hn)
The voice you hear on the other side of a call-centre interaction might soon sound a little more familiar, thanks to an AI tool that adjusts speech in real time – but not everyone thinks it’s a good idea. Telus Digital, the wholly owned di…
Dawkins, Claude and the Myth of Consciousness in Artificial Intelligence (www.lucasaguiar.xyz via hn)
Evolutionary biologist Richard Dawkins recently wrote an article titled “When Dawkins met Claude”, where he describes his experience after two days of intense conversations with the artificial intelligence Claude on various topics. Through…
Show HN: Docx-CLI – let agents edit your Word files safely (github.com via hn)
docx-cli: A CLI for AI agents (Claude, Codex) to safely read, edit, and comment on .docx files with full format fidelity. Outputs JSON-AST for precise locator-based editing; preserves anything it doesn't model by mutating XML in place.
Show HN: Zift – find authorization logic in your code (github.com via hn)
I made a code scanner that finds embedded authorization code in your codebases so you can externalize it to Policy as Code. https://github.com/EnforceAuth/zift Written in Rust, so it hums through code.
43 items
event
Hallucination
Claude Opus 4.6, Anthropic's flagship model, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, highlighting a significant regression in handling certain tasks. Meanwhile, biologists are revisiting cases of mushroom-induced hallucinations in China, suggesting ongoing research into natural causes of similar phenomena.
- 1h Folie à Deux: The most dangerous hallucination is one you're inclined to believe
- 4h AI Evidence Admissibility is a Post-Mortem. We need Action Admissibility.
- 6h GPT-5.5 Instant: Benchmarking the 52% Hallucination Reduction
- 16h VLMs are surprisingly bad at skin analysis — but for a reason nobody talks about
- 1d A thermodynamic trust layer cutting LLM hallucinations by 52%
RAG retrieves the refutation and still gets it wrong (reyes.id.au via hn)
Anchor: catching the failure mode where RAG retrieves the refutation and still gets it wrong. Ask vanilla RAG over Duval, Goeckner, Klivans, and Martin's 2015 paper "A non-partitionable Cohen-Macaulay simplicial complex" this question: What…
We all know that there are many AI builders right now, from Lovable to Bolt to Replit and so many others. I am wondering: if you were to choose one that could actually replace your main tool, what features should it have?
Turn your design into a real website from Claude Design (www.reddit.com)
I built something that lets you publish your Claude Design artifacts to a real website right from chat. I built this because Claude Design already has everything it needs to make a website: code execution, file creation, arbitrary HTTP req…
- Claude Design Is Real Design (diverging.run via hn)
PSA: I annotated Claude Code's forced system prompt (www.reddit.com)
Before your CLAUDE.md, before your memory files, before your skills, Anthropic injects ~12K tokens of system prompt into every single turn, as priority instructions that overrule anything you provide. I captured the full text from a Claude…
Show HN: Design Taste for AI Agents (aidesigntaste.com via hn)
Free design.md systems from the world
Claude keeps telling me to wind down. it's morning here. (www.reddit.com)
I'm in Korea, and Claude has nudged me to "get some rest" or "maybe wrap up for the night" when it's morning for me lol anyone else notice this?
Ask HN: The death of software development as a job? (news.ycombinator.com)
A lot of programmers I read here and elsewhere say LLM isn't going to change much, some say LLM is just going to make them more productive, and some even say not using LLM makes you some sort of relic. What is not debated is that LLM has c…