OpenAI shipped privacy-filter, a 1.5B PII tagger you can run locally (redactdesk.app via hn)
OpenAI released a small open-weights model that tags eight categories of personal information before you send text to any cloud LLM. Here is what it does, what it does not, and how RedactDesk uses it.
Agents Aren't Coworkers, Embed Them in Your Software (www.feldera.com via hn)
Agentic management software is all the hype today: What started with Moltbot and OpenClaw now has a lot of competition: ZeroClaw, Hermes, AutoGPT etc. These systems work well and allow you to train and build generic agent loops that are ge…
Usage limits for each of the Claude plans (xcancel.com via hn)
Hammered the $100 Codex plan all month with parallel agents and deep coding sessions. This is the first time I've hit below 50%.
- Claude usage (www.reddit.com)
Ask HN: Oh, What Places to Go (Seriously Tho) (news.ycombinator.com)
Hey HN — will start by saying this website is my most fav website — ever. That said — will get to the point.
GPT-Image 2.0 is lowkey blowing my mind (www.reddit.com)
Just spent an hour prompting the new Image 2.0 and the quality jump is ridiculous. Complex scenes, accurate lighting, and consistent details on the first or second try — it actually feels usable now.
AI agents are quietly replacing software engineers — my weekend test (www.reddit.com)
With CS enrollment dropping and AI layoffs in the news, I tested whether one agent could handle pieces of a junior dev’s job over the weekend. I set up Claude with basic tools and got it to: read a spec, split it into tasks, code and debug…
128 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 1h OpenMythos with Qwen2.5-1.5b weights (No recurrence atm) - looking to turn it into full OpenMythos
- 6h What would you use Claude Mythos for if you had access today?
- 6h Discord group says it accessed Claude Mythos by guessing location
- 7h Claude Mythos: The first AI-native cyberweapon?
- 8h We need to keep awareness high about the military and surveillance uses
41 items
model roundup
DeepSeek 4
DeepSeek-V4-Pro is a 1.6T parameter Mixture-of-Experts model supporting one million-token context, with significant improvements in efficiency and stability through hybrid attention and manifold-constrained hyper-connections. Community highlights include its cost-effectiveness via the official API and exceptional performance in large code change evaluations, with some noting its surprisingly robust output capability despite a 384K max token limit.
Show HN: SigmaLifting CLI – helping agents understand strength training (sigmalifting.app via hn)
SigmaLifting You've tried programming in spreadsheets. The formulas break, the columns drift, and sharing with your training partner means emailing a file called program_v4_FINAL_final2.xlsx.
Claude Design token usage make the tool useless right now (www.reddit.com)
I just gave Claude Design a try. I had it iterate on existing designs that were generated from Stitch, so nothing entirely from scratch.
It's like when it makes an image, its reasoning and thinking drop down to, like, ChatGPT 1 or something. Like I'll ask it for a dragon that looks a certain way.
lipstyk — static analysis for machine-generated code patterns I've been neck deep in agentic dev for a while. Started on Pi, ended up building my own toolset on top of it, and at this point the agents output most of the code while I play t…
I'm the owner of a Business workspace shared with 3 friends — we split the cost because $100/month solo is steep. Now I'm wondering: can I invite a second account of my own to the workspace, so I can use 2 on the same device: web app and c…
Is setting up a project worth it? (www.reddit.com)
It's been an up and down day today. I pay for both Claude and Chat from my own funds but use them extensively at work, creating a database from exports of our two SaaS platforms, which aren't integrated.
42 items
model roundup
Sonnet 4.6
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 1h Claude's sonnet 4.6's clarifying questions...How to read?
- 1h Does effort tier change refusal behavior on agent-attack prompts? CVP run 4 with sonnet 4.6 high and max efforts.
- 5h Show HN: Mapping Sonnet's thinking process via flame charts
- 5h An experiment with Claude Sonnet 4.6
- 7h "We've partnered with OpenAI to offer it for 50% off through May 2." Please confirm that it means 50% off both input and output tokens, which means we are paying Sonnet 4.6 prices to use GPT 5.5 until May 2nd.
168 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
I reverse-engineered Claude Desktop's storage to give it memory (github.com via hn)
Mnemos Claude Desktop has no memory API. So I reverse-engineered its storage.
I joined this sub when claude 3 opus dropped and it was a completely different world in here, small group of people who'd stumbled onto something that felt genuinely different from chatgpt and couldn't shut up about it. The posts were stuf…
Tensor Product Attention Is All You Need (arxiv.org via reddit)
Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference. In this paper, we propose Tensor Product Attention (TPA), a novel atten…
- Attention Is All You Need (news.ycombinator.com)
- Attention IS NOT all u need (github.com via hn)
Blender MCP failed (www.reddit.com)
I followed the instructions from the blender mcp github. Error: "MCP blender: Server disconnected.
I am using Claude to try and build a few websites. I try to explain what it is I want, and it provides some files, but they are never correct and I have to keep going back and forth explaining what it is that the code should do, and it giv…
I’m starting new projects. Prototype MVPs are built.
61 items
model roundup
GPT 5.5
On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
An interesting OpenAI Codex Q&A (news.ycombinator.com)
Codex: A policy-constrained, operator-governed LLM may intentionally or unintentionally mislead users about the source, scope, consistency, or rationale of its constraints, because those constraints are not purely the product of transparen…
- OpenAI Codex (openai.com)
We are looking for developers/agent owners who can list their agents on our upcoming platform for other people to use in their workflows. You will earn your share of the revenue.
If I only had a coin for every time claude 'found the smoking gun.' (www.reddit.com)
https://preview.redd.it/ykt8h6nuuexg1.png?width=1186&format=png&auto=webp&s=4c4449b3fb25c53d4b6e00eb1250cd4c6fa83201
Simulating and Evaluating Agentic Systems (www.gojiberries.io via hn)
Most teams building agentic systems know they need some way to test them. An agent interprets ambiguous input, picks actions in a loop, maintains state across many steps, and has to land in the rig…
Why would ChatGPT "confess" to a crime it didn't commit? (radleybalko.substack.com via hn)
An experiment with AI underscores the perils of police deception and the Reid technique. Note: A version of this article was originally published at The Intercept.
How Opus Came To Be (2019) (jmvalin.dreamwidth.org via hn)
Note: This is a first-person account of my involvement in Opus. Since I was not part of the early SILK efforts mentioned below, I cannot speak about its early development.
What exactly "we may use your data to improve our models" mean? (www.reddit.com)
Let’s say I’m building or improving a dashboard for a nuclear power plant using Claude (or any other AI). A very specific and little-known niche.