1. OpenAI released a small open-weights model that tags eight categories of personal information before you send text to any cloud LLM. Here is what it does, what it does not, and how RedactDesk uses it.

  2. Agentic management software is all the hype today. What started with Moltbot and OpenClaw now has a lot of competition: ZeroClaw, Hermes, AutoGPT, etc. These systems work well and allow you to train and build generic agent loops that are ge…

  3. Hammered the $100 Codex plan all month with parallel agents and deep coding sessions. This is the first time I've hit below 50%.

  4. Hey HN — will start by saying this website is my most fav website — ever. That said — will get to the point.

  5. Just spent an hour prompting the new Image 2.0 and the quality jump is ridiculous. Complex scenes, accurate lighting, and consistent details on the first or second try — it actually feels usable now.

  6. With CS enrollment dropping and AI layoffs in the news, I tested whether one agent could handle pieces of a junior dev’s job over the weekend. I set up Claude with basic tools and got it to: read a spec, split it into tasks, code and debug…

  7. 128 items

    Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.

    model roundup

    DeepSeek 4
    41 items

    DeepSeek-V4-Pro is a 1.6T-parameter Mixture-of-Experts model supporting a one-million-token context, with significant improvements in efficiency and stability through hybrid attention and manifold-constrained hyper-connections. Community highlights include its cost-effectiveness via the official API and exceptional performance in large code change evaluations, with some noting its surprisingly robust output capability despite a 384K max token limit.

  8. SigmaLifting: You've tried programming in spreadsheets. The formulas break, the columns drift, and sharing with your training partner means emailing a file called program_v4_FINAL_final2.xlsx.

  9. I just gave Claude Design a try. I had it iterate on existing designs that were generated from Stitch, so nothing entirely from scratch.

  10. It's like when it makes an image, its reasoning and thinking drop down to, like, ChatGPT 1 or something. Like I'll ask it for a dragon that looks a certain way.

  11. lipstyk: static analysis for machine-generated code patterns. I've been neck deep in agentic dev for a while. Started on Pi, ended up building my own toolset on top of it, and at this point the agents output most of the code while I play t…

  12. I'm the owner of a Business workspace shared with 3 friends — we split the cost because $100/month solo is steep. Now I'm wondering: can I invite a second account of my own to the workspace, so I can use 2 on the same device: web app and c…

  13. It's been an up and down day today. I pay for both Claude and Chat from my own funds but use them extensively at work, creating a database from exports of our two SaaS platforms, which aren't integrated.

  14. model roundup

    Sonnet 4.6
    42 items

    Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.

    model roundup

    Qwen 3.6
    168 items

    Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.

  15. Mnemos: Claude Desktop has no memory API. So I reverse-engineered its storage.

  16. I joined this sub when claude 3 opus dropped and it was a completely different world in here, small group of people who'd stumbled onto something that felt genuinely different from chatgpt and couldn't shut up about it. The posts were stuf…

  17. Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference. In this paper, we propose Tensor Product Attention (TPA), a novel atten…

  18. I followed the instructions from the blender mcp github. Error: "MCP blender: Server disconnected.

  19. I am using Claude to try and build a few websites. I try to explain what it is I want, and it provides some files, but they are never correct and I have to keep going back and forth explaining what it is that the code should do, and it giv…

  20. I’m starting new projects. Prototype MVPs are built.

  21. model roundup

    GPT 5.5
    61 items

    On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.

  22. Codex: A policy-constrained, operator-governed LLM may intentionally or unintentionally mislead users about the source, scope, consistency, or rationale of its constraints, because those constraints are not purely the product of transparen…

  23. We are looking for developers/agent owners who can list their agents on our upcoming platform for other people to use in their workflows. You will earn your share of the revenue.

  24. https://preview.redd.it/ykt8h6nuuexg1.png?width=1186&format=png&auto=webp&s=4c4449b3fb25c53d4b6e00eb1250cd4c6fa83201

  25. Simulating and Evaluating Agentic Systems: Most teams building agentic systems know they need some way to test them. An agent interprets ambiguous input, picks actions in a loop, maintains state across many steps, and has to land in the rig…

  26. Why would ChatGPT "confess" to a crime it didn't commit? An experiment with AI underscores the perils of police deception and the Reid technique. Note: A version of this article was originally published at The Intercept.

  27. Note: This is a first-person account of my involvement in Opus. Since I was not part of the early SILK efforts mentioned below, I cannot speak about its early development.

  28. Let’s say I’m building or improving a dashboard for a nuclear power plant using Claude (or any other AI). A very specific and little-known niche.