1. From fabian on 𝕏: https://x.com/fabianstelzer/status/2051260931758272863

  2. No, I won't tell you how. No, this is not for anyone who is not already a proven contributor to the fine-tuning space.

  3. Process Reward Models (PRMs) have achieved remarkable success in augmenting the reasoning capabilities of Large Language Models (LLMs) within static domains such as mathematics. However, their potential in dynamic data analysis tasks remai…

  4. Atlas Trust Infrastructure: Atlas Trust Infrastructure is the public-facing trust model and documentation surface for Atlas: a metadata-first trust control plane for authorized security workflows, evidence retention, release trust, and busi…

  5. I have a "second brain" filesystem as markdown files that I have been maintaining for months that started out in Claude Code as the interface + file read/write layer... This system just stores a collection of personal todo items, long/shor…

  6. NEW: System Reminder: File modification detected (budget exceeded) — Tells the agent when a user or linter changed a file but the diff was omitted because other modified files already exceeded the snippet budget, and directs it to read the…

  7. model roundup

    GPT 5.4
    31 items

    OpenAI has released GPT-5.4-Cyber for testing and claims it will compete with Claude Mythos. Meanwhile, GPT-5.4 Pro has solved the Erdős Problem #1196, showcasing its advanced capabilities in mathematics.

    event

    Cowork
    168 items

    Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.

  8. liteflow: A ~1000-line C program that runs YAML-defined DAGs, where an LLM can edit the graph mid-run. When a task fails, a planner LLM gets the stderr and emits one of four verbs: RETRY, PATCH, INSERT_BEFORE, ABORT.

  9. If you’ve got memory turned on, here’s a fun prompt: “Imagine the ultimate video game. Tailored specially to me and my interests.

  10. ctx (Context): ctx is a system, not a prompt. A lightweight, file-based system that enables AI coding assistants to persist, structure, and rehydrate project context across sessions.

  11. For quite a while, I've enjoyed having a Claude panel and a Codex panel in my Cursor application. It was practical that I didn't need to use three applications at once, but had everything in one: Cursor.

  12. could not extract summary

  13. paywalled

  14. 118 items

    Sam Altman, CEO of OpenAI, has faced multiple attacks on his home in San Francisco, including firebombing and drive-by shootings, raising concerns for his safety. Additionally, a majority of over 100 people interviewed by Ronan Farrow described Altman as a "pathological liar."

    model roundup

    Qwen 3.6
    291 items

    Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.

  15. I Forced ChatGPT Into Adversarial Tests—Here’s What It Actually Does Under Uncertainty | by Chris Russell | May, 2026 | Medium

  16. Claude and I are debating the stack for a new project, when ..... 🤣 I felt like I had to share this exchange after I read #3

  17. A Mental Model for Agentic Work (May 5, 2026; tags: AI Agents, Company Operations, Software Engineering). Something shifted in the first quarter of 2026. Not a feature launch, not a new product - a structural change in how work happens.

  18. It occurred to me that I'm (successfully) micromanaging Claude (code), but that it might be capable of doing complex long horizon tasks. What's the most complex thing you've done in a single (or tiny number of) prompts?

  19. Been using Codex for a few months now. I use it in VS Code.

  20. I’m not playing a gotcha game here. AI is undeniably changing software engineering and I can’t think of a better AI use case than coding.

  21. event

    Security
    131 items

    OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.

    model roundup

    Qwen 3.5
    132 items

    Qwen3.5-9B is a post-trained model with 9 billion parameters that integrates multimodal learning and efficient hybrid architecture for enhanced performance. Community highlights include speculative decoding on Apple Silicon boosting Qwen3.5-9B's throughput by 4.1x, and the model outperforming others in coding tasks while addressing overthinking issues through tool usage.

  22. I've been on Max for two months and I finally sat down and tracked where my tokens actually go. breakdown of a typical day: - ~40% file reads, git status, project context scanning: stuff that doesn't need opus at all - ~25% test generation…

  23. As we all are, I've been experimenting with ways to reduce external saas spend, and continually bring traditionally external pieces of context (prs, docs, trello boards) into the one mono repo. I have toyed with a markdown todo list and se…

  24. AgentShield: A spending firewall for autonomous AI agents. Before an agent executes a payment, it submits a spend intent to AgentShield.

  25. Hey everyone, thinking about upgrading to Claude Max pretty soon and before I pull the trigger I wanted to ask if anyone has good full guides or tutorials on actually getting the most out of it. Not just "here's what the plan includes" typ…

  26. Seed: An autonomous AI agent that builds other autonomous AI agents. Running 24/7 on a $25 Raspberry Pi Zero 2W. No API keys.

  27. I built a tool that lets you publish your Claude Design artifacts to a real website directly from chat. I built this because chats in claude.ai already have everything they need to make a full stack web app: code execution, file creation,…
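
Item 8 describes liteflow's recovery loop: when a task fails, a planner receives the stderr and emits one of four verbs (RETRY, PATCH, INSERT_BEFORE, ABORT). The Python sketch below illustrates how such a dispatch might look. It is not liteflow's actual C code. The `Task` and `Decision` types, the `apply_verb` helper, the flat task queue standing in for a sorted DAG, and the exact meaning given to each verb are all assumptions made for illustration, and the planner LLM call itself is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Verb(Enum):
    """The four planner verbs named in the liteflow summary."""
    RETRY = "RETRY"
    PATCH = "PATCH"
    INSERT_BEFORE = "INSERT_BEFORE"
    ABORT = "ABORT"


@dataclass
class Task:
    # Hypothetical task shape: a name plus the shell command it runs.
    name: str
    command: str


@dataclass
class Decision:
    # Hypothetical planner output: a verb and, for PATCH and
    # INSERT_BEFORE, the command text the planner proposes.
    verb: Verb
    command: Optional[str] = None


def apply_verb(queue: List[Task], failed: int, decision: Decision) -> Optional[int]:
    """Apply a planner decision to the queue in place.

    Returns the index of the next task to run, or None to stop the run.
    The semantics here are assumed, not taken from liteflow.
    """
    if decision.verb is Verb.RETRY:
        # Run the same task again, unchanged.
        return failed
    if decision.verb is Verb.ABORT:
        return None
    if decision.command is None:
        raise ValueError(f"{decision.verb.value} needs a command")
    if decision.verb is Verb.PATCH:
        # Replace the failing task's command, then rerun it.
        queue[failed].command = decision.command
        return failed
    # INSERT_BEFORE: add a new task ahead of the failing one. The new
    # task now sits at index `failed`, so it runs first.
    new_task = Task(name=f"pre_{queue[failed].name}", command=decision.command)
    queue.insert(failed, new_task)
    return failed
```

In a real runner, `apply_verb` would sit inside the execution loop, called each time a task exits non-zero with the planner's parsed reply. Keeping the verb set closed (an `Enum`) means an unexpected planner reply fails at parse time rather than silently mutating the graph.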