  2. Most agent architectures treat memory like a rigid database, but that leads to the "stochastic drift" everyone complains about. My partner is a neuroscientist and we've spent the last year modeling an agent’s memory on biological systems r…

  3. I only use Sonnet as my main model. I instruct it to delegate indexing and similar grunt work to Haiku, and whenever something genuinely needs deeper thinking, I tell it to "consult Opus." Sonnet then explains the situation to Opus, gets t…

  5. I just set up OpenClaw in my Docker container, currently with almost no tool access. I've heard of security issues around OpenClaw, but I don't know what else to use.

  6. AI Visibility Monitor: a small toolkit for tracking whether your website appears in AI search results (ChatGPT, Claude, Perplexity, Gemini) and Google search, and for diagnosing the technical layer underneath that determines whether AI engi…

  7. model roundup

    Qwen 3.6
    167 items

    Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.

    event

    Cowork
    97 items

    Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.

  8. Claude got more useful for me when I stopped asking it to write the reply. The actual problem was after a post went up.

  9. Should we plan with Codex, then code with Claude? Or should we plan with Claude, then code with Codex?

  10. I'm trying as hard as I can to get a local setup somewhere in the ballpark of proprietary LLMs for code generation. My computer is running an Intel(R) Core(TM) Ultra 7 265K (3.90 GHz) with 128 GB of DDR5 RAM and an NVIDIA GeForce RTX 5090 t…

  12. Routiium is a self-hosted, OpenAI-compatible LLM gateway I built. It does the table-stakes things you'd expect — managed keys, routing, rate limits, analytics — but the part I want to flag for HN is what it does on the agent side.

  13. Agent-World (Gaoling School of Artificial Intelligence, Renmin University of China; ByteDance Seed; work done during an internship at ByteDance Seed). What is Agent-World? A self-evolving training arena that unifies scalable…

  14. model roundup

    Sonnet 4.6
    40 items

    Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.

    event

    Security
    81 items

    OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.

  15. LLM-Rosetta — A Python library for converting between different LLM provider API formats using a hub-and-spoke architecture with a central IR (Intermediate Representation). Full documentation is available at: - English: https://llm-rosetta…

  16. I've seen a lot of people talk about automating their work using AI agents. I tried a couple of them this week, and all of them seem to have failed when it comes to real-life applications: either they're way too complex to set up or they just…

  17. I know what an LLM is, but what makes one agentic? Is it the fact that the results it produces go back in as a prompt?

  18. This is an automatic post triggered within 2 minutes of an official Claude system status update. Incident: Investigated elevated errors and slower responses on claude.ai Check on progress and whether or not the incident has been resolved y…

  19. For all of the Claude-oriented tutorials and resources out on the web, I prefer the insights from experienced developers and software engineers here about how to think through building a project, instead of just prompt bashing until a brit…

  20. Enterprise systems often avoid "monolithic" AI to prevent context rot and hallucinations. The standard fix is task-decoupling: splitting logic between specialized agents or deterministic code.

  21. model roundup

    GPT 5.5
    59 items

    A significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.

  22. I've been keeping a tally for two weeks, and Claude has recreated existing utilities 11 times across three repos. Clean code every time.

  23. Hey, I'm making a project that uses an LLM as a "search engine." I need the LLM to use tool calling to request which category of products to search, with this pipeline: Category (LLM gets all main categories), LLM picks sub catego…

  24. Created this quick code-review Claude plugin for myself and wanted to share it with the community :) I guess GitHub/Graphite and others could benefit from features like these: clustering topics in the PR, TL;DR of file changes and descriptions…

  25. Hey everyone! I built Claude Squad, a self-hosted tool that lets multiple people share one coding session through Claude Code.

  26. Anyone not able to get their Claude extension authorised to their account in a VSCode Dev Container recently? I have exactly the same setup on another machine that works perfectly.

  27. Hey everyone, I’m currently building a custom AI Agent designed specifically for B2B YouTube optimization (Titles and Descriptions). The goal isn't just "good enough" copy—I need it to sound like a high-level strategic partner, not a gener…

  28. I've been running a multi-agent system in production for a few months — a co-CTO agent + specialist agents (PM, dev, ops) that handle real engineering work end-to-end: design specs, code review, PR implementation, deploys, monitoring. The…