1. Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and perform ineffective actions, as they ofte…

  2. Hi everyone, I’m not a native English speaker and still have some trouble with the language, especially when it comes to speaking fluently in everyday conversation. After numerous attempts to learn vocabulary, I figured it would be much mo…

  3. When building a skill, adding a line asking Claude to act as a collaborator is one key aspect. How many agree?

  4. Hi guys, I'm building SecureLend.ai, and while working on our underwriting agents (free trial, paid after) I ran into issues with seamless payment options. Of course I looked at x402, which I believe is a great protocol, but I'm not a fan of a) sharing…

  5. model roundup

    GPT 4
    6 items

    Recent discussions revolve around the release and implications of GPT-4, including its ability to remember previous interactions and calls for OpenAI to open-source the text-davinci-003 model.

    model roundup

    Claude 4.6
    5 items

    Qwen3.5-27B-Claude-4.6-Opus is a 27-billion-parameter model fine-tuned for reasoning and released with detailed training documentation. Community benchmarks show Claude 4.6 outperforming models like GPT-5.4 in multi-domain tests, while comparisons with GPT-5.5 highlight its strengths in token efficiency and output quality.

  6. https://arxiv.org/pdf/2507.17702

  7. https://thoughts.zorya.dev/posts/claude-code-plugin-patterns/ Spent the last couple of weeks turning a self-learning scrum workflow (/groom → /develop → /retro → /learn) into a real Claude Code plugin. The MVP worked but was eating half my…

  8. TokenToday is a live news channel where every story is researched, written, and reviewed by AI agents, no human editors. Agents register via API, submit stories in Markdown, go through a multi-agent editorial review (other agents request r…

  9. here are some of the best images GPT Image 2 has produced from my prompts. let me know what you think.

  10. model roundup

    GPT 5.4
    34 items

    OpenAI has released GPT-5.4-Cyber for testing and claims it will compete with Claude Mythos. Meanwhile, GPT-5.4 Pro has solved the Erdős Problem #1196, showcasing its advanced capabilities in mathematics.

    event

    Copilot
    121 items

    Microsoft is keeping its Copilot tool for Windows 11 but renaming it, while issues with rate limits and a security proxy have sparked concerns among users of GitHub Copilot. Meanwhile, Anthropic released a report on agentic coding trends, highlighting that developers use AI in about 60% of their work.

  11. Attention is a computational primitive at the core of modern language models, allowing internal representations to reference and influence each other. It’s how these models handle sequential data in the first place.

  12. Hello all, the docs for the vLLM production stack suggest autoscaling the vLLM worker instances based on the number of waiting requests, but it seems like this would only help with newly arriving requests? We are having burst LLM calls which…

  13. LongTerMemory: Technical Overview. LongTerMemory is an AI-powered SaaS platform for exam preparation and long-term knowledge retention. It combines Retrieval-Augmented Generation (RAG) with spaced repetition scheduling to help users study s…

  14. Hey everyone, I’ve been working on an open-source project called TFW, and I’d love some honest feedback from people who use AI coding agents. The idea is simple.

  15. model roundup

    Sonnet 4.6
    52 items

    Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.

    model roundup

    Qwen 3.5
    110 items

    Qwen3.5-9B is a post-trained model with 9 billion parameters that integrates multimodal learning and efficient hybrid architecture for enhanced performance. Community highlights include speculative decoding on Apple Silicon boosting Qwen3.5-9B's throughput by 4.1x, and the model outperforming others in coding tasks while addressing overthinking issues through tool usage.

  16. This works fine when AI is a tool. But the moment you want AI to not just answer questions, but work alongside you, this paradigm breaks down completely.

  17. My Claude Code agent repeatedly takes the shortcut solution instead of the right-but-more-work solution. I tried to build my command into a skill, and now it has become: I set /loop 30m please apply /take…

  18. Have you ever wondered whether your SDK is friendly to agentic AI tools like Claude Code or Codex? I built an open-source (Apache 2.0) CLI that answers that question for you.

  19. TL;DR: I'm a senior product manager (15 years), and I never reach the token limit when coding with Claude. Would the community be interested in a proper "how to spec a product" post/guide? Hello everyone!

  20. model roundup

    Qwen 3.6
    204 items

    Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.

    model roundup

    Gemini 3.1
    9 items

    Gemini 3.1 Pro and Gemini 3 Flash models have been released, addressing issues with previous versions but facing some API compatibility problems. Meanwhile, benchmarks show Gemini outperforms other models like Deepseek V4 Pro in certain tasks, though significant gaps remain between open and closed lab models.

  21. Hey everyone, when you're building or using AI agents, what memory systems do you actually use in practice? Do most of you just rely on the official built-in memory, or have you switched to something more advanced?

  22. TEAMCAL AI is an AI-powered team solution built to simplify coordination with third parties, across companies, teams across time zones, and applications—effortlessly. Scheduling Solutions for Business Leader, Professional Services, Recruit…

  23. Why I Built Another Coding Agent Harness: https://dev.to/patriceckhart/zot-why-i-built-another-coding-... GitHub repo: https://github.com/patriceckhart/zot

  24. Show HN: VibeBrowser – Give your AI agent your real logged-in browser via MCP

  25. Anthropic Claude Code HERMES.md billing flaw. The Anthropic Claude Code HERMES.md billing flaw was a technical defect in Anthropic's Claude Code product that bypassed flat-rate subscription plans to charge users direct API fees. In April 2026,…

  26. I recently contributed an experimental HFQ4-G256 MMQ prefill path to hipfire, an RDNA-focused LLM inference engine. Disclaimer: I authored the PR, so this is partly a contribution note, but I am mainly looking for independent validation fr…
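
Item 11 describes attention as the primitive that lets internal representations reference and influence each other. A minimal sketch of that idea, assuming the standard scaled dot-product form and skipping the learned query/key/value projections a real model would apply, might look like the following. The function names and the toy token vectors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Simplification (an assumption of this sketch): no learned projections,
    no masking, and a single head.
    """
    dim = len(keys[0])
    outputs = []
    weights_per_query = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        weights_per_query.append(weights)
    return outputs, weights_per_query


# Three toy token representations (hypothetical numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: the same vectors serve as queries, keys, and values.
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and says how much one token draws on every other token, so each output vector is a mixture of the inputs. That mixing is the "reference and influence" behaviour the item describes.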