1. Subscribe to discover Om’s fresh perspectives on the present and future. Om Malik is a San Francisco-based writer, photographer and investor.

  2. Evaluating RAG feels easy in theory, but production is a different challenge. We’ve been looking into why RAG benchmarking is such a moving target.

  3. The term "AI agents" is everywhere. Every company should use them.

  4. Hi folks, I've been working on something for a good few months. Using GPT Researcher, I compiled a list of people's complaints from across this subreddit.

  5. Hello! It’s me again, the developer of ADT.

  6. On building small CLI tools for myself – and now for my agents too. Walks through a recent one for querying CloudWatch Insights, and how I use Claude to analyze the logs it pulls down.

  7. Mosaic: Local MCP server for structured agent memory (hex lattice, hybrid retrieval, governed writes, and budgeted context), backed by HexxlaDB. Why Mosaic? Mosaic keeps agent memory on infrastructure you operate: MCP on localhost, optional…

  8. I just interviewed Michael Maximilien, former CTO at IBM and Chairperson of the NodeJS Foundation, who spent a year shipping production RAG to multiple customers. His lesson was uncomfortable.

  9. model roundup

    GPT 5.3
    5 items

    OpenAI has reportedly announced GPT-5.3-Codex-Spark, though the exact release date remains uncertain. Meanwhile, users of chat API models such as GPT-5.3-chat have noticed those models being discontinued in favor of newer OpenAI versions.

    148 items

    Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.

  10. Agentic Manifesto: When Karl Marx analyzed capitalism, one of his central ideas was surplus value. Profit comes from extracting more value from labor than workers receive in wages.

  11. Most agent systems have prompts, tools, and memory, but no operating model. I just open-sourced a small kit built around a different assumption: treat the agent like a micro AI company.

  12. Hi! I wanted to share my new blog post on the costs of running AI evals.

  13. I've always had the urge to have my two MacBooks communicate. Having one idle while working on the other felt like an underutilization of resources.

  14. Good idea or not really?

  15. Hello everyone! I'm newly addicted to AI, and I really like Claude.

  16. Research Article Template: A modern, interactive template for scientific writing that brings papers to life. Interactive diagrams, math, citations, dark mode, PDF export, all with minimal setup.

  17. Hey r/AI_Agents, sharing something I am actively building right now. **The problem:** Businesses receive thousands of complaints daily.

  18. model roundup

    Gemma 4
    141 items

    Gemma 4 is a family of open-source multimodal models from Google DeepMind, available in sizes up to 31 billion parameters and featuring dense and MoE architectures. Notable community highlights include the 31B model's success in production tests, with some users preferring 4-bit precision for local use, and others sharing settings for optimizing performance with smaller models.

    model roundup

    Qwen 3.5
    124 items

    Qwen3.5-9B is a post-trained model with 9 billion parameters that integrates multimodal learning and an efficient hybrid architecture for enhanced performance. Community highlights include speculative decoding on Apple Silicon boosting Qwen3.5-9B's throughput by 4.1x, and the model outperforming others in coding tasks while addressing overthinking issues through tool usage.

  19. How Lovable's infrastructure team tracked down sporadic networking errors in Kubernetes, from crashing anetd pods to MTU mismatches, using AI-assisted debugging and deep packet inspection.

  20. TL;DR Claude Desktop's Code tab is excellent for developers, but the same underlying capability — Claude as a stateful, file-aware agent over a git-backed workspace — would unlock a much larger market if reframed for knowledge workers. A n…

  21. For the past 15 days I have noticed that Claude Code follows the instructions in CLAUDE.md exactly as written for any action specified in the file. That is a huge improvement, and while some people would disagree, I would rather u…

  23. I lead marketing at a B2B integrations SaaS. We've been running a multi-agent setup for our content function for a few months now, including research, writer, fact-checker, critic, publisher, the usual chain.

  24. The State of MCP Security: What 1,787 MCP servers can actually do to your systems. We classified every tool on every Model Context Protocol server we could enumerate from the public registries: 25,329 tools across 1,787 working servers.

  26. I have been building products in this space for 3-4 months now, but do not see any traction for them. I am curious about the problems people are actually facing in this space that are not solved to a satisfactory level by a competitor in t…

  27. Persistent AI memory is often reduced to a retrieval problem: store prior interactions as text, embed them, and ask the model to recover relevant context later. This design is useful for thematic recall, but it is mismatched to the kinds o…

  28. Advanced Quantization Algorithm for LLMs. What is AutoRound? AutoRound is an advanced quantization toolkit designed for Large Language Models (LLMs) and Vision-Language Models (VLMs).