1. Today we're launching an early beta of Grok Build, a powerful new coding agent and CLI for professional software engineering and complex coding work. Available first for SuperGrok Heavy subscribers.

  2. Grok Build Beta | xAI. Grok Build is in early beta for SuperGrok Heavy subscribers. curl -fsSL https://x.ai/cli/install.sh | bash projects/main ja…

  3. been tracking EU GPU prices since early march - 15 stores, 6-hour scrape cadence, ~126k readings. posting here because the 5090 trend is directly relevant if you're buying for local inference.

  4. Hi HN, we're Donnie, Josh, and Ben from ContextBridge. We open sourced PlanBridge, a CLI tool for precision feedback on your coding agent's plans.

  5. Hello everyone, just wanted to share Lytenyte Grid AI Skills. If you use agents for your frontend UI and need a data grid, this will 100% help you save a ton of time and drastically reduce token usage!

  6. When I joined the Codex engineering team in September 2025, Codex for Windows didn’t have a sandbox implementation, meaning that Windows users were forced to choose between two subpar options when using OpenAI's coding agents: Approving nea…

  7. On April 14, I created a free account on ChatGPT and asked for some help. It resisted me at first, but after some pushing the responses turned shocking.

  8. The two from parameter golf (one I trained, one was the baseline) are just 16MB each! They produce barely plausible English.

  9. model roundup

    Qwen 3.6
    419 items

    Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.

    model roundup

    Opus 4.7
    361 items

    Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.

  10. I use agents on two VPSes and have a few humans who work on the same Markdown files. We use Markdown for most documents, plus some CSVs, and are considering HTML.

  11. been working with a tier-1 diagnostic imaging network that ran into a straightforward problem: scan volumes jumped 22%. the obvious answer is to license a saas tool.

  12. I have a Docker stack with a bunch of AI services, and the llama.cpp server is the brain. I've got a working Vulkan yml snippet for llama.cpp, but out of curiosity I flipped it to ROCm (latest build) and did not see ANY performance improvement.

  13. Bro, what is going on with Claude? My usage is going down on its own. I literally haven't typed a single message and it's already eaten 47% of my limit, burning through my 5-hour session in the background like I'M the one using it. Also nobo…

  14. Robotics is advancing really fast lately, with AI inference, different controllers, software, and parts always changing. I wanted a place that supports many device types, Raspberry Pi, NVIDIA Jetson, Arduino, ESP32, hardware sources, and max…

  15. I converted nvidia/llama-embed-nemotron-8b to MLX fp16, 8-bit, 4-bit, and 2-bit (for my OCD) and put it on HuggingFace: ncorder/llama-embed-nemotron-8b-mlx-fp16 ncorder/llama-embed-nemotron-8b-mlx-8bit ncorder/llama-embed-nemotron-8b-mlx-4…

  16. Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality TL;DR: Two new Apache 2.0 multilingual embedding models built on ModernBERT — a 97M-parameter compact model that…

  17. I watched a few interviews with Anthropic employees talking about non-developers using Claude Code for their work. It was tried at my firm and just resulted in some major security issues and a slop fest.

  18. model roundup

    Sonnet 4.6
    89 items

    Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.

    model roundup

    GPT 5
    5 items

    Recent updates to Codex include a new version called GPT5.5s, which has shown improvements in token efficiency through a process known as "cavemanmaxxing." Additionally, analyses of over 100,000 ChatGPT messages revealed that nearly 10% contain leaked CoT (Chain-of-Thought) data from earlier versions.

  19. I’m building an MCP tool for Cursor that lets the agent inspect visible Windows UI, highlight what it wants to click/type, and wait for user approval. Use case: helping with desktop apps outside the codebase — settings panels, dev tools, i…

  20. Curiosity-driven question. I've been tracking AI referral traffic via Zen Reports across a handful of sites, and ChatGPT's click-through rate to cited sources seems much lower than Perplexity's.

  21. How have you used Claude in marketing, especially for market research, product development, or consumer insights? Have you automated any workflows around surveys, social listening, competitor research, or product briefs?

  22. tl;dr: probably comes with redundant fiber. A Cold War–era underground nuclear bunker, originally constructed in the late 1960s as part of AT&T’s Long Lines network and engineered for durability, redundancy, and long-term self-suffic…

  23. hey r/AI_Agents - We built this because debugging AI agents is miserable. Failures hide three levels deep in nested spans, you're either printing terminal output or going to some SaaS dashboard.

  24. Benchmarks for AI Models and Agents on CAD Tasks. Parametric CAD Bench is a comprehensive collection of benchmarks for evaluating CAD models and AI agents on CAD design and 3D modeling tasks. A community effort to build the best open parametr…

  25. Been using Cursor for about three months now and I love it, but I’m running into some issues with the limits. I mainly use Composer 2 and Claude for about 90% of my work; sometimes I switch between Opus and ChatGPT 5.4, but those are the ma…

  26. In standard AWQ, per-channel scales and quantization ranges are picked in separate steps: scales first, then the quantization parameters. But they're not independent, i.e., the rounding error from one depends on the choice of the other, so…

  27. Why is it that all of the LLMs by default use em dashes more than any other punctuation? Is it the versatility of em dashes?