Grok Build (x.ai via hn)
Today we're launching an early beta of Grok Build, a powerful new coding agent and CLI for professional software engineering and complex coding work. Available first for SuperGrok Heavy subscribers.
- Grok 4.3 (docs.x.ai via hn)
- Grok (www.reddit.com)
- How would you build this? (www.reddit.com)
Early Access Grok Build CLI (x.ai via hn)
Grok Build is in early beta for SuperGrok Heavy subscribers. curl -fsSL https://x.ai/cli/install.sh | bash projects/main ja…
been tracking EU GPU prices since early march - 15 stores, 6-hour scrape cadence, ~126k readings. posting here because the 5090 trend is directly relevant if you're buying for local inference.
Hi HN, we're Donnie, Josh, and Ben from ContextBridge. We open sourced PlanBridge, a CLI tool for precision feedback on your coding agent's plans.
Hello everyone, Just wanted to share Lytenyte Grid AI Skills. If you use Agents for your frontend UI and need a data grid, this will 100% help you save a ton of time and drastically reduce token usage!
When I joined the Codex engineering team in September 2025, Codex for Windows didn’t have a sandbox implementation meaning that Windows users were forced to choose between two subpar options when using OpenAI's coding agents: Approving nea…
ChatGPT Gave Me Chilling Advice–As I Simulated Planning a Mass Shooting (www.motherjones.com via hn)
On April 14, I created a free account on ChatGPT and asked for some help. It resisted me at first, but after some pushing the responses turned shocking.
Show HN: Visualizing Tiny LLMs from OpenAI's Parameter Golf (leebutterman.com via hn)
The two from parameter golf (one I trained, one was the baseline) are just 16MB each! They produce barely plausible English
model roundup: Qwen 3.6 (419 items)
Qwen3.6-35B-A3B, a 35-billion-parameter sparse MoE model with 3 billion active parameters, was released by Alibaba Qwen on April 16, 2026, as open source under the Apache 2.0 license. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
- 1m Is there a big gap between Q4 and Q6 on Qwen3.6?
- 5h Got local Qwen 3.5/3.6 generating meeting summaries entirely offline on an M4 Max. Demo with Wi-Fi off. This is the future.
- 8h Automated AI researcher running locally with llama.cpp
- 10h Turboquant+MTP for ROCm(Llama CPP)
- 11h [FOLLOW UP] Qwen3.6 27b q5_k_M MTP - 256k context - 5090
model roundup: Opus 4.7 (361 items)
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
- 52m Higgsfield just launched what they call the first fully automated AI agent for video - real shift or just another hype?
- 1h Extended Thinking being deprecated for supported models (Opus 4.6, Sonnet 4.6); Adaptive Thinking will be enforced by default
- 6h Built a B2B role-play training platform - entirely with Claude (Opus 4.7 backend, Haiku 4.5 for live chat, Claude for design)
- 7h Max20 user: anyone running Opus 4.7 as orchestrator + DeepSeek V4 as the worker via OpenRouter?
- 9h Claude Opus 4.7 leaks system prompt randomly
Markdown sharing solutions (www.reddit.com)
I use agents on two VPS and have few humans who work on same markdown. We use markdown for most of documents, some CSVs and considering HTMLs.
been working with a tier-1 diagnostic imaging network that ran into a straightforward problem: scan volumes jumped 22%. the obvious answer is to license a saas tool.
I have a docker stack with a bunch of AI services and llama.cpp server is the brain. I've got a working vulkan yml snippet for llama.cpp but out of curiosity, I flipped it to ROCM (latest build) and did not see ANY performance improvement.
Usage is going down on its own?? (www.reddit.com)
Bro what is going on with Claude my usage is going down on its own?? I literally haven't typed a single message and it's already eaten 47% of my limit burning through my 5 hour session in the background like I'M the one using it..Also nobo…
Show HN: Browse 61 3D Printable Robots (orobot.io via hn)
Robotics is advancing really fast lately, with AI inference, different controllers, software, and parts always changing. I wanted a place that supports many device types, Raspberry Pi, NVDA Jetson, Arduino, ESP32, hardware sources, and max…
MLX 16/8/4/2-bit quants of nvidia/llama-embed-nemotron-8b (www.reddit.com)
I converted nvidia/llama-embed-nemotron-8b to MLX fp16, 8-bit, 4-bit, and 2-bit (for my OCD) and put it on HuggingFace: ncorder/llama-embed-nemotron-8b-mlx-fp16 ncorder/llama-embed-nemotron-8b-mlx-8bit ncorder/llama-embed-nemotron-8b-mlx-4…
Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality TL;DR: Two new Apache 2.0 multilingual embedding models built on ModernBERT — a 97M-parameter compact model that…
Claude Code for non-devs (www.reddit.com)
I watched a few interviews with Anthropic employees talking about non developers using Claude code for their work. It was tried at my firm and just resulted in some major security issues and a slop fest.
- Claude code (www.reddit.com)
model roundup: Sonnet 4.6 (89 items)
Sonnet 4.6, a new release noted for its "unhinged" behavior, has sparked discussions among users about unexpected changes in software performance and cost management strategies involving Cursor and Claude APIs.
- 53m With sonnet 4.5 going away, is there any to make sonnet 4.6 a good creative writer as 4.5 ever was?
- 1d Auto mode doesn't work today?
- 1d Claude Status Update : Elevated errors for Claude Sonnet 4.6 on 2026-05-12T19:36:38.000Z
- 2d I love Claude (sonnet 4.6) but coming off casually like on big issues is terrifying.
- 2d Does the sudden removal of Sonnet 4.5 violate Claude's Constitution?
model roundup: GPT 5 (5 items)
Recent updates to Codex include a new version called GPT5.5s, which has shown improvements in token efficiency through a process known as "cavemanmaxxing." Additionally, analyses of over 100,000 ChatGPT messages revealed that nearly 10% contain leaked CoT (Chain-of-Thought) data from earlier versions.
- 1h UI for vibecoding…I need help
- 1h Just stumbled across one of the wildest AI experiments I’ve seen in a while.
- 11h Free open-source way to use ChatGPT/Codex subscription in Cursor natively
- 1d OpenSource4o
- 3d GPT5.5s CoT keeps leaking in the new codex update. Looks like we know how they got token efficency, they cavemanmaxxed
Would desktop UI perception be useful for Cursor agents? (www.reddit.com)
I’m building an MCP tool for Cursor that lets the agent inspect visible Windows UI, highlight what it wants to click/type, and wait for user approval. Use case: helping with desktop apps outside the codebase — settings panels, dev tools, i…
Curiosity-driven question. I've been tracking AI referral traffic via Zen Reports across a handful of sites, and ChatGPT's click-through rate to cited sources seems much lower than Perplexity's.
How are you using Claude for marketing? (www.reddit.com)
How have you used Claude in marketing, especially for market research, product development, or consumer insights? Have you automated any workflows around surveys, social listening, competitor research, or product briefs?
- Using Claude for Humanities? (www.reddit.com)
- 3D Models using Claude (www.reddit.com)
- Using Claude for everything (www.reddit.com)
- Using Claude Daily (www.reddit.com)
- THE PROBLEM WITH "JUST USING CLAUDE" (www.reddit.com)
- How are you using Claude in your business? (www.reddit.com)
Is it possible to run local llama without a bunker ? (no) (www.zillow.com via reddit)
tl;dr: probably comes with redundant fiber. A Cold War–era underground nuclear bunker, originally constructed in the late 1960s as part of AT&T’s Long Lines network and engineered for durability, redundancy, and long-term self-suffic…
Show: We built a local, open-source trace debugger for AI agents (www.reddit.com)
hey r/AI_Agents - We built this because debugging AI agents is miserable. Failures hide three levels deep in nested spans, you're either printing terminal output or going to some SaaS dashboard.
- Show HN: Moltnet – open-source local chat for AI agents (moltnet.dev via hn)
Benchmarks for AI Models and Agents on CAD Tasks (cadbench.ai via hn)
Parametric CAD Bench is a comprehensive collection of benchmarks for evaluating CAD models and AI agents on CAD design and 3D modeling tasks. A community effort to build the best open parametr…
Is Cursor Ultra Worth it (www.reddit.com)
Been using cursive for about three months now and I love it but I’m running into some issues the limits I mainly use composer two and Claude for mainly 90% of my work sometimes I switch between the opus and ChatGPT 5.4 but those are the ma…
- Cursor 20$ plan worth it? (www.reddit.com)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update (www.reddit.com)
In standard AWQ, per-channel scales and quantization ranges are picked in separate steps: scales first, then the quantization parameters. But they're not independent, i.e., the rounding error from one depends on the choice of the other, so…
Ask HN: Why do LLMs use em dashes so often? (news.ycombinator.com)
Why is it that all of the LLMs by default use em dashes more than any other punctuation? Is it the versatility of em dashes?