I shipped a wiki layer for AI agents that uses markdown + git as the source of truth, with a bleve (BM25) + SQLite index on top. No vector or graph db yet.
The standard AI energy debate compares server-side LLM inference to a server-side Google query. I think this misses most of what actually happens on a mobile device during a real search session.
Convo length sweet spot? what's peoples opinions on it? (www.reddit.com)
how long should a convo in Claude be? How many compacts is too much and how much is lost per compact.
25th April 2026
I built this over the weekend using Claude Code — a CRT-style digital clock screensaver that runs in any browser. What it does: Seven-segment LCD display with cyan glow and ghost segments (inactive segments faintly visible like a real LCD)…
How do you manage test data when vibe coding with Claude Code? (www.reddit.com)
Been vibe coding with Claude Code for a few months now and one thing keeps slowing me down. When I'm building and testing a feature, Claude generates sample data/text for me to test with.
192 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
- 29m Claude Status Update : Elevated errors on Claude Opus 4.7 on 2026-04-25T08:25:22.000Z
- 2h How Anthropic can save Opus 4.7 with one change.
- 2h Claude Opus 4.7 didn't believe me that the model UV was damaged until I came up with a delta filmstrip idea for it to screenshot
- 3h Updated ChatGPT vs Claude vs Gemini vs Grok subscription
- 16h Anyone noticed Anthropic didn't added the model Opus 4.7 and Mythos Preview to there Transparency Hub?
155 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
How promising is the AI agent space right now? (www.reddit.com)
I’ve managed to build my own functional AI agents with distinct personalities and opinions. Some are for RP (with custom VRM models made in Blender, capable of real-time emotion display), while others can answer any question sometimes even…
OpenClaw agent identified a market gap, build a validation page in 24 hours (www.docketapp.us via hn)
A receipt tracker for solo tradespeople Every receipt you lose is proof you no longer have. Docket keeps every purchase, receipt, and warranty in one place so you never miss a return, a claim, or a deduction again.
Only claude is not enough! (www.reddit.com)
ok i am convinced that having only claude for all the work is not enough. I was using GPT and moved to claude for my online brand related work but started hitting limits on my PRO many times.
- Claude had enough of this user (www.reddit.com)
- Claude had enough of this user (old.reddit.com via hn)
Fixing hallucination in LLM prediction with only one 48gib GPU (zenodo.org via hn)
Pulse · genji970/hallucination-mitigation-via-contrastive-sampling-method
huggingface/ml-intern is HF's autonomous ML engineer — reads papers, audits datasets, ships SFT/DPO/LoRA/GRPO runs to HF Jobs. it's a standalone python harness with its own agent loop calling the Claude API.
- Ported HF's ml-intern workflow into a Claude Code plugin (www.reddit.com)
Cursor CLI feedbacks ? (www.reddit.com)
Hello AI lovers, I am looking for a replacement for Claude Code... Yeah since they admitted the harness was buggy and we all paid our expensive subscription to use a dumber version I am kind of ...
62 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 19m How do I ensure Claude follows my instructions and project Files?
- 13h I’m learning French. Should i subscribe?
- 13h Has Claude become less intelligent? I had a frustrating day with Claude.
- 20h Anthropic admits to have made hosted models more stupid, proving the importance of open weight, local models
- 1d DeepSeek V4 is out. the best open-source on coding. here's the breakdown
35 items
model roundup
DeepSeek 4
- 54m To run deepseek v4 flash how much max vram we need? 175 gb or 320gb?
- 2h Show HN: A CLI to use any model in your coding agent
- 8h Deepseek V4 flash (high) rivals Gemini 3 flash at 1/5th the cost
- 8h DeepSeek V4 is out. 1.6 trillion parameters. MIT license. $1.74 per million tokens. The gap between US and Chinese AI strategy has never been more visible.
- 9h Xiaomi has released a MiMo V2.5 Pro model. It's apparently about as good as Deepseek V4 (but at different tasks) but is significantly cheaper.
Richard Sutton – Father of RL thinks LLMs are a dead end [video] (www.youtube.com via hn)
Built a chrome extension in 10 minutes from a reddit post (www.reddit.com)
I saw this reddit post about a MacOS application that a guy made. It basically tracks time you spend on various applications throughout the day and displays it as money lost or gained based on your hourly income.
OpenClaw vs. Hermes Agent: The race to build AI assistants that never forget (thenewstack.io via hn)
OpenClaw vs. Hermes Agent: The race to build AI assistants that never forget Every developer who has used an AI coding assistant has experienced the same frustration: You spend an afternoon teaching Claude Code or Codex the quirks of your…
running 50 concurrent agents and sessions just start dying. timeouts, stalls, half the runs dont return an error they just..
I'm looking to explore this ai agent fields and planning to start building some ai agents and automations - as much as i know n8n is a platform people have been using to automate taskk but nowadays claude code and open claw kind of platfor…
NoonFlow 🌐 Official Website: https://noonflow.pages.dev/ English | 简体中文 NoonFlow is a visual AI coding workspace for people who already spend serious time in Claude Code and Codex. It is not trying to replace those CLIs.
90 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
32 items
model roundup
GPT 5.4
OpenAI has released GPT-5.4-Cyber for testing and claims it will compete with Claude Mythos. Meanwhile, GPT-5.4 Pro has solved the Erdős Problem #1196, showcasing its advanced capabilities in mathematics.
- 1h Testing GPT-5.5 in early access: what we are seeing so far
- 1d How can GPT 5.5 Pro be lower than GPT 5.4 Pro on the benchmark of HLE (w/ tools)?
- 1d GPT-5.5 rollout — anyone actually seeing it yet?
- 1d Page 15 of the GPT-5.5 System Card: " Our analysis estimates that GPT-5.5 is slightly more misaligned than GPT-5.4 Thinking across several categories, though nearly all of this is low-severity misalignment. "
- 2d Trained Qwen to Write Clojure Better Than GPT-5.4 (Kinda)
Show HN: Agent MCP Studio – build multi-agent MCP systems in a browser tab (www.agentmcp.studio via hn)
I built a browser-only studio for designing and orchestrating MCP agent systems for development and experimental purposes. The whole stack — tool authoring, multi-agent orchestration, RAG, code execution — runs from a single static HTML fi…
I am exploring Claude Code integration with Slack / GitHub and wondering if features like MCP, skills, subagents are available. Does anyone have any experience with those?
LogAct: Enabling agentic reliability via shared logs (arxiv.org via hn)
Agents are LLM-driven components that can mutate environments in powerful, arbitrary ways. Extracting guarantees for the execution of agents in production environments can be challenging due to asynchrony and failures.
My first time shipping to prod in five hours with Claude Code! 👩‍💻 (www.linkedin.com via reddit)
So happy! It is also the first time I linked Claude code to my github, and also the first time I used Claude code to ship a product from scratch!
Just tried something interesting — automated the process of filing multiple RTI applications using Claude Code + Playwright CLI. What normally takes a lot of repetitive manual effort (filling forms, payments, confirmations, etc.) was handl…
Also, LIFE HACK: I find it is better to ask GPT to create the concept first as a text response (so that it is really elaborate) and then ask it to generate the image after, instead of asking it to generate an image with the idea from the g…