Lessons We Learned Building a RAG Assistant Without a Separate Vector Database (blog.devgenius.io via hn)
How we used StarRocks, Gemini, and tool-based retrieval to power grounded Q&A in a developer community Slack. 9 min read, 7 hours ago. Author: Billy Chang, Software Engineer at Phoenix AI. StarRo…
Claude dropped Fable 5 and the API pricing genuinely shocked me (www.reddit.com)
So Claude just dropped Fable 5 and I got curious, went to check the API pricing… and wow 😭 $50/M feels crazy expensive depending on what you’re building. Maybe I’m just broke founder mode right now, but seeing that number actually made me…
Has anyone deployed a multi-agent AI employee in production? (www.reddit.com via reddit)
I mean I know that most AI employee discussions seem focused on making a single agent smarter, but I'm curious about the opposite approach. But has anyone deployed a multi-agent AI employee where different agents handle planning, execution…
Let me say this upfront: I belong to the camp that feels the outrage/disappointment/frustration around fable not being part of the subscription and/or the safeguards as unfounded in reality. Fable is not meant for you and I - developers si…
Claude is breaking the records (www.reddit.com)
Will it be the game changer or is it just hype, what do you think about it?
Show HN: Claude Code Context Analyzer (github.com via hn)
Context window usage analyzer for Claude Code. Tracks how context is consumed across tools, compaction, skills, and user interactions — then visualizes it so you can optimize your sessions.
- Claude code context window (www.reddit.com)
- Show HN: Claude Code Manager (claude.ldlework.com via hn)
- Show HN: Claude Code Web UI (github.com via hn)
343 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
- 2m Fable 5 Claude code
- 11h Can't select Fable 5 for Claude Code. It is shown on Claude Chat and Cowork but not on Code. Anyone else experiencing this?
- 15h Cowork: what is the point of the google drive connector + refresh in the side panel?
- 15h Is using claude for creating product listings considered heavy lifting?
- 1d I asked Claude to use ChatGPT for game assets. It eventually turned my entire screen into a texture.
307 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 7m Everyone is talking about Fable 5's benchmarks. I think they're missing the real story
- 1h [AINews] Anthropic Claude Fable 5 — Mythos but Safe, with Controversial Terms
- 1h We do not
- 2h Claude Fable 5 (Mythos) lands near the top of MindTrial — 80/98 with zero hard errors
- 4h Claude is keeping your Mythos/Fable data no exceptions and not even for enterprise partners it seems
Anthropic is intentionally nerfing Fable when asked to develop other LLMs (www.reddit.com)
Reason 458 why local LLMs are going to be a necessity
Me using Claude Fable 5 just to repost this on another Subreddit (www.reddit.com)
A small CLAUDE productivity hack that has been surprisingly useful for me. (www.reddit.com)
Hey guys 👋 We all struggle with the 5-hour reset, especially as token usage gets higher with the newer models. So I started using a simple workflow to maximize my productivity.
Grok Build (docs.x.ai via hn)
- Grok Build 0.1 on API (x.ai via hn)
- Grok Build (grok.com via hn)
- Grok Build (x.ai via hn)
- xAI Launched Grok Build (abz.global via hn)
- Grok 4.3 (docs.x.ai via hn)
- Grok (www.reddit.com)
- How would you build this? (www.reddit.com)
unsloth/North-Mini-Code-1.0-GGUF · Hugging Face (huggingface.co via reddit)
GGUF for the new Cohere 30B A3B model. I haven't had a chance to test this yet, but I think it's related to https://github.com/ggml-org/llama.cpp/pull/24260
Getting Started with OpenAI Models on Amazon Bedrock (developers.openai.com via hn)
- OpenAI Models on Amazon Bedrock (aws.amazon.com via hn)
27 items
model roundup
DeepSeek 4
DeepSeek-V4-Pro is a 1.6T parameter model with 49B activated, supporting one million-token context and achieving significant efficiency gains over previous versions. Notably, DeepSeek-V4-Flash (284B parameters) has been successfully run on a Raspberry Pi 5, demonstrating impressive performance despite the low hardware specifications.
- 10m Can I finetune Deepseek V4-flash with two rtx pro 6000s
- 8h DOA model by Cohere Labs
- 12h Running DeepSeek-V4-Flash on a Raspberry Pi
- 1d FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention
- 1d Here are some tips on hitting nearly 200 tok/s for DeepSeek v4 Flash on Hopper
71 items
model roundup
Opus 4.8
Claude AI has released Opus 4.8, an upgrade to their Opus class of models available in version 2.1.154 of their software on March 16, 2023, which includes enhanced coding and professional task capabilities along with improved judgment and honesty. Users are reporting usage resets following the update.
SafeAgentDB – Isolated databases for every AI agent branch (github.com via hn)
Anthropic out here stopping me from mass producing bioweapons (www.reddit.com via reddit)
Claude Fable 5 feels less like a launch and more like a preview of AI inequality (old.reddit.com via hn)
- Claude Fable 5 feels less like a model launch and more like a preview of AI inequality (www.reddit.com via reddit)
Foundation-model agents are increasingly long-lived systems that remember users across interactions, making memorization an explicit deployment-time function rather than solely a property of model weights. Existing work addresses parametri…
Multimodal Large Language Models (MLLMs) can listen and see, but how do audio and visual signals actually travel through the network to shape an answer? Despite their growing role in research and real-world applications, the internal pathw…
130 items
event
Fine Tuning
Fine-tuning is a hot topic in the AI community, with various projects and releases focusing on it. Notable examples include OpenAI's decision to wind down its fine-tuning API, Anthropic co-founder Jack Clark's prediction that AI research could become automated by 2028, and several new datasets and models released for fine-tuning purposes.
- 1h Supervised Fine-tuning with Synthetic Rationale Data Hurts Real-World Disease Prediction
- 1h Two to Tango: Coupled Task-Reference Selection for Safe LLM Fine-tuning
- 1h Fisher-Guided Progressive Parameter Selection for Adaptive Fine-Tuning
- 1h A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design
- 1h The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring
Large language models deployed as autonomous agents for enterprise workflows face a key challenge: verbose tool responses from enterprise systems can cause context overflow, stale-state errors, and high inference cost. We study this proble…
Open-pit mine scheduling is a critical process for maximizing economic return under complex geotechnical and operational constraints. While Mixed-Integer Linear Programming (MILP) provides mathematically optimal baselines, its exponential…
When large language models generate from retrieved or augmented contexts, conflicts between external context and parametric priors remain a central reliability bottleneck. Existing contrastive decoding methods follow a context-aware…
Language-agent "memory palace" systems anchor each memory to a world coordinate, on the intuition that geometry adds something text cannot. We make that intuition testable and report three results.
Although the study of human trajectory anomalies is critical for advancing spatial data mining, empirical research remains severely hindered by a pervasive lack of ground-truth datasets. Despite the availability of several real-world and s…
Reinforcement learning has become a key paradigm for eliciting reasoning abilities in large language models, where exploration is crucial for discovering effective solution trajectories. Existing exploration methods typically encourage div…
AI agents in supply chains face a fundamental epistemic gap: large language models (LLMs) interpret policies but lack physical grounding, while reinforcement learning (RL) optimizes flows but is semantically blind to unstructured constrain…