#rag
32 items
Benchmarked Gemma 4 E2B: The 2B model beat every larger sibling on multi-turn (70%) (aiexplr.com via reddit)
Show HN: AI support chatbot with RAG and citations – one back end file, no infra (github.com via hn) Upload markdown docs, get a support chatbot that answers with citations. The entire backend is one JS file: storage, search, and conversation history are handled by the runtime.
Struggling to balance high-volume orchestration (www.reddit.com via reddit)
RAG/Retrieval as a solution (www.reddit.com via reddit) Hi folks, I am new to the community. I have gone through the rules and hope I am not breaking any of them with this post, and I will try to maintain the 1/10 ratio. For building RAG, there are many tools out there, each solving a piece of t…
Turning RAG pipelines into enterprise-grade Data Subscriptions (halcyon.io via hn)
Total idiot needs some build advice (www.reddit.com via reddit) Looking for some advice here because I made a hasty purchase. "Cut your losses and move on" is totally a reasonable answer, but I figured I'd look for some additional help. So, I just started working on a local RAG pipeline with about 15,0…
Show HN: SynapseKit – Async-native Python framework for LLM pipelines and agents (github.com via hn) Build production LLM apps with 2 dependencies. Async-native RAG, Agents, and Graph workflows: no magic, no SaaS, no bloat.
The 'Dark Code' Problem and Milla Jovovich's New Open Source Agent Memory System (www.reddit.com via reddit)
I open sourced a local-first LLM wiki for research and durable memory (www.reddit.com via reddit)
Zuver – Build your enterprise Agents with just 10MB RAM (news.ycombinator.com via hn)
It's tax time... agent-built RAG app end-to-end with Claude Code + an SDK skill (www.reddit.com via reddit) It's tax time, so I whipped up a tax doc assistant with our new Ragie skill. Concrete example of agent-assisted development that goes further than toy demos.
Cursor AI not using sub-agents (www.reddit.com via reddit)
Show HN: NRC nuclear licensing RAG pipeline and regulatory embeddings dataset (huggingface.co via hn) I've been building an AI system to automate parts of the NRC Combined Operational License process: gap analysis against the Standard Review Plan, FSAR strength scoring, and RAI prediction using vector similarity to historical NRC requests.…
Good multi-agent harness with db-based long term context? (www.reddit.com via reddit) I'm looking for suggestions for an agent harness that uses a database (SQLite, RAG, whatever) for long-term context. I plan to use my RTX3080 & 3090 for local AI, though I expect to use APIs for some tasks.
Show HN: GraphifyAI – Turn Any CSV/Excel into a Neo4j or LangChain Graph (graphify.midlantics.com via hn) Converting spreadsheets to graph databases (Neo4j, Neptune, etc.) usually means manually defining nodes, relationships, and writing Cypher from scratch. It's tedious.
How to diagnose RAG failures from traces (www.siquick.com via hn) If a RAG system fails in production, the first question we should be asking is "what broke in this trace?". Until you can answer that, most scorers or dashboards aren't going to help you.
CDRAG: RAG with LLM-guided document retrieval – outperforms standard cosine retrieval on legal QA (www.reddit.com via reddit)
Free Red Team Security Audit for AI Agents & RAG Systems (limited) (www.reddit.com via reddit)
Reports of RAG's death have been greatly exaggerated (atomicapp.ai via hn)
Two-Stage Semantic Chunking for RAG in Python (alessandrofuda.github.io via hn)
Whats the SOTA embedding model for arabic Language (www.reddit.com via reddit) Hello! I’m working on a RAG system for Arabic documents. Any idea on the best embedding model out there?
Anyone here tried the "compile instead of RAG" approach? (www.reddit.com via reddit)
Beginner in Langraph with no dev experience. How to build projects from scratch (www.reddit.com via reddit)
tested async performance across LangChain, LlamaIndex, and Haystack under concurrent load. The results were worse than I expected and here's what we found. (www.reddit.com via reddit) Been running LLM pipelines in production for a while. Kept noticing throughput numbers that didn't make sense for "async" code.
Why are so many Creating "local Chat" inference models? (www.reddit.com via reddit) I'm a novice but so confused by the tech driving the tech. What are the use cases being driven by so many spending on 20K local modelling hardware that can't compete with the pending dramatic decrease in cost per token, let alone the…
Running on cpu :( (www.reddit.com via reddit) I am in the midst of a POC project at work and all I have is 4 AMD Epyc cores, and those are essentially virtualized. Does anyone have any tricks?
Any way to try Claude Pro ? (www.reddit.com via reddit)
Need your help – creating a 2 min RAG video for a DevRel interview, what would actually be useful to you? (www.reddit.com via reddit)
How are you feeding personal context to your local models? (www.reddit.com via reddit) I've been running Mistral/Llama locally through Ollama for a while now and the thing that keeps bugging me is context. The model itself is fine for general stuff but the second I want it to know about my projects, my notes, or files it doe…
GLM OCR for Arabic (www.reddit.com via reddit)
I’m looking for advice on setting up a local AI model that can generate Word reports automatically. (www.reddit.com via reddit)
MCP servers vs Agent Skills: I think most people are comparing the wrong things (www.reddit.com via reddit)