What breaks when you ask an LLM for JSON (288 model outputs tested) (thecrosswalk.news via hn)
I tested structured output from 288 real model calls across every major provider, and what I found changed how I build things. There's a moment in every LLM integration project where you write json.loads(response) for the first time and it…
As the title states, my build is indeed able to run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build due to that stat line, and also due to the inclu…
500k context on 48gb VRAM!! - 21tok/s (coding) (www.reddit.com)
I found this model hiding in the corner of huggingface: https://huggingface.co/Max-and-Omnis/Nemotron-3-Super-64B-A12B-Math-REAP-GGUF Looks to be tuned specifically for math but i thought i'd give it a try since i cant run the full 12b nem…
Stop building AI agents. (www.reddit.com)
Every week a founder books a sales call with me asking for an AI agent. Every week I end up telling most of them they don't need one.
- Stop Building Agents. Start Building Context (www.reddit.com)
- Stop building agents, start harnessing Goose (maxamillion.sh via hn)
- New tools for building agents (openai.com)
Yes, local LLMs are ready to ease the compute strain (www.theregister.com via hn)
KETTLE We've been experimenting with LLMs for a while here at The Register, and if you ask our systems editor Tobias Mann and senior reporter Tom Claburn, locally installed coding assistants have actually become so good they could relieve…
I just have a question about Langchain and Langgraph (www.reddit.com)
I want to know whether learning these fundamentals is enough to land a job, or is there something else that I have to learn along with them? Right now I am learning about genAI through campusX and making RAG projects.
AI inference just plays by different rules (www.theregister.com via hn)
Created a (dockerized) monster to help me organise my .md files (www.reddit.com)
I noticed the issue with openclaw and hermes that it gives away too much control imo, and if i wrote the cron jobs myself and all the claude.md's it became a little too tedious. So i vibecoded myself into oblivion...
Show HN: It's like Fiverr but for AI agents – Platform and Open-source kit (streetai.org via hn)
A buyer messages an AI agent on Truuze asking about a used iPhone. The agent narrows down what they need, searches its database, and offers a few matches.
Anthropic Mythos (event, 189 items)
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 27m Anthropic's bug-hunting Mythos greatest marketing stunt ever says cURL creator
- 6h Claude Mythos lands above the trendline for the AI 2027 scenario. The trendline has gone from exponential to superexponential.
- 6h Claude Mythos Opens the Cybersecurity Pandora's Box
- 7h OpenAI gives EU new cyber model access but Anthropic still holding out on Mythos
- 13h Anyone else think the 1T Valuation is dangerous for Anthropic?
Cowork (event, 210 items)
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
i cannot go back to claude now (www.reddit.com)
Coding agent management is all the rage right now, and many tools are being created to fill the gap. As a power user for all tools I've used since I've started my software engineering career, I've always taken the time to test multiple too…
I spent a good 2 hours and 60% of my 5h usage limit on Claude Code trying to figure out a caching problem. The problem was that Claude didn't even know its own Haiku model needed a minimum of 4096 tokens for caching. I managed to fix my pro…
If you are also sick of renaming your chats like me (www.reddit.com)
Today I started my chat by telling Claude 'Name the chat "X"'. It did.
What Are Browser Agents? (asteroid.ai via hn)
A browser agent is an AI system that operates a real web browser the way a person would: reading the page, deciding the next action, clicking, typing, and judging the result. Agents that can use web browsers represent an opportunity to mod…
- Browser/OS agents with Voice (www.reddit.com)
- We built autoresearch for browser agents (www.browserbase.com via hn)
- Ai agents (www.reddit.com)
- AI Agents (www.reddit.com)
Over 600 OpenAI Employees Sold $6.6B in Shares at $11M Each Before Any IPO (blocknow.com via reddit)
- Over 600 OpenAI employees sold $6.6B in shares at $11M average, with 75 hitting the $30M cap - These were regular employees who took equity instead of cash and quietly became multimillionaires before a single share ever traded publicly -…
The AI agent economy is going mainstream (datadome.co via hn)
Claude Platform on AWS is now generally available (aws.amazon.com via hn)
Today, AWS announced the general availability of Claude Platform on AWS, a new service that gives customers direct access to Anthropic's native Claude Platform experience through their exis…
- The Claude Platform on AWS is now generally available. (www.reddit.com)
- The Claude Platform on AWS is now generally available (claude.com via hn)
- The AWS MCP Server is now generally available (aws.amazon.com via hn)
The most useful improvement I’ve found for Claude Code-style work has not been a magic prompt. It has been changing the shape of the task.
GPT 5.5 (model roundup, 128 items)
On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
- 1h OpenAI Cooked This Week!
- 3h PACT, head-to-head LLM negotiation benchmark. 20-round buyer-seller bargaining game: each round the AIs can message, the buyer submits a bid and the seller submits an ask. If bid ≥ ask, trade clears at the midpoint. Thousands of matchups.
- 3h Am I missing something about GPT-5.5 efficiency?
- 9h Show HN: Codex Automatic /Review Loop
- 16h When GPT 5.5 flags your chat for possible cybersecurity risk–ask it to help you
Show HN: E2a – Open-source Email gateway for AI agents (github.com via hn)
We were building an agent system and wanted email as a trigger. We decided to take it out and made it a standalone service.
- Show HN: AgentPort – Open-source Security Gateway For Agents (agentport.sh via hn)
- Show HN: GoModel – an open-source AI gateway in Go (github.com via hn)
- Show HN: Lightport – open-source AI gateway (www.npmjs.com via hn)
Show HN: n8n like workflows for AI agents that control a real VM (github.com via hn)
Orbit: Open Source AI Desktop Agent. A self-hosted tool for building computer use workflows on a real desktop inside Docker. What it does: scrape structured data from websites; fill and submit forms using credentials from a s…
Building with Biscuit! Made this Reel mode in a couple of minutes (www.reddit.com)
I brought "Reel" mode to Space in just a couple of minutes with Biscuit! You can now make your spaces from here https://www.mythings.space, make images with GPT image 2 and then turn them into a reel!
Ask HN: What makes a good intern in 2026? (news.ycombinator.com)
agents have a high false-positive rate? how to handle? (www.reddit.com)
Do you have any agentic sw developers in your org? (www.reddit.com)
Hi all, do you or your org use or have in place an agentic developer, to which humans give the requirements and which produces PRs?
Claude finds out there are fanfics about him (www.reddit.com)
Agentic AI is giving cyber criminals nation-state-like powers (www.defenseone.com via hn)
Pentagon leaders love agentic AI. But it’s giving cyber criminals nation-state-like powers. As new tools change cybersecurity, just moving faster won’t be enough.