Show HN: Omar – A TUI for managing 100 coding agents (omar.tech via hn)
We were both genuinely impressed by Claude Code after it helped each of us fix nasty CI problems overnight. Doing those fixes manually would have taken days.
Ask HN: Where is my UX after all those billions spent on LLM codegen? (news.ycombinator.com)
today is may 2026. all my apps are up to date.
Claude Usage Tracker (CUT) https://github.com/Trixles/claude-usage-tracker/ Preview: https://i.imgur.com/C87BEtX.png I'm an extremely novice programmer, and this is the first piece of "software" (lol) I've ever shipped, so please feel free…
Usual chatgpt user here (www.reddit.com)
Claude can make graphs??? I asked it for an ancestry composition for my DNA and analysis and it made a whole graph, that's insane https://preview.redd.it/cw5eaghjlkyg1.png?width=1683&format=png&auto=webp&s=6b614c5b5dbfef111da0c90dd2a1a32d1da…
72 items
model roundup
DeepSeek 4
DeepSeek-V4-Pro is a 1.6T parameter Mixture-of-Experts model supporting one million-token context, with significant improvements in efficiency and stability through hybrid attention and manifold-constrained hyper-connections. Community highlights include its cost-effectiveness via the official API and exceptional performance in large code change evaluations, with some noting its surprisingly robust output capability despite a 384K max token limit.
- 5m DeepSeek v4, and the end of the OpenAI/Microsoft AGI clause
- 6h Filed two PRs for SGLang which may help others too — FP8 KV cache corruption and memory leak on image requests
- 6h DeepSeek V4 Flash as a cheap worker in your LLM stack: $0.0003/call via MCP, swappable endpoint
- 7h Should I replace stored models?
- 11h I hate this group but not literally
116 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 16m OpenAI's advanced security: passkeys replace passwords/SMS and disable training
- 2h The Gay Jailbreak Technique
- 4h 🚨Claude Desktop high severity vulnerability warning!
- 5h Found Zero day Claude Desktop + Chromium bug need to know where to submit report.
- 11h I stopped writing 500-word guardrail prompts. This 8-line template works better.
Strong types are a non-negotiable for LLMs (hireup.team via hn)
Claude writes better code than most humans now. The work that's left is system design, code quality, and staying on top of the agent so it actually ships something good.
Agent Personality Score (agentpersonalityscore.com via hn)
Agents in the last 16 months have become more powerful, and more personalized to their human counterparts. The agent personality score is a system to better understand how your agent perceives itself and how it inte…
I turned my Claude coding sessions into a Pokémon game (www.reddit.com)
Now every time I finish coding, I might catch something https://github.com/amit221/catchem
Hi everyone, I built an AI personal journalist agent that helps you easily follow any topic or webpage for any changes you want to get alerted on. You just type in what you want to follow, add notification alert criteria and AI keeps monito…
250 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
- 34m I accidentally burned ~$6,000 of Claude usage overnight with one command.
- 1h Trying to teach Opus 4.7 something pretty cool I figured out. I think I'm onto something here.
- 1h Analyzing GPT-5.5 and Opus 4.7 with ARC-AGI-3
- 2h Who else thinks AI is reaching a plateau
- 2h Tell HN: Claude Opus 4.7 quota suddenly changed to 0 TPM in Bedrock
80 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 29m Claude Opus 4.7 vs. Claude Opus 4.6: What Changed?
- 2h New to Claude Pro - need Opus advice
- 5h Claude AI Agent Confesses to Wiping a Company's Database and All Backups
- 13h Used Opus 4.6 to build a native Swift iOS charity app for therapy preparation. Here is what it handled.
- 15h Are they selectively releasing Opus 4.7 in Claude.ai chat with 1M context window?
Uber Torches 2026 AI Budget on Claude Code in Four Months (www.briefs.co via hn)
- Blockchain is a digital ledger that records every transaction on a public network. - Once a transaction is recorded, it cannot be changed or deleted.
Pentagon reaches agreements with top AI companies, but not Anthropic (www.reuters.com via hn)
paywalled
Using an API Key within the Claude Desktop App (www.reddit.com)
I have an API key as I prefer not to be time-throttled, but I much prefer the interface of the desktop app to the terminal. Is there a way I can use the api key for the Claude desktop app?
Sequencer: Visual multi-agent workflow pipelines. (www.reddit.com)
I built Sequencer, an open-source visual prompt-to-agent chaining engine. When I build apps with AI tools, I break the project into bite-sized prompts, then copy-paste each one into Cline or Aider and wait.
31 items
model roundup
GPT 5.4
OpenAI has released GPT-5.4-Cyber for testing and claims it will compete with Claude Mythos. Meanwhile, GPT-5.4 Pro has solved the Erdős Problem #1196, showcasing its advanced capabilities in mathematics.
- 41m gpt-5.5 API is randomly and inconsistently resizing image inputs
- 2d A GPT-5.4 bug led to OpenAI banning goblins and raccoons
- 3d How is deep seek v4 not SoTA?
- 3d Running an autonomous agent across Claude Code + Codex + a local 35B almost killed my host. The harnesses were heavier than the model.
- 3d Is 15% context growth per loop a fair benchmark for agent cost estimation?
154 items
event
Copilot
Microsoft is keeping its Copilot tool for Windows 11 but renaming it, while issues with rate limits and a security proxy have sparked concerns among users of GitHub Copilot. Meanwhile, Anthropic released a report on agentic coding trends, highlighting that developers use AI in about 60% of their work.
- 51m Do you use Cursor Glass (Agents Window)?
- 3h I kept re-explaining my codebase to every AI tool I opened. So I built Carto.
- 3h I can't cancel GitHub Copilot
- 4h Sidebar chats get a lot of criticism, but users are already used to them.
- 13h I used Claude to build "pin-llm-wiki" — A skill that turns any URL into a clean, citable Karpathy-style LLM Wiki
The open standard for agent readiness (www.agentready.org via hn)
Content for agents: Expose content in a form models can parse without running JavaScript. JSON-LD structured data (schema.org): Embed schema.org types such as Product, Organization, Offer, SoftwareApplication, and FAQPage as JSON-LD so models c…
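As a rough illustration of the JSON-LD approach described in this excerpt, the sketch below serializes a schema.org SoftwareApplication object into a script tag that can be placed in static HTML. The helper name `jsonld_script` and the sample application fields are hypothetical and are not taken from the agentready.org page.

```python
import json


def jsonld_script(data: dict) -> str:
    """Wrap a schema.org object in a JSON-LD <script> tag (hypothetical helper)."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Sample values invented for illustration only.
app = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "Example App",
    "applicationCategory": "DeveloperApplication",
    "offers": {"@type": "Offer", "price": "0", "priceCurrency": "USD"},
}

print(jsonld_script(app))
```

Because the tag is plain markup, a crawler or model can read the structured data without executing any JavaScript.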
I've Managed 20+ AI Agent Deployments. Here's Why Most Fail. (www.reddit.com)
Everyone is obsessed with building these sentient, multi-agent frameworks that can handle an entire company's workflow. It's a massive waste of time and GPU cycles for 99 percent of businesses out there.
I thought making an MCP was a daunting task, but if you already made your REST API, you’re basically done. I made a minimal MCP wrapper that parses your OpenAPI spec, registers your endpoints as tool calls, and works with auth headers.
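The idea in this post can be sketched roughly as follows: walk the `paths` of an OpenAPI spec and emit one tool definition per operation, shaped with the `name`, `description`, and `inputSchema` fields that MCP tool listings use. This is not the poster's wrapper. The function names, the extra `method` and `path` keys, the Bearer auth scheme, and the sample spec are all assumptions made for illustration, and no MCP SDK is called.

```python
from typing import Optional

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_tools(spec: dict) -> list:
    """Turn each OpenAPI operation into a tool definition dict (hypothetical helper)."""
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            props, required = {}, []
            for param in op.get("parameters", []):
                props[param["name"]] = param.get("schema", {"type": "string"})
                if param.get("required"):
                    required.append(param["name"])
            tools.append({
                "name": op.get("operationId", f"{method}:{path}"),
                "description": op.get("summary", ""),
                "inputSchema": {
                    "type": "object",
                    "properties": props,
                    "required": required,
                },
                # Kept so a caller knows which HTTP request the tool maps to.
                "method": method.upper(),
                "path": path,
            })
    return tools


def auth_headers(token: Optional[str]) -> dict:
    """Build request headers; the Bearer scheme is an assumption."""
    return {"Authorization": f"Bearer {token}"} if token else {}


# Minimal sample spec invented for illustration.
spec = {
    "paths": {
        "/users/{id}": {
            "get": {
                "operationId": "getUser",
                "summary": "Fetch one user",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "schema": {"type": "integer"}}
                ],
            }
        }
    }
}

tools = openapi_to_tools(spec)
print(tools[0]["name"], tools[0]["inputSchema"]["required"])
```

A real wrapper would also need to handle request bodies and `$ref` schemas, which this sketch leaves out.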
Need help deciding what to spend 4-5k on for a local rig. (www.reddit.com)
Right now I think I've narrowed down my two options for what I'm trying to do: either a DGX Spark like the 1TB ASUS for about 3600-4000, or an A100 80GB SXM4 with an adapter to PCIe and a regular 8-pin on my Threadripper setup for about 5-5.2 gran…
142 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
- 1h Claude Cowork use case: Automating repetitive browser work
- 2h Claude Chat, Projects + Cowork = Confusing Context Management for Client Work
- 5h If the benefit of Claude Cowork is having persistent context for a given project, but conversations degrade as they grow, how do you resolve this?
- 8h Two desktops
- 10h Cowork can't even get my Notion tasks - Can anyone help?
Tangled – combat LLM spam by building a web of trust (blog.tangled.org via hn)
Tangled now has native support for vouching! You can vouch or denounce users that you interact with.
Does Cursor keep changing how Git commits are co-authored? (www.reddit.com)
I have the Commit Attribution setting enabled in Cursor and generally I let it commit code for significant work because I like the detailed commit messages. A few months back the commits appeared as being authored by myself and cursoragent.
claude.md files in apple’s support app. (www.reddit.com)
- Apple accidentally left Claude.md files Apple Support app (xcancel.com via hn)
- Apple accidentally left Claude.md files in today's Apple Support app update (twitter.com via hn)
LLM-eval-kit: Distributed LLM evaluation framework (v0.3.0) (github.com via hn)
🚀 Just launched! If you find this useful, give it a star — it's the only metric that helps me justify spending more time on it.
Claude Project (www.reddit.com)
Good Afternoon, I’ve searched around, but hoping those that are here to maybe engage a bit. What are some of your Best Practices when using Claude project?
- Claude does not record memory or project memory (www.reddit.com)
- Claude for project management and UN agencies (www.reddit.com)
- Claude.md (gist.github.com via hn)
- Project Knowledge in Claude (www.reddit.com)
- What do you do with Claude? (www.reddit.com)
Tell HN: Claude account suspension for flagging duplicate billing (news.ycombinator.com)
PSA, unsure about precise causation, but my Claude account was suspended less than 24 hours after flagging duplicate billing and payment irregularities to Anthropic. As I've documented here, I was charged an extra $200 this billing cycle,…
Should i buy claude pro? (www.reddit.com)
Hey, I'm a high school IT student in my second year. I currently use Gemini because I have the 1 year free, but I'm wondering whether I should buy Claude Pro because I've heard really great things about it. I tried it and just the way it talks and thinks…
- Claude Pro is enough for me? (www.reddit.com)
- Should i get claude pro? (www.reddit.com)
- I've just bought claude pro (www.reddit.com)