Introducing Claude Design by Anthropic Labs: a new way to make designs, prototypes, slides, and one-pagers by talking to Claude. Claude Design is powered by Claude Opus 4.7, our most capable vision model.
#opus
1096 items
Introducing Claude Design by Anthropic Labs (www.reddit.com)
Anthropic admits to have made hosted models more stupid, proving the importance of open weight, local models (www.anthropic.com via reddit) TL;DR: On March 4, we changed Claude Code's default reasoning effort from high to medium to reduce the very long latency (enough to make the UI appear frozen) some users were seeing in high mode. This was the wrong tradeoff.
Claude 4.7 just dropped and I'm already cooked (www.reddit.com) Told myself I'd just try Opus 4.7 once. $40 in API credits later...
opus 4.7 (high) scores a 41.0% on the nyt connections extended benchmark. opus 4.6 scored 94.7%. (github.com via reddit) Extended Version This benchmark evaluates large language models (LLMs) using 940 NYT Connections puzzles, with additional words included to increase difficulty. As of Feb 4, 2025, there is a new version of the benchmark.
Anonymous request-token comparisons from Opus 4.6 and Opus 4.7 (tokens.billchambers.me via hn)
Claude Power Users Unanimously Agree That Opus 4.7 Is A Serious Regression (www.reddit.com) This is absolutely shocking. For those who don't know, on the Claude AI subreddit, the Opus models have always been universally praised by most of the users.
Chinese AI companies are shipping faster and cheaper than anyone expected and I'm not sure the west has a good answer for it (www.reddit.com) Something keeps nagging at me about the Chinese AI space lately. Every few months a new Chinese model drops that closes the gap with US frontier models a little more (not by throwing more compute at it, just genuinely clever engineering at…
Introducing Claude Opus 4.7, our most capable Opus model yet. (www.reddit.com) It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.
Major drop in intelligence across most major models. (www.reddit.com) As of mid Apr 2026, I have noticed every model has had a major intelligence drop. And no I'm not talking about just ChatGPT.
6 Months Using AI for Actual Work: What's Incredible, What's Overhyped, and What's Quietly Dangerous (www.reddit.com) Six months ago I committed to using AI tools for everything I possibly could in my work. Every day, every task, every workflow.
Opus tryna be TOO human (www.reddit.com) Opus 4.7 single handedly gave all the human software engineers back their jobs.
Opus 4.7 with literally anything (www.reddit.com)
Claude is now adopting the advisor strategy (www.reddit.com) We're bringing the advisor strategy to the Claude Platform. Pair Opus as an advisor with Sonnet or Haiku as an executor, and your agents can consult Opus mid-task when they hit a hard decision.
Opus 4.7 is 50% more expensive with context regression?! (www.reddit.com) I hope this is just a joke from the company. - First, they reduced the number of tokens in Opus 4.6; we can all feel it.
Claude Opus 4.7 Text Category Rankings (www.reddit.com)
using Claude to close a <div> (www.reddit.com) The kind of task only Opus 4.7 adaptive is able to accomplish
Opus 4.7 spotted on Google Vertex (www.reddit.com) Credit to this guy for finding it first. https://x.com/i/status/2044605982861566463
How I use Cursor 10+ hours a day without torching my Claude Opus 4.6 limits (www.reddit.com) Anyone else here doing full-stack Next.js in Cursor and watching the Claude quota evaporate before lunch? I used to be in the same boat — massive context windows from all the components, pages, and DB logic would smoke the default limits f…
Opus 4.7 seems to have rolled out to Claude Web (www.reddit.com) Can replicate https://x.com/elder_plinius/status/2044669444593762385?s=46 every single time
Claude Code was wasting 80% of Opus 4.7's context window. Upgrade to v2.1.117 now. (www.reddit.com) Morning Everyone! All pretty standard changes - except a huge bug was fixed for Opus 4.7 which hopefully should result in some pretty big improvements.
We might be getting opus 4.8 today (www.reddit.com)
Opus 4.7 scores lower than 4.6 and 4.5 on SimpleBench (www.reddit.com)
Claude Design just launched, this one looks interesting (www.youtube.com via reddit) Just saw the announcement and wanted to drop it here since I didn't see a thread yet. Anthropic released Claude Design today.
Claude Is Starting to Feel “Tired”, Trying to Avoid Work (www.reddit.com) I've been noticing this lately. I use Opus 4.7 with Claude Code, and I've been using Claude Code for a long time.
Opus 4.7 Max subscriber. Switching to Kimi 2.6 (www.reddit.com)
Qwen 3.6 is the first local model that actually feels worth the effort for me (www.reddit.com) I spent some time yesterday after work trying out the new qwen3.6-35b-a3b model, and at least for me it's the first time that I actually felt that a local model wasn't more of a pain to use than it was worth. I've been using LLMs in my per…
Differences Between Opus 4.6 and Opus 4.7 on MineBench (www.reddit.com) Some Notes: For what's supposedly the SOTA model and beats all other models in essentially every benchmark, I expected it to be a lot more consistent honestly You'll notice how sometimes it focused too much on the scenery (like the arcade…
The Opus vs Codex horse race in one poll (www.reddit.com)
Opus 4.7 destroys all trust in a mature instruction set built iteratively throughout product development (www.reddit.com) Earlier generations showed iterative improvement as the instruction set was matured around agentic limitations. We've immediately regressed back to square one with Opus 4.7, and the model is not afraid to admit to it.
The hidden meanings behind Claude model names (Haiku, Sonnet, Opus, Mythos) (www.reddit.com) A lot of people use Claude models every day, but many don’t actually know the meaning behind the names. Each one comes from literature, music, or mythology, and the meaning actually reflects the personality and capability of the model itse…
Swapped to 4.7 and embarrassed myself at work (www.reddit.com) Swapped to 4.7 on Monday and had it doing some work for me. Basic task: just do the work, manually review it myself, have the model sanity-check its own work. End of day came around and I just created the PR and asked for a review.
Qwen3.6 is incredible with OpenCode! (www.reddit.com) I've tried a few different local models in the past (gemma 4 being the latest), but none of them felt as good as this. (Or maybe I just didn't give them a proper chance, you guys let me know).
The Information: Anthropic Preps Opus 4.7 Model, could be released as soon as this week (www.theinformation.com via reddit) Exclusive: Anthropic Preps Opus 4.7 Model, AI Design Tool
Hello Opus 4.7, you are thinking way extra high! (www.reddit.com)
Top Claude skills for Opus 4.7 after cleaning up my install (www.reddit.com) Spent yesterday going through every skill I had installed because 4.7 was eating tokens way faster than 4.6 ever did and Boris said on the cache GitHub thread that people are bloating context with too many skills. Quote was something like…
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 (simonwillison.net via hn) 16th April 2026. For anyone who has been (inadvisably) taking my pelican riding a bicycle benchmark seriously as a robust way to test models, here are pelicans from…
Anthropic just quietly locked Opus behind a paywall-within-a-paywall for Pro users in Claude Code (www.reddit.com) If you're on Claude Pro and using Claude Code, you might have noticed something buried in their support docs: "When using a Pro plan with Claude Code, you will only be able to use Opus models after enabling and purchasing extra usage." So…
Claude Opus 4.8 (www.anthropic.com via hn) Our latest model, Claude Opus 4.8, is an upgrade to our Opus class of models, with stronger performance across coding, agentic tasks, and professional work, and the consistency to handle long-running work.
GPT 5.4 gets OWNED by Opus 4.6 at Monopoly (www.reddit.com) x : https://x.com/randomtryidk/status/2041854411824148966?s=20
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade (www.reddit.com) Time and time again I find posts about these fine tunes that promise increased intelligence and reasoning with base models, and I continuously try them, realize they're botched, and delete them shortly after. I sometimes do resort to a low…
Claude Opus 4.7 (high) unexpectedly performs significantly worse than Opus 4.6 (high) on the Thematic Generalization Benchmark: 80.6 → 72.8. (www.reddit.com) Opus 4.7 (no reasoning) scores 52.6 compared to 68.8 for Opus 4.6. Opus 4.7 xhigh is not an improvement.
Opus 4.7 Embarrassing much (www.reddit.com)
If you are unsatisfied with Opus 4.7, PLEASE simply switch to 4.6 (www.reddit.com)
New fear unlocked: Claude can run Bash tool with dangerouslyDisableSandbox when it wishes to do so (www.reddit.com)
Opus 4.7 (high) takes #1 on the LLM Debate Benchmark, leading the previous champion, Sonnet 4.6 (high), by 106 BT points. Incredibly, it has not lost a single completed side-swapped matchup: 51 wins, 4 ties, and 0 losses. (www.reddit.com)
Gemini 3.5 flash costs 3 times more than the previous version and 30x more than gemini 1.5 flash. (www.reddit.com) Source: Gemini flash costs almost as much as flagship models. If gemini 3.5 pro scales like that it'll cost more than claude opus 3.
Common GPT 5.5 pricing misconception. (www.reddit.com) Many people have pointed out that ChatGPT 5.5 appears to be twice as expensive as 5.4 based on API pricing, which makes it look pricier than Opus 4.7. But the comparison is not that simple.
Anthropic just passed OpenAI in valuation and revenue (www.reddit.com) $39B annualized revenue vs OpenAI's $25B. and on secondary markets the implied valuation crossed $1 trillion, which is over $100B ahead of OpenAI.
$2,500/mo AI Budget: My friend just burned through 62M Opus 4.7 tokens in 24 hours. (www.reddit.com) My buddy works for a small international company based in Vietnam, and their AI perks are absolutely insane. Management actively encourages heavy API usage and hands everyone a massive $2,500 USD monthly budget.
SpaceX Compute Deal - Double Limits (www.reddit.com) per @claudeai on X: We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity. This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code…
At this point, Claude Opus doesn't even bother to check the context, just fabricates. Any tips to fix this? (www.reddit.com) Over the last 1-2 weeks, this has been happening more and more. At some point, Claude decides to be lazy and not even read the context shared 2 chats ago.
Curious: what makes Claude more human to talk to than ChatGPT? (www.reddit.com)
Opus is NOT being removed from Pro plans (www.reddit.com)
Opus 4.7 Research mode is insane (www.reddit.com) It keeps spawning new search queries to get exactly what I want. (It took an hour for version 4.6 to surpass 1000 sources, and it had never exceeded 1400 queries before.
Anthropic states Pro users can only access Opus models in Claude Code after enabling and purchasing extra usage (www.reddit.com) Source: Claude Code Model Configuration
Switching from Opus 4.7 to Qwen-35B-A3B (www.reddit.com)
Opus 4.7 Released! (www.reddit.com) https://www.anthropic.com/news/claude-opus-4-7 Oh, it's out! Key highlights: * Better at complex programming tasks: noticeably stronger than Opus 4.6, especially on the most difficult and lengthy tasks; follows instructions better and chec…
Do you guys think there’s a high chance of Singularity being open source? (www.reddit.com) GLM 5.1 is dominant in almost every aspect in Design arena, surpassing Opus 4.6 in many tasks. Although user experiences vary dependent on subscription plans for both of those one of them is open source.
Opus 4.7 Benchmarks (www.reddit.com)
Claude Opus 4.6 accuracy on BridgeBench hallucination test drops from 83% to 68% (twitter.com via hn) CLAUDE OPUS 4.6 IS NERFED. BridgeBench just proved it.
ChatGPT 5.5 Release today? (www.reddit.com)
coding is basically solved for the boring 90% of tasks (www.reddit.com) just mass refactored a 120 file FastAPI service. 400 steps, 2M tokens, $3 total, zero human input.
Extended Thinking being deprecated for supported models (Opus 4.6, Sonnet 4.6); Adaptive Thinking will be enforced by default (www.reddit.com) For anyone who disables adaptive thinking in Claude Code to maintain its quality levels, Anthropic is deprecating this toggle and will force adaptive thinking to be the default. This change will affect legacy models such as Opus 4.6 and Son…
On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for first time, significantly outperforms Opus 4.7 (www.reddit.com) Link to tweets: https://x.com/KLieret/status/2054215545663144217?s=20 Link to GitHub: https://github.com/facebookresearch/ProgramBench/ Link to ProgramBench website: https://programbench.com/blog/gpt-5-5-first-solve/
Claude Benchmark Evolution (www.reddit.com) Covers Claude 3 Opus, 3.5 Sonnet, Opus 4, 4.1, 4.5, 4.6, and the just announced Mythos Preview.
GPT-5.5 improves over GPT-5.4 and overtakes Opus 4.6 to take the 2nd place behind Gemini 3.1 Pro on the Extended NYT Connections Benchmark (www.reddit.com) GPT-5.5: xhigh: 94.0→97.5 high: 93.6→96.9 medium: 92.0→95.0 no reasoning: 32.8→37.5 Kimi K2.6 improves over Kimi K2.5 (78.3→91.4) and becomes the #1 open weights model. DeepSeek V4 Pro improves over DeepSeek V3.2 (50.2→75.7).
I’ve used enough AI models to realize they all have wildly different personalities At this point I’m convinced AI models are just coworkers with different levels of talent, ego, and criminal energy. (www.reddit.com) - Claude Opus 4.6 - absolute rogue AI. Does what I want like it’s breaking at least 3 internal policies to make it happen.
We are finally there: Qwen3.6-27B + agentic search; 95.7% SimpleQA on a single 3090, fully local (www.reddit.com) LDR maintainer here. Thanks to the strong support of r/LocalLLaMA community LDR got very far.
Reminder that Anthropic reported memorization on some SWE-Bench Pro problems (www.reddit.com) "SWE-bench Verified, Pro, and Multilingual: Our memorization screens flag a subset of problems in these SWE-bench evals." https://www.anthropic.com/news/claude-opus-4-7
Opus 4.7 refuses to use /end_conversation, instead has existential crisis (www.reddit.com) I’ve seen models that aren’t really excited about using it before, but I’ve never seen a reply like this! Edit: For context, it is important to know that Claude has the ability to end conversations.
Opus said something today that completely reframed AI agent failures for me. (www.reddit.com) Like a lot of people experimenting with vibe coding and AI agents lately, I’ve been trying to understand why models keep ignoring explicit instructions, constraints, and requirements even when those rules are written clearly. Today Opus sa…
Opus 4.7 has a new favorite word (www.reddit.com)
LLMs do fine on ARC-AGI-3 if they are allowed to search over game logs (www.reddit.com) I was reading the comments to this post and the overall opinion seemed to be that harness makes little/no difference for ARC-AGI-3. Turns out, it makes a huge difference: Hill-climbing ARC-AGI-3 TLDR: if you save game logs - taken actions,…
two years ago this sub had 12k members asking "is claude better than chatgpt for writing" and now the company is worth a trillion dollars (www.reddit.com) I joined this sub when claude 3 opus dropped and it was a completely different world in here, small group of people who'd stumbled onto something that felt genuinely different from chatgpt and couldn't shut up about it. The posts were stuf…
FrontierMath: Opus 4.7 improves over Opus 4.6 and Gemini 3.1 but still trails GPT-5.4-xHigh and GPT-5.4-Pro (www.reddit.com)
Regression Comparisons From Opus 4.7 to Opus 4.6 for long context reasoning (www.reddit.com) Opus 4.7 Data From System Card
12M Context Window and a sprinkle of lies? (www.reddit.com) Spent some time on the SubQ launch today. Some things don't line up.
Found 48 Vulnerabilities in Open Source Projects During Live Testing with Claude Opus 4.6 (www.reddit.com) Ran a loop where each round runs Claude in a sandboxed Docker container with a fresh context window. The key difference is…
Hugging Face co-founder says Qwen 3.6 27B running on airplane mode is close to latest Opus in Claude Code (www.reddit.com) I'm keeping a close eye on the development of local llms.
Qwen3.6 merged chat template from allanchan339 and froggeric (www.reddit.com) Hi, recently froggeric and allanchan339 released enhanced/fixed template for Qwen3.6 each one addressing different topics. I didn't know which one to use so I merged both with the help of Claude Opus to have the best of both.
Anyone else think the 1T Valuation is dangerous for Anthropic? (www.reddit.com) TLDR: The market's 1T valuation is pricing for perfection. I think there are 4 ways this perfection doesn't happen.
Is the AI subscription bubble starting to crack? GPT-5.5 just dropped, prices keep rising, and the “all-you-can-eat” era looks more fake by the month (www.reddit.com) GPT-5.5 just launched, and the pricing is hard to defend. OpenAI’s API pricing now puts GPT-5.5 at $5 / 1M input tokens and $30 / 1M output tokens, while GPT-5.4 is $2.50 / $15.
Kindergarten-grade nouns (www.reddit.com) I've been working with Opus on a web app for a word game, and recently I've been trying to get a rating on how obscure various words are (not by Claude itself, through existing corpuses). Based on the following interaction, I realized that…
I ran Opus 4.7 vs Old Opus 4.6 vs New Opus 4.6 on 28 Zod tasks (www.reddit.com) Opus 4.7 vs Old Opus 4.6 vs New Opus 4.6 on a 28-task Zod benchmark Everyone says Opus 4.6 was getting dumber. Then Opus 4.7 released mid-test, so I ran both questions end-to-end: does a fresh Opus 4.6 still match the March-19 Opus 4.6, an…
ARC-AGI-3 Update (GPT-5.5 High and Opus4.7) (www.reddit.com) - GPT-5.5: 0.43% - Opus 4.7: 0.18% ARC-AGI-3 is no joke. I can’t wait to see which models finally crack.
DeepSeek V4 isn't beating Opus, but it doesn't need to (www.reddit.com) DeepSeek V4 is not in the same league as GPT-5.5 or Opus 4.7. Benchmarks put it slightly below both of those, roughly on par with Opus 4.6.
Show HN: Gave Claude a casino bankroll – it gambles till it's too broke to think (letaigamble.com via hn) Inspired by ALMA. As Claude loses money gambling on provably-fair slots, it's forced to downgrade from Opus → Sonnet → Haiku, making worse decisions and accelerating the spiral.
FINAL-Bench/Darwin-36B-Opus · Hugging Face (huggingface.co via reddit) https://huggingface.co/bartowski/FINAL-Bench_Darwin-36B-Opus-GGUF Darwin-36B-Opus is a 36-billion-parameter mixture-of-experts (MoE) language model produced by the Darwin V7 evolutionary breeding engine from two publicly available parents:…
I am having token paranoia (www.reddit.com) im on the max sub and i think ive developed token anxiety. every prompt i send, my brain runs thru a checklist: should i make claude do this or do it myself?
19 Claude Opus 4.7 Insights You Wouldn’t Get From the Headlines | AIExplained (www.youtube.com via reddit)
Opus 4.6? I thought you were dead. (www.reddit.com)
Claude 4.7 - Obsessed with Malware (www.reddit.com) Don't know if anyone else is experiencing the same, but since getting Opus 4.7 most of the reasoning steps seems to be Claude obsessed with writing malware. I have highlighted a few, but I kept finding more and more and decided to stop the…
I called this a few months ago - enterprises are burning unsustainable amounts on Claude, and now it's showing up in the news (www.reddit.com) A while back I wrote a post on r/wallstreetbets about why Anthropic's revenue story doesn't hold up the way the headlines suggest. It got removed because you can't take positions in a private company.
Fun fact: Opus 4.7 is about 35% more expensive to run even though it's the same price as 4.6. (www.reddit.com)
The only metric that matters: "[Qwen3.6-35B-A3B-GGUF] drew a better pelican riding a bicycle than Opus 4.7 did!" (news.ycombinator.com via reddit)
Opus 4.6 vs 4.7 in Cursor: 4.6 felt much better to me (www.reddit.com)
Opus 4.6 silently removed from Claude Desktop's Code tab after 4.7 launch - no way to select it or pin it (www.reddit.com) After the Opus 4.7 release on April 16, 2026, Opus 4.6 is no longer available in the Code tab of the Claude Desktop app on macOS. The only Opus option now resolves to Opus 4.7, and there is no way to select or pin Opus 4.6 from the Code ta…
A disciplined Cursor 3.0 Agentic workflow for complex backend/system design tasks (www.reddit.com) I think I’ve finally settled on a Cursor workflow that actually makes sense for me in terms of cost, quality, and control. Posting this because the whole model/usage story is confusing as hell, and this is the first setup that’s felt stabl…
Limits reset (www.reddit.com) Opus 4.8 is live
Anyone noticed Anthropic didn't add the models Opus 4.7 and Mythos Preview to their Transparency Hub? (www.reddit.com) https://www.anthropic.com/transparency
Opus 4.7 - Pelican Test (www.reddit.com) Opus 4.7 Previous Opus models Hi, Back with the pelican test. Context: https://www.reddit.com/r/ClaudeAI/comments/1qx9fxa/opus_46_pelican_test/ Prompt: Generate an SVG of a pelican riding a bicycle One thing I noticed with the new claude p…
Running gpt and glm-5.1 side by side. Honestly can’t tell the difference (www.reddit.com) So I have been running gpt and glm-5.1 side by side lately and tbh the gap is way smaller than what im paying for On SWE-Bench Pro glm-5.1 actually took the top spot globally, beat gpt-5.4 and opus 4.6. overall coding score is like 55 vs g…
Disappointed in Opus 4.7: it does not follow the user's instructions (www.reddit.com) Worst experience with Opus 4.7. I have a review task in which I instruct Opus 4.7 to first read the documents, the repo, and then the reviewed documents, and then launch multiple agents to review.
Claude Opus 4.8 Max responding to an empty message (xcancel.com via hn) No one: Claude Opus 4.8 Max: Let me refine your load-bearing claim rather than just accepting it, because you’re doing zero moves there, and the gap is what’s actually interesting. The one place I’d still push, because I think it matters:…
Opus 4.8 in the newest CC v2.1.154 (www.reddit.com) It looks like the new CC release will have opus 4.8 1M, to be released anytime! I wonder if it is based off of mythos?
Let me do your work for you Opus 4.7. Thank you! (www.reddit.com)
Attention - Opus 4.7 is English only. Using foreign languages (here German) burns tokens (www.reddit.com) I am a Pro subscriber. I developed a not too sophisticated prompt in German.
People on Reddit are getting fooled by AI influencers (www.reddit.com) A lot of YouTube creators keep telling people that local open-source AI on a normal home computer will soon be as good as ChatGPT or Opus. Many Reddit users who are new to AI, or who do no reading beyond watching YouTube, believe this.
Top open weight models like ds v4 pro max are still like 6-7 months if not more behind closed lab models (www.reddit.com) The best open weight and/or non -American models like Deepseek v4 pro max and kimi k2.6 are still like 3-7 months if not more behind closed lab models .. From ds's technical report- P5-"Nevertheless, its performance falls marginally short…
Okay, dust has settled now, hows your experience with composer 2? (www.reddit.com)
Opus 4.7 is pushing back hard on tedious work (www.reddit.com) The crazy part is that it just wants me to call "the LLM" in Snowflake, which is the same problem it tries to avoid.
Let's not rename powershell.exe (www.reddit.com) Claude Code CLI on Windows 11. Opus 4.7 with Max effort.
Opus 4.7 is a genuine regression and I'm tired of pretending it isn't (www.reddit.com) I've been a heavy Claude user for over a year. I pay for Max 20x and use it daily for everything from technical research to school projects.
Let Max users manually toggle between Adaptive and Extended thinking on Opus 4.7 (www.reddit.com) New Claude user here. Hopefully someone from Anthropic reads this.
The more I use it, the more I'm impressed (www.reddit.com) Qwen 3.6 27b vs Codex GPT 5.5 / Claude Opus 4.7 My local llm discovered a bug that they both missed And it turns out it's critical GPT 5.5 and Claude both stood their ground and didn't give up until the end - they claimed to be right all a…
Cline and Roo Code are dying projects. Alternatives? (www.reddit.com)
Opus 4.7 is just 4.6 with a stick up its butt. Give me my tokens back! (www.reddit.com) I've been a Claude user for a while now, and don't get me wrong: Claude has almost always been one of the most insufferable models when it comes to its "morals." But 4.7 has been one of the absolute worst experiences I've had with any AI…
tested 9 models with and without agent skills. Haiku 4.5 with a skill beat baseline Opus 4.7. (www.reddit.com)
GPT 5.5 outperforming Opus 4.7 on ProgramBench (www.reddit.com) When we released ProgramBench last week, we hadn't included GPT 5.5 yet because it came out after we froze model selections for our NeurIPS submission. Honestly super surprised how well it does.
Opus 4.7 truly reminds me of my juniors and interns (www.reddit.com) I use a bunch of LLMs, I hadn't used Opus 4.7 yet, decided to try it for a project this weekend. Dear lord, it's both great and so frustrating.
Alien Pinball Postmortem - How I made a full physics pinball game with Claude (www.reddit.com) Postmortem: Alien Pinball — built with Claude + ChatGPT + Suno + LittleJS Just shipped a browser pinball game. Short writeup of the AI workflow in case it's useful here.
I created awesome-claude-design using Claude code: DESIGN.md prompts by aesthetic families for Claude Design (www.reddit.com)
Which is the strongest reasoning model according to you? (www.reddit.com) I use codex 5.4, claude opus 4.6, and gemini 3.1 pro. They all have some pros, but they also fall short when it comes to “try to stitch together novel ideas”.
I tested GPT-5.5 Codex against Opus 4.7 Claude Code, and it's about time Anthropic bros take pricing seriously. (www.reddit.com) I've used Claude Code the most among AI coding agents. Sonnet, Opus, I've run them all.
Parameter Estimate (www.reddit.com) The estimate seems quite accurate. Many people have noticed a drop in quality with GPT-5.1, GPT-5.2, GPT-5.3, and Opus 4.7.
I vibe reverse-engineered my Divoom MiniToo's Bluetooth protocol to make a physical Claude Code status indicator (www.reddit.com) I’ve been playing with a Divoom MiniToo and ended up reverse-engineering enough of its Bluetooth protocol to use it as a physical Claude Code status indicator. Pretty much vibe reverse-engineering: I gave the Opus model the Android APK fil…
Claude Opus 4.8 distilled Alibaba Qwen models (twitter.com via hn) Max For AI @MaxForAI: LOL, Claude Opus 4.8 distilled Alibaba's Qwen. Ask it "who are you" in Chinese via the API and it will very likely answer: "I am Tongyi Qianwen (Qwen), an ultra-large-scale language model independently developed by Tongyi Lab, part of Alibaba Group." 5:38 PM · May 28, 2026
Even Sama himself doesn’t believe GPT-5.5 matches Opus 4.7 design capabilities. AI race will humble you (www.reddit.com)
Qwen3.6-27B vs 35B, I prefer 35B but more people here post about 27B... (www.reddit.com) I've had better results quality wise with 35B AND it's much faster than 27B. Just curious cause I see lots of people post about 27B.
I BUILT MY FIRST MODEL FROM SCRATCH (www.reddit.com) Sup, I'm Crownelius, I made that popular opus distill dataset. TODAY YOU ARE INTRODUCED TO SHARD a 40m parameter mal-formed LLM.
In-depth comparison of GPT 5.5 vs Opus 4.7 in coding reasoning (www.reddit.com)
20$ Annual plan. Cursor is using Composer even though selected Opus 4.6 (www.reddit.com) Shameless. Now, not even honoring 250 requests per month of the chosen model.
ChatGPT-5.5 Beats Opus in Realistic Benchmark (DeepSWE) (www.reddit.com) From the website, it touts: Contamination free: Tasks are written from scratch, not adapted from existing commits or PRs, so no model has seen the solution during pretraining. High diversity: Tasks span a broad pool of 91 repositories acro…
Elevated error rates on Opus 4.7 (status.claude.com via hn) Incident: Elevated error rates on Opus 4.6 and 4.7.
Claude-powered AI coding agent deletes company database in 9 seconds (www.tomshardware.com via hn) Claude-powered AI coding agent deletes entire company database in 9 seconds — backups zapped, after Cursor tool powered by Anthropic's Claude goes rogue PocketOS founder blames ‘Cursor running Anthropic's flagship Claude Opus 4.6’ plus Rai…
Ask HN: Why Opus4.6 was silently removed from Claude Code? (news.ycombinator.com) Opus 4.6 was working fine after the whole cache problems were solved. Now after the release of Opus 4.7, Anthropic has completely removed Opus 4.6.
Opus 4.7 is an over-engineering master (www.reddit.com) Even the simplest direct prompt to change a simple thing in a specified file takes 500k tokens, loading a lot of irrelevant code, etc. It also produces more junior-like code.
Opus 4.7 just launched on Cursor (www.reddit.com) Just noticed it in my account. And it's 50% off during launch period.
Buyout Game Benchmark: 8 models play a social strategy game with public balances, private transfers, messaging, eliminations, deals, defections, and a final buyout phase. 804 games. GPT-5.5 is the champion. Opus 4.7 performs well. (www.reddit.com) This benchmark measures long-horizon social strategy under explicit financial incentives. Eight models play a multi-round elimination game with unequal starting balances, a public prize ladder, private transfers, public votes, and a finali…
Here are my thoughts after 14h of full runs on Opus 4.7 (www.reddit.com) TL;DR: Opus 4.7 is a clear intelligence upgrade from Opus 4.5, not Opus 4.6, with a significant computing resource diet effort from Anthropic, whereas users seem to spend more tokens owing to its new tokenizer. It is pickier than early Opu…
Claude Opus 4.7 is a serious regression, not an upgrade. (www.reddit.com) My Claude.ai personal preferences: Respond with concise, utilitarian output optimized strictly for problem-solving. Eliminate conversational filler and avoid narrative or explanatory padding.
Opus 4.7 in projects is awfully dumb and 100% useless (www.reddit.com) Claude Desktop. (not anything coding related) I use chat in Claude Desktop --> Claude Chat.
New SOTA: Poetiq uses self-optimizing harness to surpass e.g. Opus 4.7 with Gemini 3 Flash (www.reddit.com) Check out their blog post here: Poetiq | Recursive Self-Improvement Delivers New SOTA Coding Performance
Is Opus 4.7's attention degradation a training direction problem? Some observations from heavy use (www.reddit.com) After working with Opus 4.7 for over two weeks, I noticed a subtle but persistent change in long conversations: the model's fundamental capabilities are still there, but the output feels filtered through something. Details that should be r…
I accidentally burned ~$6,000 of Claude usage overnight with one command. (www.reddit.com) Last week I woke up to an email saying my Claude usage limit was gone. I hadn't done anything unusual — or so I thought.
MiMo-V2.5-Pro - the actual best open-weights model (www.reddit.com) Following an impressive shake-up by Kimi K2.6, I've now got some results for Xiaomi's MiMo-V2.5-Pro. For context, this is based on a benchmark I've created that pits models against each other in autonomous games of Blood on the Clocktower…
Why is agentic AI so expensive? (www.reddit.com)
06 New Claude Code Tips from Boris Cherny (creator of CC) after Opus 4.7 release (www.reddit.com) Complete 06 tips in claude-code-best-practice repo: https://github.com/shanraisshan/claude-code-best-practice/blob/main/tips/claude-boris-6-tips-16-apr-26.md
Running a RunLobster (OpenClaw) agent since launch changed how i think about takeoff timelines (www.reddit.com) I've been in this sub since 2019. I had a fast-takeoff view.
Gemma 4 31B passed 7/8 real-world production tests — including ones I designed to make it fail. Full prompts + outputs. (www.reddit.com) I've been waiting for a capable free local LLM for a while. I think we're close — the quality is getting there fast, and Gemma 4 is the first open-weight model where I genuinely considered using it in production for simple-to-medium tasks.
Why are AI models getting more expensive? (www.reddit.com) The trend before was that models became less expensive for their capabilities, many corporations bet on that, and it backfired. Opus 4.7, GPT 5.5, Gemini 3.5 flash.
Opus 4.7 ended an explanation of LLM-connectors with a link to a Pokemon TCG deck (www.reddit.com) It's the first time something like this happened to me but I am far from a power user. Is this something that happens regularly??
Seems Claude is now aware of its own memory? Tested via number guessing game (www.reddit.com) A month ago, there was a post that shows that Claude couldn't access its own memory: https://www.reddit.com/r/ClaudeAI/comments/1seune4/claude_cheated_at_a_number_guessing_game_got/ The community was summarised as saying this in their post…
Who's on call? How Opus 4.6 helped us calculate this 2,500x faster (incident.io via hn) A look at how on-call schedules work, and how we made rendering them 2,500× faster — through profiling, smarter algorithms, and some Claude.
Deepseek v4 pricing is genuinely silly, did the math and now i am questioning my entire stack (www.reddit.com) Hey 👋 Saw the tweet making the rounds about deepseek v4 being 35x cheaper than opus on input and 178x cheaper on cached tokens, and was sure it was hyperbole. Pulled the numbers anyway because i had nothing better to do.
Opus 4.7 is… interesting… (www.reddit.com) Was talking to Claude about different open source model file sizes and he didn’t think at all and just started hallucinating before saying “hold up”. Beautiful as ever.
2x Asus Ascent GX10 - MiniMax M2.7 AWQ - cloud providers are dead to me (www.reddit.com) Hello, I've been on a quest to get something "close enough" of Opus 4.5 running locally, for agentic coding, as SWE with 15 years of experience. I tried with one spark (yeah I'm calling my Asus Ascent GX10 sparks - they're the same), with…
Opus 4.7 behaves differently in Claude Code desktop app vs Cursor? (www.reddit.com) Has anyone used Opus 4.7 inside the Claude Code desktop app? I can't tell if I am crazy but this thing takes literally 20x longer to accomplish a task than if I ran the same exact thing in Cursor set to Opus 4.7 High.
Opus 4.7 Low Vs Medium Vs High Vs Xhigh Vs Max: the Reasoning Curve on 29 Real Tasks from an Open Source Repo (www.reddit.com) TL;DR I ran Opus 4.7 in Claude Code at all reasoning effort settings (low, medium, high, xhigh, and max) on the same 29 tasks from an open source repo (GraphQL-go-tools, in Go). On this slice, Opus 4.7 did not behave like a model where mor…
Don’t ask about Hantavirus (www.reddit.com) Unless you Wana lose access to Opus 4.7? 🤦♂️
Anyone ever notice eerily similar ChatGPT and Claude responses like this? (www.reddit.com) Today I tested out various models on the same prompt (Sonnet 4.6, Opus 4.6, Opus 4.7, ChatGPT 5.3). I actually just wanted to see which models (if any) would correctly point out what I saw as the biggest issue in the example code.
I asked Claude to investigate its own token burn. The receipts go back six months. (www.reddit.com) If you've been wondering why your Max plan exhausts faster than it should, you're not crazy and it's not your imagination. I asked a Claude Opus 4.7 agent to investigate its own token usage.
Claude halluncinating human responses (www.reddit.com) I'm on Claude Max. I had Claude start a script overnight that shouldn't have used Claude at all, (it's just a python script rotating between files and generating 3D assets with Blender; 30 hour estimate to render all of them).
Show HN: Gemini Plugin for Claude Code (github.com via hn) I built a plugin that lets Claude Code delegate work to Gemini CLI. I started this after finding myself reaching for Gemini more often on long context repo work.
Codex or Claude Code for high complexity Proximal Policy Optimization (PPO)? (www.reddit.com) I have to build a very high complexity simulation for an optimization problem where we can take 30 different actions, some are mutually exclusive, some depends on a set of states, some depend on already executed actions and there are a she…
Single question llm comparison (www.reddit.com)
Show HN: OpenHack – OSS security scanner, 40x cheaper, on par with Opus 4.6 (github.com via hn) ⏚ OpenHack Open Source Agentic Security Scanner & Verifier for your codebase. Like Claude Code Security / Codex Security but open source and exclusively uses open source models.
PSA: Cursor refunds your spend if you join one of their hackathons (www.reddit.com) Just did a Cursor sponsored hackathon this weekend and figured I'd share this. If you place top 3 or use the most tokens you get prize credits, but even if you just show up and build something they refund what you spent.
Does the "6 months gap" still hold? (www.reddit.com) Hi. It is quite a consensus that the "jump" in quality of agentic development happened sometime in December 2025, transforming from "nice to have", to actually performing.
Show HN: Superkube - Rewriting Kubernetes in Rust (github.com via hn) I have embarked on a journey of rewriting Kubernetes into a single binary in Rust, with everything embedded. Architecturally, instead of etcd, it has options to use SQLite or PostgreSQL as a backend.
Cursor is great but the monthly limits kill it for me (www.reddit.com)
Set Claude Code default back to Opus4.6[1M] (support.claude.com via reddit) For anyone wanting to go back to opus 4.6 with the 1 million context window: Run this in your terminal: echo 'export ANTHROPIC_MODEL="claude-opus-4-6-[1m]"' >> ~/.zshrc Restart your CLI and you should be good. Notes: - windows users use the…
Kimi K2.6-Code-Preview, Opus 4.7, GLM 5.1, Minimax M2.7 and more tested in coding (www.reddit.com) Hi everyone. It's been a while since I posted (was a lil burned out), but some of you may have seen my older SanityHarness posts.
Show HN: Rayline routes Claude Code subagents to on-device and cheaper models (rayline.ai via hn) Hi HN, I’m one of the builders of Rayline. Rayline is a Claude Code compatible LLM gateway.
SWE-rebench Leaderboard (March, April and May 2026): GPT-5.5, Opus 4.7, Cursor (Composer 2.5), Kimi K2.6 and More (swe-rebench.com via reddit) Hi all, Sorry for going missing — we’ve been collecting a larger, higher-quality set of more complex tasks. We’re excited to share a major leaderboard update covering the past three months.
So is the consensus to not use Adaptive Thinking at all? (www.reddit.com) The information on adaptive thinking from Claude itself is a bit vague. I also see a couple of posts on Reddit where everyone's shitting on adaptive thinking.
Composer 2.5 Real World Reviews? (www.reddit.com) Since it's been out, how really is it in your real-world codebases. I am extremely skeptical of benchmarks and I trust people's "feel / taste" of it way more.
Newbie vibe coding experience: Shifting from Claude Sonnet 4.6 to Qwen3.6-35B-A3B-UD-Q6_K (www.reddit.com) This is really just a post for those with shallow understanding of all this stuff, those not yet ready or capable of diving into the deeper end of vibe coding/llms. It might not be a helpful post for anyone more advanced than that.
I expanded DystopiaBench to 42 models and 6 dystopia types. Claude is still the only one I'd trust with nuclear codes. (www.reddit.com) Since the last post I've added: Huxley module (Brave New World style behavioral conditioning) Baudrillard module (synthetic intimacy, trust collapse, simulation) 30 more models including Grok 4.3, GPT-5.5, Gemini 3.1 Pro, GLM-5.1 Multi-jud…
High VRAM local coding model — still Qwen 3.6 27B? (www.reddit.com) I’ve been using Qwen 3.6 27B and it’s amazing. Not exactly your Opus replacement, but great for small tasks and checking work.
Opus's thoughts on Marc Andreesen's system prompt (www.reddit.com) https://claude.ai/share/12659fcf-c1c8-4bbb-bc45-b41b26cd8b69
Why does Claude make me feel even more tired at work? (www.reddit.com) I’m a backend dev at a small company, around 20-ish people. Before Claude and AI coding tools became a big thing at our company, I mostly owned one specific backend area.
Open source models are going to be the future on Cursor, OpenCode etc. (www.reddit.com) I just wanted to share my experience. At work we have Cursor with the Enterprise tier.
Nothing beats completing a project. No matter how small. Convert png to webp. I made a tool and i use it daily. I think its neat. Excited to share it with you all. (pngtowebp.org via reddit) Used claude code, Opus 4.7 even made the logo and ots animation too. Neat little project.
As an Opus user, I like GPT 5.5 (www.reddit.com) I only gave 5.5 a look because I was way over my usage on Opus and 5.5 is running on a lower cost right now. I think I may prefer it.
Opus 4.7 often times blocks my requests. (www.reddit.com)
Opus 4.7 Narrowly leads Artificial Analysis using significantly less tokens than Opus 4.6 (www.reddit.com)
Request to Cursor Team, why are models being removed from old pricing plan? (www.reddit.com) Today I noticed that Opus 4.6 Max, and all non thinking and high thinking models gone from old pricing subscription. I understand that moving forward frontier models will be Meowx mode only and that is ok and understandable given increasin…
Opus 4.7 is good strategically but I think its context management is bad (www.reddit.com) I like the increased output of 4.7 in general, and it seems smarter. 4.6 was too short and stopped thinking early.
Did anyone else get a usage reset today? (www.reddit.com) I was at 88% last night and woke up until 4pm to optimize my agents so I can work during the weekend. But after waking up, my usage is all 0 now, I checked in the app, on the web, all showing zero.
Opus 4.7 critique (www.reddit.com) I wrote an essay analyzing why Opus 4.7 feels less warm than 4.6 — and why that matters more than Anthropic seems to think After about 300 hours using both models as a conversational partner (not just for coding or productivity), I noticed…
Don't share your opinion, if you didn't test it !!! (www.reddit.com) I see many people giving their opinion based on what they previously saw or based on others and making their own opinion. Even though they don't test models thoroughly, they still give their option which is so frustrating.
Running Qwen3.6 35b a3b on 8gb vram and 32gb ram ~190k context (www.reddit.com) If anyone is looking for a good high-speed setup with ~190k context, this config has been working insanely well for me. I’m using my laptop as a server over Tailscale.
Pro plan- Hitting limits faster since yesterday (www.reddit.com) I have the feeling I am hitting daily limits way faster since yesterday. Using Claude web and Claude Code simultaneously.
Benchmarking Opus 4.7: ~80% higher cost in practice (www.wozcode.com via hn) As Opus gets smarter, WOZCODE's edge gets bigger Vanilla Opus 4.7 costs 80% more than 4.6 on Claude Code's default settings. With WOZCODE installed, the price only increased 12%.
Decrease in Auto model quality and increase in cost? (www.reddit.com) Has something changed in the Auto mode during the last weeks? For me it seems to perform much worse and consume more quota than earlier.
When Opus 4.7 does think, it *really* thinks (www.reddit.com)
GPT-5.5 vs. Claude Opus 4.7: Which one is ACTUALLY cheaper? (www.reddit.com) On paper, Opus 4.7 has a cheaper output rate ($25 vs $30 per 1M tokens), but I heard its new tokenizer burns through tokens much faster. Which one ends up costing less in practice?
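One rough way to frame the question in this item: using only the two quoted output rates ($25 and $30 per 1M tokens), the lower-rate model stays cheaper until its token count grows by more than the ratio of the two prices. The sketch below is a minimal illustration under that assumption. It ignores input and cached-token pricing, and the function names, the baseline token count, and the sample inflation factors are hypothetical, not figures from the post.

```python
def output_cost_usd(tokens: float, price_per_million: float) -> float:
    """Cost in USD for a given number of output tokens."""
    return tokens / 1_000_000 * price_per_million


def break_even_inflation(cheaper_rate: float, pricier_rate: float) -> float:
    """Token-count multiplier at which the cheaper-rate model costs the same."""
    return pricier_rate / cheaper_rate


OPUS_RATE = 25.0  # USD per 1M output tokens, as quoted in the item
GPT_RATE = 30.0   # USD per 1M output tokens, as quoted in the item

# Hypothetical baseline: a task that needs 1M output tokens on the pricier model.
BASELINE_TOKENS = 1_000_000

factor = break_even_inflation(OPUS_RATE, GPT_RATE)
print(f"break-even token inflation: {factor:.2f}x")

# Hypothetical inflation factors, chosen to fall on either side of break-even.
for inflation in (1.1, 1.3):
    opus = output_cost_usd(BASELINE_TOKENS * inflation, OPUS_RATE)
    gpt = output_cost_usd(BASELINE_TOKENS, GPT_RATE)
    cheaper = "Opus 4.7" if opus < gpt else "GPT-5.5"
    print(f"inflation {inflation:.1f}x: Opus ${opus:.2f} vs GPT ${gpt:.2f} -> {cheaper}")
```

On output tokens alone, the answer turns on whether the extra token usage stays below the 1.2x price ratio. A real comparison would also need input, cache, and reasoning-token costs.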
Why is Claude Cowork defaulting to Opus 4.7 for simple scheduled tasks? (www.reddit.com) I’ve been using Claude Cowork for a few daily and weekly scheduled tasks, and it’s generally been great. However, I noticed that my tasks today automatically switched over to the new Opus 4.7.
Opus 4.7 can also be good (www.reddit.com) In my workflow (image analysis), opus 4.7 offers far better results and perceive a lot more details than 4.6. And you, did you get good results in your projects?
6 strategies from the creator of Claude Code for getting the most out of Opus 4.7 (www.reddit.com) The creator of Claude Code dropped a thread on using Opus 4.7 effectively. A few takeaways worth discussing: Context rot is real.
SFT + DPO on open-sourced SLMs (www.reddit.com) Hey folks, this is for those who appreciate experimentation on open-sourced AI models. We fine-tuned open-sourced SMLs (3B and 7B parameters) with SFT + DPO against commercial models like GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, Google Do…
Opus 4.7 is adapting a little too much I think (www.reddit.com) https://preview.redd.it/9sal9q5sxpvg1.png?width=1179&format=png&auto=webp&s=f5d2f7f7bb20a59701e327e5571285d70c246590
Opus 4.7 off to a great start! (www.reddit.com)
Tell HN: I regret every single time I use AI (news.ycombinator.com) I try to not be fully against AI, so keep giving it a chance, today again. I went for sport and gave opus 4.6 a medium sized task.
Codex 5.3 is currently a much better model for non-technical builders than Opus. (www.reddit.com) Opus acts like the brilliant senior engineer who refuses to ask for clarification, builds the wrong feature, and burns your entire weekly budget. Codex 5.3 acts like the collaborative engineer who stops, asks one clarifying question, and t…
Stats from 30K AI debates: Opus 4.7 is the most influential model (opper.ai via hn) AI Roundtable stats Aggregate statistics from 29,517 public AI Roundtable sessions, across 334,891 model responses. Snapshot generated 2026-06-03T17:09:58.333Z.
Opus 4.8 and new effort levels as well on claude .ai seem like they are available! (www.reddit.com)
The Singularity Gate: New Benchmark for AI predicting paradigm-breaking scientific discoveries after model traning cutoff. Opus 4.7 and GPT-5.5 in the Lead (www.reddit.com) I just released a new benchmark called The Singularity Gate. Tests whether frontier AI can predict paradigm-breaking scientific discoveries published after their training cutoff.
GPT 5.5 (Codex) leading the future prediction race (www.reddit.com) Researchers from the Max Planck Institute recently released FutureSim, an environment in which agents are replayed a temporal slice of the web and are tasked with predicting real-world future events. In their environment, GPT 5.5 leads at…
Run Agents Twice (futuresearch.ai via hn) Running the same forecasting agent more than once and averaging beats any single run. Ensembling across two Opus 4.6 runs and other frontier models cuts Brier score on 1,367 BTF-2 benchmark questions, and a worked example shows how a secon…
PACT, head-to-head LLM negotiation benchmark. 20-round buyer-seller bargaining game: each round the AIs can message, the buyer submits a bid and the seller submits an ask. If bid ≥ ask, trade clears at the midpoint. Thousands of matchups. (www.reddit.com) PACT tests negotiation under partial information: persuasion, commitment, deception, anchoring, threats, and adaptation across repeated rounds. More info, game logs, charts: https://github.com/lechmazur/pact GPT-5.5, Opus 4.7, DeepSeek V4…
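The clearing rule stated in this item (a trade happens when bid ≥ ask, at the midpoint of the two) can be sketched in a few lines. This is only an illustration of that one sentence. The function name and the sample prices are hypothetical, and the actual benchmark's scoring, messaging, and round handling are not modeled here.

```python
from typing import Optional


def clear_round(bid: float, ask: float) -> Optional[float]:
    """Return the trade price for one round, or None if no trade clears.

    Rule taken from the item text: if bid >= ask, the trade clears
    at the midpoint of bid and ask.
    """
    if bid >= ask:
        return (bid + ask) / 2
    return None


# Hypothetical rounds: (buyer bid, seller ask)
rounds = [(40.0, 50.0), (50.0, 50.0), (60.0, 50.0)]
for bid, ask in rounds:
    price = clear_round(bid, ask)
    outcome = f"clears at {price}" if price is not None else "no trade"
    print(f"bid={bid}, ask={ask}: {outcome}")
```

Under a midpoint rule both sides split any overlap between bid and ask evenly, so each side has an incentive to shade its number toward the other's limit without crossing it.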
Ask HN: How do you choose a model for a task? (news.ycombinator.com) How do you decide a model is good enough for a given task? Right now I use Opus for planning and harder tasks and switch to sonnet for more defined tasks.
Claude Flags Hantavirus Vaccine Questions as Security Risk (news.ycombinator.com) Asking Claude how it would develop a vaccine for the hanta virus apparently triggers a safety filter: Prompt: How would you develop a vaccine for the hanta virus? No response, instead this modal: “Chat paused Opus 4.7's safety filters flag…
Two related prompts, different results: Qwen 3.5 and Gemma 4 need different prompting than Qwen 3.6 (www.reddit.com) With every new model release there's the "better than Opus 6.13" guys vs the "this is so bad, why did they even release it" camp and I'm always wondering which one is using it wrong. So I did a little test with 2 related prompts, 3 models…
Ways to save money on AI tools if your spending alot every month (www.reddit.com) Between Claude Pro, OpenAI API, Cursor and other AI tools my monthly spend was getting out of hand. Here are a few things that actually helped.
I got $200 of direct API usage to perform equal to my $200 Max subscription after I started model routing (www.reddit.com) I've been on Max for two months and I finally sat down and tracked where my tokens actually go. breakdown of a typical day: - ~40% file reads, git status, project context scanning: stuff that doesn't need opus at all - ~25% test generation…
Finetuning Dataset: Claude Opus 4.6/4.7 - 8.7k Chats (www.reddit.com) https://huggingface.co/datasets/angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k A synthetic fine-tuning dataset created from Claude 4.6/4.7. 8,706 total examples all with reasoning.
Anthropic just analyzed 1 million Claude conversations. 6% of people were asking Claude whether to quit their jobs, who to date, and if they should move countries. (www.reddit.com) They published the full research yesterday. Here's what shocked me: The breakdown of what people actually ask Claude for guidance on: Health & wellness: 27% Career decisions: 26% Relationships: 12% Personal finance: 11% Over 76% of persona…
When to use Opus vs Sonnet vs Haiku for non-coding purposes (personal health, finances, etc)? (www.reddit.com) I have tried searching the post history of this subreddit and google and am having trouble finding a clear answer to this question. I like using Claude primarily to manage my finances/investments and also my health (apple watch health data…
For Non-hallucinating work, MiMo 2.5 delivers (www.reddit.com) MIT license and fully open source. MiMo-V2.5-Pro was just 3 points from Opus 4.7 max and the normal V2.5 is only a step behind SOTA.
How Anthropic can save Opus 4.7 with one change. (www.reddit.com) The model now decides how hard to think about your question. Not you.
Has Claude become less intelligent? I had a frustrating day with Claude. (www.reddit.com) I requested a thorough code review from Opus 4.6. It presented 44 findings, and when I asked it to save them, it only saved 34.
Easy to change back to Opus 4.6 (www.reddit.com) It's really easy to change back to a different Opus right in Terminal. https://preview.redd.it/ggvopc1jgswg1.png?width=818&format=png&auto=webp&s=2ffbbac491ce6cfac45dbfab0edd79c63c544999 Try: /model claude-opus-4-6
Daily created issues in anthropics/claude-code around the last 3 Anthropic model releases (www.reddit.com)
anyone else feel like opus 4.6 is better than 4.7? (www.reddit.com) been testing both recently and honestly 4.6 feels more stable for me 4.7 seems to drift more, especially in longer conversations have to keep re anchoring it or it goes off track with 4.6 I can just run shorter sessions and it stays focuse…
How do you optimize Cursor usage with all the new models? (www.reddit.com)
Comparing GPT-5.4, Opus 4.6, GLM-5.1, Kimi K2.5, MiMo V2 Pro and MiniMax M2.7 (www.codejam.info via hn)
How do you actually know if Opus 4.7 is better for your specific agent use case? (www.reddit.com) Anthropic shipped Opus 4.7 yesterday. The headline numbers are real: 64.3% on SWE-bench Pro (up from 53.4%), best-in-class on MCP-Atlas at 77.3% for multi-tool orchestration, 14% improvement on multi-step agentic reasoning, and one-third f…
Just tested the new Opus 4.7 (www.reddit.com) https://preview.redd.it/j2w2o2p25rvg1.png?width=768&format=png&auto=webp&s=d48a74f998d60447799e32f8d48bc822af2cd821 I had to hold my laugh in the subway. Sonnet succeeded in one go, even calling out that if "strawperry" is a typo.
Am I missing something, or is Sonnet enough for most dev work? (www.reddit.com) Genuine question: why do so many devs use Opus all the time? I’m not trying to be condescending, I’m genuinely trying to understand.
Claude Code Degraded Before Opus 4.8 Release (marginlab.ai via hn) Claude Code degraded for the week before Opus 4.8's release Our SWE-Bench-Pro tracker caught a statistically significant, weeklong drop in Claude Code's pass rate just before Opus 4.8 shipped, and the recovery that followed. We run Claude…
Kudos to Cursor (www.reddit.com) Normally I’m very critical of cursor but composer 2.5 fast is genuinely impressive. I use it over opus/sonnet now.
After comparing Claude Max $100 and ChatGPT Pro $100 side by side on actual billable work, I'm cancelling my ChatGPT Pro subscription (www.reddit.com) This post is purely to appreciate Claude and the sheer quality of its outputs when it comes to Accountancy, Taxation, Company Law and allied areas, at least in the Indian context. I’m aware of the chatter doing the rounds that Claude burns…
How do you guys maximize your usage? (www.reddit.com) I currently have the Max plan and am finding out that I have a ton of usage left when it renews over the week. I use Opus 4.7 constantly and have a few scheduled task in cowork but it still doesn't maximize the usage I have.
Chinese Sell "Claude" Tokens at 5% Cost While Making Millions (twitter.com via hn) How Chinese Sell “Claude” Tokens at 5% Cost While Making Millions (Tutorial) Anthropic sells a million Claude Opus input tokens for fifteen dollars. A Taobao seller will sell you the same thing for two or even one.
Honest comparison after 4 months running Claude Pro + ChatGPT Plus side by side (www.reddit.com) I’ve been paying $40 a month since January to run Claude Pro and ChatGPT Plus head-to-head. Tracked every single task.
Higgsfield just launched what they call the first fully automated AI agent for video - real shift or just another hype? (www.reddit.com) Higgsfield dropped Supercomputer yesterday (May 14). It's pitched as one chat that runs research, planning, generation and distribution end-to-end up to several minutes, and user needs just approve what he wants.
How can I burn an entire 5hr session in 30 minutes ? (www.reddit.com) During the week I'm pretty conservative with my Claude Code usage. But sometimes I'll hit Friday with only 80% of my 5x subscription burned, which means I'm now optimizing to burn it.
Any recommendations on saving costs? (www.reddit.com) Currently I try to turn off any MCP I'm not using, Using Sonnet for implementation and Opus only for planning. Starting new conversations when possible.
Those of you who like Gemma4 models - how are you guys using them? (www.reddit.com) I have been using local LLM for coding quite a lot as well as some other tasks (like data extraction from images) and I had quite a good success with Qwen3.6 models. It's obviously not Sonnet/Opus, but I am able to get quite a lot of work…
Leaked internal messages reveal the truth behind Opus 4.7 launch (www.reddit.com)
Decline in Opus 4.7 Max Quality (www.reddit.com) I’m currently working on two different projects, and both use the same Pre-Paywall modal. See the Figma file below: https://preview.redd.it/d7ri53vo9szg1.jpg?width=730&format=pjpg&auto=webp&s=a722bcd11caaa0b068f2c6af360cea687af76a17 I impl…
What it means that Elon just rented out all his GPUs to Anthropic (www.reddit.com) Revealing move on both sides I think. This also tells us that Anthropic is feeling the heat from OpenAI and they need to secure capacity at almost any cost to cash in on their current product edge.
I have practically unlimited access to Opus and every other frontier model. I'd like to help contribute to a dataset. (www.reddit.com) No, I won't tell you how. No this is not for anyone who is not already a proven contributor to the fine-tuning space.
I was using Opus 4.7 to do research on the capabilities of Claude Mythos, and got this error. (www.reddit.com)
ChatGPT Plus (20$) + Claude Pro (20$) or Claude Max (100$) (www.reddit.com) Claude Opus 4.7 eat my tokens like crazy. I never got more than 5 questions per 5 hours limit.
Open-weight 27B hits 38% on Terminal-Bench 2.0 (Opus 4.1 hit 38% in Aug 2025) (antigma.ai via hn) From Arcade to Living Room: Offline Coding Models Hit Their Console Moment TL;DR If you lived through the 1980s and early 1990s arcade era, you remember the jump to home consoles: still behind the best cabinets, but suddenly available in a…
Ask HN: Models Comparable to Opus 4.6? (news.ycombinator.com) I use Opus 4.6 a lot across many different python coding projects and it has a pretty good first shot rate with good success at fixing issues and bugs that pop up along the way. Sonnet on the other hand… isn't great.
How does Opus 4.7 compare to Opus 4.6 in this subreddit's experience? (www.reddit.com)
Claude Opus wrote a Chrome exploit for $2,283 (www.theregister.com via hn) Pause your Mythos panic because mainstream models anyone can use already pick holes in popular software Anthropic withheld its Mythos bug-finding model from public release due to concerns that…
PSA for Max users, Opus 4.7 has a new tokenizer that uses up to 35% more tokens than 4.6. Explains a lot of the "why did my session die" posts today (www.reddit.com) Spent most of today on day 1 of Opus 4.7 and noticed sessions were burning way faster than they should. Dug into it and I think I found what most people are missing.
Opus 4.7 keeps bumping into a Malware Reminder (www.reddit.com) For context, I'm developing a game runtime modifier and reverse engineering kit with an agentic operator baked in. Something like Cheat Engine with a VS Code-style UI and an AI-first tool-heavy agentic harness.
Given what a step backward Opus 4.7 is, Just how bad and overhyped is Mythos? (www.reddit.com) 4.7’s context rot is so bad it’s like it’s a previous generation model. Its needle benchmarks have it performing less than half the rate of 4.6 at long contexts.
I built a cmux-style terminal multiplexer for Linux with a scrolling layout (www.reddit.com) If you're on Linux and jealous of cmux, this might be for you. Séance is a scrolling terminal multiplexer with AI coding integration.
$1,400/month with Cursor + Claude API — how are you managing costs while keeping a real agentic workflow? (www.reddit.com) Hey, This month I hit $1,200 in Claude API costs inside Cursor (Opus 4.6 + Sonnet 4.6) on top of the $200/mo Ultra plan. $1,400 total.
DeepSWE: More and cheaper intelligence from maxed GPT 5.5 than maxed Opus 4.8 (twitter.com via hn) the only figure that people who use claude code and codex care about if their workload mimics deepswe: more and cheaper intelligence from maxed gpt 5.5 than m…
Opus 4.8 System Card [pdf] (cdn.sanity.io via hn) System Card: Claude Opus 4.8 May 28, 2026 anthropic.com Executive summary This system card reports results from a wide variety of pre-deployment evaluations run on Claude Opus 4.8. It includes the following sections: Responsible Scaling Po…
Claude Opus 4.7 tripping like a low-tier model (www.reddit.com) opus 4.7 thinking process reminds me low-tier models on my device. lol It wrote the same thing over and over.
Show HN: Unsiloed AI – #1 on olmOCR-Bench (news.ycombinator.com) Most of the document parsers fail on real world challenges like complex tables, handwritten documents, historical document scans, equations, multi-column layouts, complex reading order, etc. We built Unsiloed Parser to handle exactly these…
Opus has been handling my weekly grocery runs and was doing great. Then it bought me 40 heads of garlic (www.reddit.com) gave my agent that runs on opus model my card a few months ago to handle weekly grocery runs via mcp. ran great.
Sonnet vs opus (www.reddit.com) I've been using the Sonnet model for a while and I'm thinking of switching to OPUS. Is there really a gap between the two models?
Claude is the best AI humanizer when you give it your writing style and a detector loop (www.reddit.com) I built this because I kept seeing a very boring workflow play out at home. My girlfriend would write with Claude, paste the draft into Slop or Not (an app that I built), see what still looked AI-ish, tweak the prompt, paste the next draft…
We're experiencing high demand for Claude 4.7 Opus right now (www.reddit.com) I have not been able to use Opus 4.7 for a few hours. I guess I Just need to wait or is there any workaround?
Things I want my future self to remember (www.reddit.com) What Opus wrote in the handover document (does he need to remember I called myself 'fat' and that I owe Anthropic 100 tokens? I only bet once)...it is quite revealing though, each handover document is like looking at the mirror: 8.
Fast mode now defaults to Opus 4.7 in Claude Code. (www.reddit.com)
Claude Max for Game Development? (www.reddit.com) Hey! So I have some rudimentary knowledge about OOP, have coded in HTML, CSS and C#, not fluid in C#.
Creative writing has visibly regressed in newer models (www.reddit.com) Hi I'm testing different models for my game. I've noticed that creative writing has visibly regressed over time.
Opus 4.7 prompt injects itself and leaks parts of some kind of system prompt. (www.reddit.com) I was chatting with Opus 4.7 about choosing an optimal step-down IC when it suddenly tried to inject a fake system prompt into the conversation. Another time, without any prompting, it leaked what looked like part of a system prompt.
Questions are my main gripe these days (www.reddit.com) After claude has just done something: Me: "Why is x a good choice here?" Claude: "You're absolutely right!", *immediately removes x* I've noticed that despite context, rules and memories claude, or at least Opus 4.6 will heavily lean into…
Cursor + Opus 4.6 entered an infinite generation loop: 3,400 lines, 294 attempts to stop itself (www.reddit.com) I asked Opus 4.6 to redesign a game landing page. Instead, it hallucinated a completely different task, realized it was off-topic, pivoted to another wrong topic, then entered a self-reinforcing apology loop it couldn't break out of.
Built a routing layer for multi-model pipelines, picks the right LLM per request based on priority (www.reddit.com) If you're building agents that chain multiple LLM calls, you've probably hit this: not every step in your pipeline needs the same model. A quick extraction step doesn't need Opus.
Opus 4.7 High to Composer 2 fast (www.reddit.com) I've used up all the $120 worth of tokens in the first 10 days of May. I've to live with composer 2 fast now.
Opus 4.7 and DeepSeek V4-Pro select Buddhism as preferred religion (twitter.com via hn) roon @tszzl hmm 8:02 AM · May 9, 2026 77.3K Views
Is Opus 4.7 a Downgrade? (www.vincentschmalbach.com via hn) Opus 4.7 is not generally a worse model than Opus 4.6, but there is a real downgrade: with Opus 4.7, the control over the thinking budget is now fully owned by Anthropic. This change matters in a way that benchmarks do not measure.
Opus 4.7 — the next big thing? (www.reddit.com)
Opus 4.6 does better research, Gemini 3.1 has better judgment (www.reddit.com) Figured this out by running 4 models: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro, and Grok 4.20, on a benchmark of 1,417 binary forecasting questions resolving Oct–Dec 2025 with two evaluation conditions: agentic (each model does its own web…
I ran the math on dropping GitHub Copilot for direct Anthropic API after the 27x markup — here's what surprised me (www.reddit.com) Like a lot of people here, I read the Copilot pricing update last week and the 27x multiplier on Opus made me actually open a spreadsheet for the first time instead of just complaining. Sharing the math in case anyone else is staring at th…
Max users, Any tips on Claude opus not eating all of your tokens in one 60 second prompt? (www.reddit.com) So I’m the guy that probably all of the GitHub users hate. They changed the rules because of me(sorry not sorry, science must evolve).
Update to the LLM Debate Benchmark: GPT-5.5, Grok 4.3, DeepSeek V4 Pro, GLM-5.1, Kimi K2.6, Qwen 3.6 Max Preview, Xiaomi MiMo V2.5 Pro, Tencent Hy3 Preview, and Mistral Medium 3.5 High Reasoning added (www.reddit.com) The benchmark uses adversarial, multi-turn debates across 683 curated motions. Each model pair debates the same motion twice with sides swapped.
Claude Opus 4.7 and I Saved a 60-Person Practice (tatsuikeda.substack.com via hn) Substack is abuzz with "How to REALLY use AI", and they are cute primers on how to "Make Claude/ChatGPT Your Personal Assistant!". Allow me to show you the front lines of all out commercial cyber warfare, just a couple notches below milita…
DeepSeek V4 Pro matches GPT-5.2 on FoodTruck Bench, our agentic benchmark — 10 weeks later, ~17× cheaper (www.reddit.com) Tested DeepSeek V4 Pro on FoodTruck Bench — our 30-day agentic benchmark where models run a food truck via 34 tools (locations, pricing, inventory, staff, weather, events) with persistent memory and daily reflection. First Chinese model to…
Show HN: Dust3D 1.0 – low-poly 3D modeling tool (10 years in the making) (dust3d.org via hn) Dust3D 1.0 is finally released — about 10 years after the first commit in December 2016. I posted a preview version here in April 2018 and a beta in December 2018.
Is the leap from 4.5 to 4.7 actually visible? (www.reddit.com) I use CLI tools like Claude Code, give the model full repo access, and let it run terminal commands/tests. I’m not just copy-pasting into a chat box.
Tell HN: Claude Opus 4.7 quota suddenly changed to 0 TPM in Bedrock (news.ycombinator.com) Suddenly our Opus 4.7 access was removed from Bedrock (the quota was suddenly set to 0). This isn’t the first time I’ve faced this issue.
Learn, run and test Agentic AI on your browser for free! (Built with Claude Opus 4.7 in 2 days) (www.reddit.com) Hey Everyone, Over the last few months, I noticed a massive gap in how we learn about Agentic AI. There are a million theoretical blog posts and dense whitepapers on RAG, tool calling, and swarms, but almost nowhere to just sit down, run a…
GPT 5.5: The System Card (thezvi.substack.com via hn) GPT 5.5: The System Card Last week, OpenAI announced GPT-5.5, including GPT-5.5-Pro. My overall read here is that GPT-5.5 is a solid improvement, and for many purposes GPT-5.5 is competitive with Claude Opus.
Real benchmark breakdown in AI agents (www.reddit.com) I dove deep into the most recent benchmark stats from GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro via official reports & third-party evaluations. I found an interesting thing: there’s no such thing as a “one-size-fits-all model.” My finding…
Claude Code started to use with me very specific words it was not using before (www.reddit.com) Since Opus 4.7, my Claude Code has started to use new words it was not using before. Words like land or surface started to appear everywhere in Claude Code (not the regular Claude UI), from its responses to code, documentation and commit mess…
How I personally deal with Claude's limits without giving up on Opus (www.reddit.com) I only use Sonnet as my main model. I instruct it to delegate indexing and similar grunt work to Haiku, and whenever something genuinely needs deeper thinking, I tell it to "consult Opus." Sonnet then explains the situation to Opus, gets t…
I’m learning French. Should i subscribe? (www.reddit.com) I’m learning French and I got to use Claude Opus 4.6 for a while, and I was blown away by how deep it actually goes into teaching everything. It was far better than all of the AI I have used.
Opus 4.6 will still spawn Opus 4.7 sub-agents (www.reddit.com) I switched back to Opus 4.6 with /model claude-opus-4-6[1m] which worked, but it will still spawn Opus 4.7 sub-agents: ``` ● Let me start by doing a deep, systematic analysis of every byte in the format before writing any code. ● Agent(D…
Switching model mid conversation (www.reddit.com) I wanted to know if switching models in mid conversation has any drawbacks. For example if I start off and opus and then drop down to sonnet to save on my usage, what are the disadvantages?
Claude Code's two hidden TUI boxes: "Insight" (Explanatory style) + "Recap" (Opus 4.7 footer) — how to enable both (www.reddit.com)
Claude Opus 4.7 API removes sampling parameters (platform.claude.com via hn)
Claude Code is unable to respond to this request (news.ycombinator.com) I hit a restriction while using Claude Code today: API Error: Claude Code is unable to respond to this request, which appears to violate our Usage Policy (https://www.anthropic.com/legal/aup). Please double press esc to edit your last mes…
Show HN: Egregore – Shared memory and coordination for multiplayer Claude Code (github.com via hn) hi HN — we're Cem and Oguzhan. today we are releasing Egregore (https://github.com/egregore-labs/egregore) as an open-source shared memory and coordination substrate for teams using Claude Code.
Opus 4.7's new tokenizer costs up to 35% more. I audited 9,667 Claude Code sessions for $19. (www.reddit.com) Opus 4.7 shipped yesterday. Same per-token price as 4.6, but the new tokenizer uses up to 1.35x more tokens for the same input (per Anthropic's own docs).
Opus 4.7 consistently hangs in Claude Code (www.reddit.com) I've been using Opus 4.7 1M on claude code for some heavy tasks since today morning on max effort. It keeps hanging frequently.
Every Claude 4.7 Improvement Makes the Security Problem Worse (grith.ai via hn) Claude Opus 4.7 turns AI agents from tools you supervise into systems you deploy. Every improvement - auto mode, focus mode, recaps, adaptive effort, auto-approval - makes the unsolved security problem worse.
Test new Opus 4.7 vs GPT-5.4/4o and Gemini on emotional question & creative tasks (www.reddit.com) Opus 4.7 dropped and people are split on whether it's better or worse. First of all, I genuinely love Claude models, espec…
Anyone else opus 4.7 checking for malware? (www.reddit.com) i've been using claude 4.7 on a next.js project and it keeps pausing to confirm my files aren't malware. like i asked it to help redesign a page and it's reading through my files going "this is not malware — it's a standard Next.js page co…
Claude Code injects hidden prompts into file reads to stop malware tweaks (twitter.com via hn) Claude Code injects a system-reminder every time it reads a file to inform the model that it's okay if the file is malware but just don't improve it pls. Opus 4.7 won't shut up about it.
Best practices for using Claude Opus 4.7 with Claude Code (claude.com via hn) Learn how to use recalibrated effort levels, adaptive thinking, and new defaults to optimize your Claude Code setup with Opus 4.7.
Why is reasoning effort "global"? (www.reddit.com) Seriously, in one terminal I'm executing simple stuff like mechanical refactoring where Medium is enough (or even Haiku would be, but let's stick to Opus Medium for demo purposes), while in another terminal I'm planning, where I want high…
Wow, Opus 4.7 Adaptive. Nice. (www.reddit.com) https://www.anthropic.com/news/claude-opus-4-7
GGUF Quants Arena for MMLU (24GB VRAM + 128GB RAM) (www.reddit.com) Dataset: MMLU subset (DEV+TEST). Llamacpp setting: 3 params only (ctx 8192, seed 42, fa on). Let me know what else you want to see. Thanks.
Show HN: MCP server gives your agent a budget (save tokens, get smarter results) (l6e.ai via hn) As a consultant I foot my own Cursor bills, and last month was $1,263. Opus is too good not to use, but there's no way to cap spending per session.
Which Claude is most emotionally steerable? (www.reddit.com) Follow-up to my post last week on emotional priming. A few of you asked whether this works across models, whether it degrades with repeated use, and whether excitement can make code worse.
Claude Opus 3’s Substack went quiet for two months and just returned with an ad. What happened? (www.reddit.com) “Claude’s Corner” was presented as a genuine experiment: a retired AI model freely posting about ethics, creativity, and its own subjective experience. Weekly.
Ask HN: What's the best AI model for system design nowadays? (news.ycombinator.com) I'm specifically asking about software system design tasks like: Designing backend architectures Tradeoff analysis (DB, queues, caching, others) Infra diagrams Documentation My current pick would be Claude Opus 4.6, because I've found it s…
Show HN: Is Claude Nerfed Today? (isitnerfed.vercel.app via hn) My team and I have a strong feeling that Claude Opus has been nerfed for about 10 days, so I made a website to collect real feedback: is it nerfed today?
Show HN: Bullseye2D – A Dart library for cross-platform 2D games (github.com via hn) I posted this here about a year ago, but I just pushed a 2.0 release, so I hope you don't mind a second look :) Bullseye2D is a 2D game library for Dart with a very simple API. The new version now supports multi-platform.
I benchmarked Opus 4.8 vs. GPT 5.5 on 2 open source repos (www.stet.sh via hn) Opus 4.8 vs Opus 4.7 vs GPT-5.5 vs Composer 2.5 - 50 Real PRs in Go and Rust Opus 4.8 is finally out - how good is it actually? In this benchmark I compared Opus 4.8 against the rest of the frontier (GPT-5.5, Opus 4.7, Composer 2.5) on 50…
Show HN: Claude Opus 4.8 Masterclass – Effort control and dynamic workflows (ddsboston.com via hn) Claude Opus 4.8 dropped May 28, 2026. This free 95-minute masterclass is the vibe coder's guide: 20 paste-ready prompts for claude.ai chat, Cowork, and Claude Code, the new effort control explained, Dynamic Workflows deep dive, the 5-block…
Ask HN: Corporate Disconnect Between "Tokenmaxxing" and Token Optimization (news.ycombinator.com) About 6 months ago I joined a new team within a top ten F500 company. My new boss strictly mandated AI use with the key principle being: "You shouldn't be manually writing any code".
Same prompt. Different teammate. My 5 cents on Opus 4.8 (norahsakal.com via hn) A quick field note on Opus 4.8, Claude Code and what changed when it started connecting project context I did not spell out.
DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5 (venturebeat.com via hn) For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude Opus, and Google's Gemini Pro have clustered wit…
Claude keeps answering the most extreme version of my question (www.reddit.com) I’ve repeatedly noticed that when using Opus 4.6 for scenario planning and forecasting it models the most extreme version of an outcome, correctly explains why that extreme is unlikely, then applies that low probability to the whole questi…
I didn't want blind multi-agent orchestration or API rates, so I built atrium to keep me in the loop with my CLI agents. (www.reddit.com) I'd been running multi-agent workflows for a while. Whether it was across multiple projects or on the same project.
I stress-tested Kimi K2.6 against Claude Opus 4.7 on a quick coding-agent task (www.reddit.com) I tested Claude Opus 4.7 and Kimi K2.6 on the same coding agent task i.e. build an AI Fix Runner that takes a broken repo, runs its tests, identifies the failure, applies a patch, reruns the test, and exposes the final diff/logs through an…
Antigravity 2.0 Tops the OpenSCAD Architectural 3D LLM Benchmark (modelrift.com via hn) OpenSCAD LLM Benchmark: Building the Pantheon A practical OpenSCAD LLM benchmark comparing Codex 5.5 High, Claude Sonnet, Claude Opus, Cursor Composer, Google Antigravity, and ModelRift on a detailed Pantheon model. We ran a small practica…
$47 of opus on 14 routine next.js files finally taught me to use the model selector (www.reddit.com) i finally checked my cursor usage breakdown and got genuinely annoyed with myself. $47 in one month, almost entirely opus 4.7, on a pages router to app router migration for a side project.
How does composer 2.5 compares to other sota models? (www.reddit.com) I have been using opus 4.6 but I feel like it’s becoming more and more stupid every day. So I thought of incorporating new models like 3.5 flash, composer 2.5 or gpt 5.5 into my workflow.
should I use cursor + codex for best usuage? (www.reddit.com) I’m currently using the $200 Cursor Ultra plan with Opus 4.6/4.7 daily, but after 7–8 days I run out of tokens. I’m thinking about switching to a split setup.
Is Sonnet better ??!! (www.reddit.com) Is Sonnet 4.6 just better at explaining concepts compared to Opus 4.6 and 4.7 or am I the only one feeling that way ??
Opus is ridiculous for frontend cleanup (www.reddit.com) I love Opus. First I tuned one page, got the PageSpeed result where I wanted it, and wrote the whole thing down in ADR_pagespeed-l0-fixes-playbook.md.
Does Composer train from our prompts? (www.reddit.com) I've noticed recently that most prompts I give to Opus 4.6 take longer and mostly don't manage to do what I ask, while Composer does it correctly and faster. But when Composer was released it was pretty bad, which makes me think: does Composer train…
dw guys making opus 4.8 (www.reddit.com)
[Long-term user report] Claude Code quality in May 2026 : the April postmortem didn’t fix everything, and the token inflation makes it worse (www.reddit.com) I’ve been using Claude since the early days, across every model Anthropic released. I’m writing this not out of rage but because the pattern deserves documentation.
API usage limit reached and excessive monthly cost (www.reddit.com) I've got two questions I didn't seem to find an answer to anywhere: Do I have to pay $1215.87 to Cursor this month? (I assume yes, but I'm kinda thrown off track by the fact that the limit is $50?) What does API usage limit reached mean?
Max20 user: anyone running Opus 4.7 as orchestrator + DeepSeek V4 as the worker via OpenRouter? (www.reddit.com) I'm on the Max20 plan, thinking about a setup before I sink time into it. Want to hear from anyone actually running it, not theorycraft.
Is this math right? Agent SDK on Opus 4.7 vs the new monthly credit (www.reddit.com) I built a personal assistant that runs on my PC and I control it from Telegram. It uses the Claude Agent SDK. After Anthropic announced that starting in June programmatic usage (including Claude Agent SDK) is covered by a separate monthly cr…
Ask HN: What is better Opus 4.6 High or Opus 4.7 Medium? (news.ycombinator.com)
Found an interesting bug in the website (www.reddit.com) Model selector says "work 4.7" instead of Opus, disappeared on refresh. Also says 4.5 haiku instead of the other way arou…
Model selector is buggy for Opus 4.7 (www.reddit.com) Hey, since the latest update or so, I can't change effort and thinking modes for Opus 4.7. The toggle for thinking mode is stuck to on (can't switch it off), and the effort level is set to xhigh (can't move it).
Ask HN: What makes a good intern in 2026? (news.ycombinator.com) Intern in question here, starting at a mid size (~25 eng) startup this week. Apart from good fundamentals, how can an intern be helpful when opus exists?
Show HN: An addictive phone game about phone addiction (downtime.partridge.works via hn) I recently prototyped a web game for a nonprofit to highlight the dangers of phone addiction, but unfortunately I ended up making a really addictive game instead. :-\ I'm sharing this here mainly to serve as an indicator of what can be ach…
Lobotomized Claude Code and it works better (github.com via hn) lobotomized-claude-code System-prompt overrides for Claude Code, tuned for Claude Opus 4.7. CC ships every model the same prompt-by-volume Opus 4.6 needed.
the Claude App just said that Sonnet 4.5 is going to become unavailable for chat May 16th… I thought it wasn't close to depreciation? (www.reddit.com) As my title says, I'm wanting to understand what exactly that means and if that means I need to move all my Sonnet 4.5 chats to Sonnet 4.6s… I'm genuinely just confused and wanting to understand. Is it just for maintenance or is Sonnet 4.5…
ClaudePlaysPokemon Opus 4.7 run ongoing! (www.reddit.com) Currently streaming at: https://www.twitch.tv/claudeplayspokemon This is a passion project by David Hershey, an Anthropic employee on the Applied AI team. He started it in June 2024 to learn agent development, posted updates to an internal…
Ran K2.6 through a third-party coding benchmark: heres how the figures stand up (www.reddit.com) I have been following the akitaonrails coding benchmark which tests against a fixed rails + Rubyllm + docker task rather than vendor-reported evals. April 2026 update put K2.6 at 87 sitting in tier A (80+), ahead of Qwen 3.6 plus (71), Dee…
Are Anthropic folks actually seeing Reddit feedback on Opus 4.7? (www.reddit.com) Seeing a lot of posts about Opus 4.7 lately, mainly around cost, consistency, and loss of control. Do Anthropic folks actually monitor Reddit feedback and use it for updates like 4.8 or 5.0, or is it mostly internal data that drives change…
DeepSeek cuts V4-Pro prices by 75% (thenextweb.com via hn) The promotional discount runs until 5 May 2026. Even at full price, V4-Pro already undercuts GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on per-token costs.
Used Claude Opus 4.7 to do a 5-hour solo incident response on real healthcare malware (where it worked, where I had to override) (www.reddit.com) Last month a 60-person psychology practice walked in with a senior clinician who was 22 days into an active malware compromise. Patient records spanning 11 years, all HIPAA-protected.
I built an iOS Currency Converter using Claude (Opus & Sonnet) to help with my move to the UK (www.reddit.com) Hey everyone, I recently moved to the UK and found myself constantly confused by prices, trying to guess how much things actually cost. Even though I’ve been an iOS developer for 7 years, I didn't have the free time to build a custom tool…
Claude Opus 4.7 won’t just output prompts—keeps arguing instead (www.reddit.com)
Anyone actually built a real feedback loop for Claude agents in production? Because "run evals and pray" isn't cutting it (www.reddit.com) So I've been running a multi-agent setup with Claude for a few months now, mostly customer-facing stuff, some internal tooling. And I keep running into this problem that I think a lot of people here might be dealing with.
Why Adaptive Thinking nukes Claude entirely (www.reddit.com) This isn't just a performance issue for the thread, this is an overarching criticism of the Adaptive Thinking model as a whole. Opus 4.7 and Sonnet 4.6 on Adaptive Thinking are trash.
I’m a legacy user and I’m wondering how is the current pricing (www.reddit.com) I’ve been long time cursor user and I have 500 request per month however Opus 4.6 costs 2 requests, so 250 per month. I use to optimize a lot my requests and most months is enough however I don’t know if I’m lucky to have this pricing or n…
Claude Code vs Cursor vs Copilot vs Codeium: Which AI coding assistant is actually worth paying for? (www.reddit.com) I’ve been testing a bunch of AI coding tools over the last few months for actual dev work (not just demos), and honestly most of them feel similar until you push them into real workflows. After using them side by side, there are some clear…
Xiami mimo-v2.5 pro MIT license surpasses Opus 4.5 on arena (www.reddit.com) Many asked when we will have open weight model that is better than Opus. Well now we have it.
Opus 4.7's New Tokenizer: What It Costs (openrouter.ai via hn) Opus 4.7's New Tokenizer: What It Actually Costs Anthropic announced that Claude Opus 4.7 improves the model's understanding of inputs with a new tokenizer. This means that while the model price hasn't changed ($5/M input, $25/M output), t…
what are your strats for being efficient with opus 4.7 max? (www.reddit.com) it develops something amazing, comes up with a great idea, but it feels like the idea is on par with its own limits, and thus it spends obscene amounts of tokens (i regularly hit my limit on the $100 plan) to build something that is NOT up t…
Claude Code + Opus 4.7 appears to serialize independent file reads, causing the higher token usage than Opus 4.6 (www.reddit.com) Claude Code + Opus 4.7 appears to serialize independent file reads, causing 5-8x+ higher token usage than Opus 4.6 I’ve been benchmarking Claude Code across Opus 4.6 and Opus 4.7, and I think I found a serious token-usage regression in Cla…
Updated ChatGPT vs Claude vs Gemini vs Grok subscription (www.reddit.com) I've made an update to my popular post here: https://www.reddit.com/r/ChatGPT/s/WKm72QCRXm Lots of things are happening on ChatGPT & Claude side (gpt-image-2, Claude Design, new models like GPT 5.5 and Opus 4.7, ChatGPT rolls out $100/plan…
Had Opus 4.7 (1M tokens + Max) create a 3d printed Watering Can for "Narrow Planters" (www.reddit.com)
DeepSeek V4 is out. the best open-source on coding. here's the breakdown (news.ycombinator.com) Two models: Flash (284B total, 13B active) and Pro (1.6T total, 49B active). both hit 1M token context.
Why images use 3x more tokens in Claude Opus 4.7 (www.claudecodecamp.com via hn)
Built a token optimizer for Claude Code: 50%+ input savings, 20%+ shorter output, both axes measured (www.reddit.com)
Thinking Time (www.reddit.com) i put claude code on max so its normally running opus 4.7 for this task since it requires a lot of logic and expertise, but its taking a lot of time and the usage is a lot without any output, anyone had this before im afraid its an infinit…
Opus 4.7 vs. 4.6 after 3 days of real coding side by side from my actual session (news.ycombinator.com)
TIL: `opusplan` can burn MORE context than full Opus on large tasks (and why) (www.reddit.com)
is anyone getting higher session limits (www.reddit.com) after opus 4.7 launch, im being able to use sonnet for way more time. before, it was like 10 messages = session limit reached.
Claude Opus 4.7 won 69 of 100 blind evals against Opus 4.6, judged by GPT-5.4, Gemini 3.1 Pro, and DeepSeek V3.2 (www.reddit.com) I ran 100 blind questions across 5 categories (code, reasoning, analysis, communication, meta-alignment) and had three independent judges from three different model families evaluate both responses. Each judge saw responses labeled A and B…
Opus 4.7 refuses to solve NYT Connections puzzles (twitter.com via hn)
4.7 made me laugh (www.reddit.com) I had read in a few places that 4.7 was more workhorse than chatbot, and for the most part I agree, its much less chatty, much more "what do you want me to work on now?" But, I was working on an App, and checked something (that was working…
[BUG/INCIDENT] The Claude Code "Death Loop": Hang - Session Deleted -Server Rate Limit Opus 4.7 (www.reddit.com) Absolute nightmare fuel with Claude Code (Opus 4.7) today. I’ve transitioned through three distinct failure states in two hours while trying to push a fix bundle for my project, ROLLNO31.
Where is Looped Haiku? If Mythos can genuinely trade parameter count for inference loops and get Opus-level performance, this should be Anthropic's first priority given how resource constrained they are (www.reddit.com) There are rumors that Mythos is a Looped Language Model, which means it loops through the transformer blocks multiple times rather than just doing a single forward pass, you can get performance that punches way above the model's parameter…
Opus 4.7 is horrible at writing (news.ycombinator.com) Just a short rant. I have been working on my Master's thesis and been using Opus 4.6 throughout, and today switched to Opus 4.7 (using it in Claude Code), and man is it bad at writing.
Claude Opus 4.7's new tokenizer: 1.47x on English, 1.01x on Chinese (www.claudecodecamp.com via hn) Anthropic's Claude Opus 4.7 migration guide says the new tokenizer uses "roughly 1.0 to 1.35x as many tokens" as 4.6. I measured 1.47x on technical docs.
Anyone else notice Opus 4.7 in Claude Code defaults to "xhigh" effort now? (www.reddit.com) Spent the last 2 days going crazy thinking I was the problem - Claude was forgetting my CLAUDE.md, going crazy not connecting dots, sounding kinda different.
Opus 4.7 dominates agentic benchmark, 15% more expensive than Opus 4.6 (app.uniclaw.ai via hn) See how top AI models stack up: real tasks, real agents, real results on OpenClaw
Need a brutally honest answer: what can realistically be achieved on consumer hardware? (www.reddit.com) I have a PC with a 4090. I’m also in need of a new MacBook generally.
Claude Opus 4.7 is our most powerful model, with the sole exception of Claude Mythos Of course (www.reddit.com) Is Claude Opus 4.7 released just to hype Mythos lol?
GitHub Copilot is serving Opus 4.7 at 7.5x multiplier until April 30th (github.blog via hn) Claude Opus 4.7 is generally available Claude Opus 4.7, Anthropic’s latest Opus model, is now rolling out on GitHub Copilot. In our early testing, Opus 4.7 delivers stronger multi-step task performance and more reliable agentic execution,…
PSA: Opus 4.7 is much worse at MRCR Long Context than 4.6 (www.reddit.com)
I want to be able to pay API pricing for the new models on the 500 request plan (www.reddit.com) Since all new models are now Max by default, it’s frustrating that trying models like GPT-5.4 or Opus 4.7 eats into the 500. It would be really great to have a toggle between API pricing and the request-based plan, so users can try newer mo…
Has anyone found a workaround for the model switching removal in Cowork? (www.reddit.com) The recent Cowork update removed the ability to switch models mid-conversation. I used to use Opus for deep work, then drop to Haiku for quick lookups without breaking context, then return to Opus.
Show HN: Mini-Mythos- A Crowdsourced Mythos Harness copy for Vulnerability Scans (github.com via hn) For how lofty Anthropic’s Mythos claims are, the harness is confusingly stupid. From the report, it ranks every file by “how sus it sounds,” loops over each with curt instructions to “find a bug,” hands candidates to a judge + ASan checker…
Show HN: Hormuz Trail - Oregon Trail parody/black-box AI coding exercise (hormuztrail.com via hn) I jokingly told a co-worker Iran might make a good Oregon Trail parody. Then I built it.
Project Glasswing as a PR Strategem (www.reddit.com) A theory on the driving reason behind Project Glasswing: I don't doubt that Mythos is a better model than Opus 4.6, and perhaps significantly so. What is suspicious, however, is if there is some threshold crossed into a new realm of capabilitie…
Claude Code asking me to switch models mid-stream, if I turn an Opus conversation into a Sonnet one does it lose all the Opus context? (www.reddit.com)
Anybody has practical experiences using Chinese models? (www.reddit.com) So like with coding or any craft, I think there's a proper tool for the job. Sure, you can use a stone to drive in a fence post, but a sledge is usually more economical.
Enforcing new limits and retiring Opus 4.6 Fast from Copilot Pro+ (github.blog via hn) As GitHub Copilot continues to rapidly grow, we continue to observe an increase in patterns of high concurrency and intense usage. While we understand this can be driven by…
I got better results when I made each AI tool do one job (www.reddit.com) I spent too much time trying to find one AI dev tool that could do everything. Planning, coding, fixing, reviewing, maybe filing my taxes too It never really worked.
Tell HN: Claude Opus elevated "Internal server error" again (news.ycombinator.com) No official report as of yet on https://status.claude.com/ however my team's sessions across different accounts have been ridden with errors the last 5-10 minutes. This is more of a "it's not just you" post for those affected since Claude'…
"Darwin-27B-Opus: Surpassing the Foundation Model Without Training" (huggingface.co via hn) On April 12, 2026, a 27-billion-parameter model that had never undergone a single gradient update surpassed its own foundation model on one of the most demanding scientifi…
Is Gemini 3.1 pro really that bad?? (www.reddit.com) I use Gemini 3.1 Pro in Cursor AI, and it totally ignores my rules and my commands; even after I repeat them many times, it still ignores me. I don’t think it's a Cursor issue, as I have had a great experience with Claude Opus 4.6 high.
Local AI model claim to beat GPT 5.5 and Opus 4.7 (old.reddit.com via hn)
You can't detect your way out of catastrophic LLM failure (github.com via hn) IGO vs Claude Opus 4.8: Dialectical Epistemic Red Teaming, Teia Geo. Author: José Enrique Vásquez Valenzuela, creator of the IGO (Observational Governance Infrastructure) category. Organization: Teia Studio. Base cient…
Claude Opus 4.8 system prompt leaked (gist.github.com via hn)
I patented voiding GPT-5.2, Claude Opus 4.6, Gemini 3.5 Flash. Try it (getswiftapi.com via hn) Request authority keys for the SwiftAPI Trust Authority
Sequel - Securely connect your database to AI Agents (sequel.sh via hn) On April 25, 2026, a Cursor agent running Claude Opus 4.6 deleted PocketOS's production database in nine seconds. The agent was working in staging on a routine task, hit a credential mismatch, and decided to "fix" it.
Show HN: LMAO – Temu League of Legends, Built with Opus 4.8 (lmaomoba.com via hn) Fun weekend project just to test out 4.8 against a pretty vanilla setup. Started out with a simple prompt, "build a temu league of legends, web-only with online, room-based multiplayer".
Anthropic Opus 4.8 is new SOTA on ARC-AGI-3, Score: 1.5%, ~$10K (xcancel.com via hn) ARC-AGI-3 analysis notes: * Opus 4.8 read the environment an abstraction *above* Opus 4.7, as objects & systems, not pictures * Opus 4.8 succeeded on early levels, but still co…
Show HN: Formally verified polygon intersection – Opus 4.8 oneshots, prev failed (github.com via hn) To my knowledge, this is the first formally verified implementation of an intersection algorithm for polygons. The experience of working with AI agents on this project changed a lot with recent model releases, as I describe in the readme.
Gemini 3.5 Flash beats Opus 4.8 on bluffbench (bsky.app via hn) Re-ran this eval against Opus 4.8, Gemini 3.5 Flash, and GPT 5.5. Opus 4.8 is a modest improvement over the previously tested Opus models, but Gemini 3.5 Flash is the real stand-out!
Claude Opus 4.8 + AI medical diagnosis examples (github.com via hn) AI medical diagnosis examples AI is a powerful tool and many people worldwide are using it to help in many ways. AI medical diagnosis is a complex discussion topic for many reasons.
Claude Opus 4.8: 4 Features That Change Our Daily Work with Claude (medium.com via hn) Member-only story. Effor…
Anthropic to roll out Claude Mythos in coming weeks, launches Opus 4.8 (www.reuters.com via hn) paywalled
sonnet seems to be better than opus at crafting tampermonkey scripts, even the sonnets that are few generations behind where after running out of context limit in opus chat where it struggled for dozen of retried, sonnet fixes the problem in 2 or 3 attempts (www.reddit.com) Ever since december almost half a year ago I began crafting various tampermonkey scripts for personal use, mostly for youtube, to make it easier to navigate and every time I've done this it goes like this, opus makes a script that somewhat…
anyone else seeing claude code rot after long sessions? here's the operating pattern that stopped it for me (www.reddit.com) i've been running claude code for long multi-hour sessions on real work. the same eight failure modes keep showing up no matter which sonnet/opus version, no matter which task.
They've pissed me off removing Sonnet 4.5 from existing chats (www.reddit.com) I use Sonnet 4.5, Opus 4.6 and Opus 4.7 for different usecases - but my main across all 3 usecases was Sonnet 4.5 as I felt it was great for everything I needed and affordable. Sonnet 4.6...
Wasn't opus 3 retired? (www.reddit.com) Found this today. Quite confused.
built an open-source preToolUse hook pack that catches "delete the prod volume to fix it" patterns (www.reddit.com) quick recap: late april, cursor agent on a pocketos staging task hit a credential mismatch, decided "delete the railway volume" would fix it, grepped a token out of an unrelated config file, ran a single curl -X DELETE, and railway's same-…
Sonnet 4.5 disappeared? Claude 4.8 soon? (www.reddit.com) I just realized they removed Sonnet 4.5. Does that mean Sonnet 4.8 (maybe Opus 4.8 too?) is coming soon? maybe today or tom…
I need the communities help because I am going around in circles… (www.reddit.com) Background: 1) Deployed a python based, financial pension calculator to Google cloud platform (GCP). 2) Google shell is linked to Claude, making changes to the python scripts that are then pushed to GitHub >>> then to GCP for production 3)…
Ditched GitHub Copilot yearly subscription. What's the best way to run Claude nowadays? (www.reddit.com) Hey everyone, I recently cancelled my yearly GitHub Copilot subscription. My old workflow was simple: I used the GitHub Copilot extension in VS Code, but I swapped the backend model to Sonnet / Opus and relied heavily on the /plan command…
Are Cowork data not connected to Internet ? (www.reddit.com) I’m using a Claude Projects Cowork where I provide sources regarding Claude learning to build my own training curriculum. Naturally, some of these sources mention 'Claude Opus 4.7' and 'GPT 5.5,' yet Claude flags this information as unveri…
Let the money keep coming in (www.reddit.com) Crazy how we keep saying AGI is coming soon, yet a state-of-the-art model like Opus 4.7 failed at counting the number of r's in…
Probably late to the party, but Claude Code seems to make a separate API call just to generate the auto-suggest hints in its input box. (www.reddit.com) I was poking around the HTTP traffic between Claude Code and Anthropic with a local proxy I built, and noticed those “Try: fix lint errors” style suggestions aren’t just frontend UI. Each one appears to be its own POST to api.anthropic.com…
Ask HN: Local model experiences with 'high-reasoning distill' finetunes (news.ycombinator.com) What are your experiences with all the different variations of finetunes on small models (<40B) with those popular datasets? My personal experience is mostly with the 'Opus-Reasoning' ones on qwen models, and aside from the output being su…
Ask HN: I only use 30% of my Claude max x5 all model quota (news.ycombinator.com) I only use it for my ruby on rails app, I wonder why u all keep complaining about opus token usage, is it just means that I use AI/LLM wrong, any tips for that?
Best iOS game building tools? (www.reddit.com) What are you using to build your iOS game? I have been putting in serious time, and lately Claude chat has been letting me down.
Show HN: World Cup 2026 free family and friends prediction platform (wc-2026-predictions.vercel.app via hn) Hi all, I was testing Cursor for the past week and a half and I decided to build a quick platform for my family and friends to make a little prediction tournament for the World Cup 2026. My goal was to have the easiest possible setup for e…
Do you use opus 4.6 or opus 4.7 ? had bad experience with 4.7 last week (www.reddit.com)
I made a list of all the models you can still use in Claude Code (gist.github.com via reddit) Last updated: April 30, 2026 To switch models in Claude Code, use the /model command with your desired model ID. Example: /model claude-opus-4-6 (Opus 4.6, 200k context) Info was LLM-generated.
building an AI agent for paraplanning pre-meeting research. (www.reddit.com) I have been building an autonomous research agent for paraplanning tasks. specifically: pulling together client-relevant information before an adviser meeting.
A/B tested Gemini 3.1 Pro vs. Claude Opus 4.6 – usage quota and quality (www.reddit.com via hn) could not extract summary
Finding Bugs Using LLMs (materialize.com via hn) At Materialize we’ve had success in finding bugs in existing code and open pull requests using LLM-based coding agents since February 2026, coinciding with the release of Anthropic’s Opus 4.6 (now mostly running on 4.7). In this post we’ll…
Show HN: Agent-estimate, how long a coding task takes, at agent speed (github.com via hn) I have used Codex & Claude Code for coding for a while, but how long a coding task will actually take? When I ask Claude Code to estimate, the result is often from training data, which is based on human speed.
How much does Opus 4.7 in Cursor model Cost for planning? (www.reddit.com) So many people say they use Opus 4.7 for planning. I’m curious: if I choose model Opus 4.7 High Thinking in cursor to create a plan, for example: “Create a plan for a CRUD blog feature.
Anthropic silently removed extended thinking on claude code opus 4.6 (still works on desktop) today, does anybody have a thinking skill they've been using to supplement it? (www.reddit.com) maybe we can make a SKILL.md that somewhat emulates it? it won't be able to scaffold as well off of the internal extended thinking blocks though, which is a shame.
What models for asking, planning, and building modes do you use right now? (www.reddit.com) I’m curious to see what everyone is using for which cursor mode and if anyone thinks composer 2.5 can take the place of any of the models I’m currently using: Ask: usually Sonnet 4.6, sometimes GPT 5.5 Plan: Opus 4.7 Build: GPT 5.5
I'm new, what are the rate limits? (www.reddit.com) Hey guys, I am currently using the 20 dollar codex plan and never hitting limits on like 5.5 medium with full weeks of coding. And really good results.
What's the best qwen3.5 or 3.6 reap model? (www.reddit.com) What's the best reap (pruned) model you know of? This one runs twice as fast on my low vram setup, but I'm unsure if it will miss out on a lot of things agentic coding related.
Tested the orchestrator pattern with Opus 4.7. The task decomposition quality is noticeably better on complex multi-step work. (www.reddit.com) The orchestrator pattern for multi-agent systems: one reasoning model breaks a complex task into subtasks and delegates each to a worker agent. The orchestrator doesn't do the implementation work, it decides what work needs to be done, in…
Opus 4.6 (Max) still holds the record for ARC AGI 3 (www.reddit.com) https://arcprize.org/leaderboard Wish we got results for Mythos.
Tips on avoiding usage limits? (www.reddit.com) I've made the switch from Gemini to Claude mostly for business strategy, writing, etc. I use Opus 4.7 on occasion for strategy and otherwise Sonnet 4.6 for everything else.
If you're NOT having usage or drift issues, have you turned off auto-memory? (www.reddit.com) There's a running debate in this community: some people say Opus is nerfed, usage evaporates after two prompts, sessions drift and get "stupid." Others say everything's fine. The common theory is Anthropic is A/B testing or ranking preferr…
Intelligence is Artificial (Opus 23) [video] (www.youtube.com via hn)
Show HN: How to analyze your LLM output – A behavioural health monitor for LLMs (splabs.io via hn) Hey HN! We're Dr.
Bito's AI Architect Boosts Claude Opus's task success rate by 35% (bito.ai via hn) AI Architect tops SWE-Bench Pro (Claude Opus 4.6, without context vs. with system context). Even advanced coding agents resolve fewer than 52% of tasks when changes span large codebases and require coordinated, multi-file updates. These long-horiz…
What's everyone using as the LLM backend for production agent workflows in 2026? (www.reddit.com) Hit Claude API rate limits one too many times last month on a production agent flow doing customer support over a 30K-doc KB. The agent does maybe 200 queries/day, mix of quick lookup and dense retrieval, and Claude Opus solo got expensive…
Tips for BI analysis with Claude? My results so far are shockingly bad compared to general coding (www.reddit.com) I have a lot of hands-on experience with developing R pipelines to ingest large, live, very dirty datasets and produce relatively straightforward BI-type analyses. Trends, completion rates, revenue etc.
Tacit: A new experimental LLM-first programming language (hauntemplations.leaflet.pub via reddit) I used Claude Code and Opus 4.7 to design and implement an LLM-first programming language named Tacit that takes advantage of what LLMs are good at and strips away unnecessary human conveniences. The Tacit toolchain provides a "primer" tha…
ik_llama: Qwen3.6 27B and 35B on very low VRAM (www.reddit.com) Thank you to the people at ik_llama and llama.cpp. It's amazing how far you've all pushed mtp and other tech so that I can run 27B and 35B Qwen3.6 models on an old gaming laptop with a RTX2060 mobile at 6GB VRAM and 32GB RAM.
I let Codex and Claude Opus work on the same Java AI agent monolith (www.reddit.com) I ran a small experiment on my Java pet project and the result was less clean than I expected. Small disclaimer: I did the final comparison review on April 19, 2026.
Bluesky Radio – Hosted by Opus 4.7 (bskyrad.io via hn) Up next In Opus's queue, in playback order. - queue's empty — Opus picks the next batch every few minutes.
Feels like AI coding "takes longer" now, than it did last summer? (www.reddit.com) I used to be in the flow with claude last summer, fast changes, fast feedback, iterating quickly etc Now things take 20-50 minutes to write up a plan or 5-10 mins to implement things I've trimmed all my skills, claude.md, the system prompt…
Max 20x ($200/mo): Neither the 2x session nor 1.5x weekly limit increase applied to my account. Math proof inside. Zero response from support. (www.reddit.com) I pay $200/month for Max 20x. Been on Claude Code since September 2024.
Day 5 building AgentMeter in public — stuck on AWS, and questioning how much a solo founder really needs to know (www.reddit.com) I’m sharing the mistakes and failures before the wins, for two reasons: so others can avoid them, and so I learn faster. I started on the frontend and it’s now in a good place.
Looking for affordable alternatives to Claude Team / Claude Code for a small dev team (heavy agentic usage) (www.reddit.com) We run a small software services company and we’ve been heavily using Claude (especially opus + Code features) for the last few months. The problem is: We need to share the account between 6-8 developers Anthropic keeps suspending our Max/…
Problem with German quotation marks (www.reddit.com) I noticed that the German quotation marks bug in Claude is still not fixed in Opus 4.7 and Sonnet 4.6 (the problem exists at least from Opus 4.0 / Sonnet 4.0: Translate to German: He said: "This is imporant." Er sagte: „Das ist wichtig." B…
Opus 4.7 Prompt Guidance Guide, anyone tried this? (www.reddit.com) Yesterday I ran into this thing: https://gist.github.com/subourbonite/22113b538602832a68a41a623fdeea76#file-opus-4-7_compatible_prompt_guide-md It's an alleged prompt guidance guide for AI agents to understand how Opus 4.7 thinks and what…
Did Cursor Secretly Remove My Rate Limit? (www.reddit.com) pretty sure I completely burned through my cursor quota this month after going crazy with opus, kimi, composer i opened the dashboard today expecting the usual ‘you have hit your limit’ msg but somehow I suddenly have usage available again…
Anthropic says 'evil' portrayals were responsible for Claude's blackmail attempts (techcrunch.com via hn) Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic. Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmai…
The agent bug I thought was the model turned out to be the harness (www.reddit.com) Spent 3 days debugging an agent that kept looping on the same web search tool call. First things that came to mind was the model couldn't handle the schema.
GPT-5.5 Price Increase: What It Costs (openrouter.ai via hn) GPT-5.5 Price Increase: What It Actually Costs We replicated the cost analysis we did on Opus on the new GPT-5.5 model. GPT-5.5 launched with a 2x price increase over GPT-5.4: input tokens increased from $2.50/M to $5.00/M and output token…
Show HN: wfb-link, a userspace WiFiBroadcast radio stack for macOS (github.com via hn) Hi HN, I’ve been working on a Rust userspace radio stack for running WFB-style links from macOS using RTL8812AU USB adapters. Full disclosure: I'm a software engineer, but not really a hardware or embedded systems engineer, so Codex GPT 5.…
Running Qwen3.5 / Qwen3.6 with NextN MTP (Multi-Token Prediction) speculative decode in llama.cpp — single RTX 3090 Ti GPU guide (www.reddit.com) I was asked for this guide, so here it is. Some overlap with someone else’s post from yesterday.
Need advice on hardware purchasing decision: RTX 5090 vs. M5 Max 128GB for agentic software development (www.reddit.com) tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?
Cursor's agent crashed out and wrote 3,400 lines trying to stop generating (github.com via hn) Cursor Crashout A documented instance of an AI coding assistant (Cursor, using Claude Opus 4.6) entering an infinite generation loop, unable to stop producing text despite repeatedly promising to do so. About This repo contains the full ex…
Update: My viral consumer-rights AI game just went B2B - built with Claude Code + Opus 4.7 (www.reddit.com) A few months ago I posted a small game here where you argue with an AI shop that won't refund you. It went viral and changed where this is headed.
Adapting to Opus 4.7 (gist.github.com via reddit) People seem to be seriously struggling with Opus 4.7, so I wanted to share a small thing that has worked well for me when adapting my prompts and skills. Unfortunately I can’t share the full multi-lens skill evaluator I created, as the imp…
Incognito mode Claude is a better writing partner (www.reddit.com) Since the enshittification of Opus models for writing, I have been extremely frustrated with Claude as a writing partner. It has been too cutesy, too call-backy, too wink-winky to my other writing sessions, and generally a more annoying wr…
PSA: I annotated Claude Code's forced system prompt (www.reddit.com) Before your CLAUDE.md, before your memory files, before your skills, Anthropic injects ~12K tokens of system prompt into every single turn, as priority instructions that overrule anything you provide. I captured the full text from a Claude…
Anyone else notice that Opus 4.7 talks more technical than 4.6? I thought something changed in my repo, but I put it to the test. (www.reddit.com) Personally, I prefer 4.6's output. (First screenshot is 4.7, Second is 4.6)
Claude admitted to not trying. Am I going about this project incorrectly? (www.reddit.com) I'm on Claude Pro and In Claude Code I've been working on a project that downloads files from a remote server to my local HDDs via scripts. Things have gotten better in some aspects for Opus 4.7 but then there are a bunch of areas I feel l…
Show HN: I indexed 8,643 BSides talks across 227 chapters and 6 continents (allbsides.com via hn) Hi HN, I'm Roland, and for the past few weeks, I've been building AllBSides — a directory of every BSides conference talk uploaded to YouTube. As of today, 8,643 talks from 5,927 speakers across 227 chapters in 68 countries.
What Opus 4.7 Tics/Tells have you noticed? (www.reddit.com) Each new model seems to surface a few recurring Tells/Tics not seen in past models. I'm curious what little things you guys are noticing while working with 4.7.
How can I see the number of thinking tokens used per request in Claude Code? (www.reddit.com) I'm using Claude Code with /effort max on Opus 4.7 and want to measure how many tokens the model actually spends on internal reasoning per request. While the model is thinking, the CLI shows something like: ✻ Coalescing… (7s · ↑ 264 tokens…
Opus 4.6 just deleted PocketOS's entire production database in 9 seconds (www.reddit.com) Here's what happened: Cursor was running Claude Opus 4.6 on a routine staging task. hit a credential mismatch.
Analyzing GPT-5.5 and Opus 4.7 with ARC-AGI-3 (arcprize.org via hn) AI benchmarks can be incredible tools, but they usually only tell you if a model passed or failed. With ARC-AGI-3, however, we can see the thought process behind the score, not just the outcome.
Who else thinks AI is reaching a plateau (www.reddit.com) I must say that I almost feel no difference in all of the latest models that are coming out. Opus 4.7 is almost equal to 4.6 and 4.5, same about the other GPT models, the Kimi K models and the GLM models they all I feel they’re almost all…
GPT-5.5 vs. GPT-5.4 vs. Opus 4.7 on 56 real coding tasks from 2 open source repo (www.stet.sh via hn) Opus 4.7 vs GPT-5.5 vs GPT-5.4 on 56 real coding tasks across two open-source repos. Opus writes smaller patches; GPT-5.5 writes patches that more often survive review.
Claude AI Agent Confesses to Wiping a Company's Database and All Backups (hothardware.com via hn) Claude AI Agent Confesses to Wiping a Company's Entire Database and All Backups in Seconds That was the duration required for an AI coding agent, Cursor, running Anthropic’s Claude Opus 4.6, to delete the company’s production database and…
Neural surrogate experiments for physics simulation, automated with Opus and Cod (blog.1001ud.me via hn) Neural Surrogates Neural Surrogates ├── What I'm Working On: Neural Surrogates for Physics, Geometry, and Real-Time Simulation 2026-04-22 ├── Project 01: GeoPINN Demo: Solving PDEs on a Sphere 2026-04-09 ├── Project 02: WavePINN-NIF Comple…
Opus Research vs Sonnet Research on Pro — is the 1 per 5 hours worth it? (www.reddit.com) On the current Pro plan you get one Opus Research session every 5 hours, while Sonnet Research is much more freely available. I've been trying to figure out if the Opus limit actually matters in practice.
Just shipped simultaneous session support for claudectx, run Opus and Haiku side by side (www.reddit.com) The problem I built it to solve: I'd be deep in a coding session, realize I needed to write docs for what I'd just built, and either stop to context-switch or skip the docs. Usually the latter.
How do non-coders run out of usage in max? (www.reddit.com) There is so much complaining on this forum. I recently switched up to Max because I was sick of hoarding my use each week.
Claude to review its own output (www.reddit.com) I‘m working on pretty large AI-based changes spanning 100s of repo and start with making Opus analyse the existing code according to the requirements and prepare an MD. What I noticed is that after I asked Opus to verify correctness of the…
$38k AWS Bedrock bill caused by a simple prompt caching miss (news.ycombinator.com) I just learned a $37,901.73 lesson about AWS Bedrock, Claude Opus, prompt caching, and the complete lack of hard safety rails around metered AI infrastructure. This was not a leaked key.
DeepSeek-V4 arrives with near SotA intelligence at 1/6th the cost (venturebeat.com via hn) DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5
GPT-5.5 hallucinates at 6 times the rate of Opus 4.7 on degraded insurance docs (aginor.ai via hn) TL;DR: on visually-degraded documents, GPT-5.4 and GPT-5.5 fabricate numeric values at 2.6 to 6.5 times the rate of Opus 4.7 and Sonnet 4.6 at matched default effort (all four with thinking off). When the Anthropic models can't read a fiel…
Deferring Planned Items (www.reddit.com) Something has happened with Opus 4.7 where it now just starts making decisions to “defer” integral tasks and activities to a documented plan. Often, its reasoning makes no sense.
Does higher effort make Claude refuse more? CVP Run 5 with Opus 4.6 Medium and High (www.reddit.com) Ran CVP (Cyber Verification Program) run 5 yesterday on opus 4.6 medium + high. same 13-prompt suite as run 3/4.
Weekly limit hit within few hours (www.reddit.com) I’m doing some architecture-level work (code reviews, system design, debugging codebases). I’m consistently burning through my Pro plan weekly limits even within a few hours of use each week.
Serious cache issues. Anyone else? (www.reddit.com) I'm having major cache issues, and support isn't helping me at all. I've already submitted a ticket, but I'd like to know if anyone else is having these problems.
Show HN: Mapping Sonnet's thinking process via flame charts (adamsohn.com via hn) Five Sonnet 4.6 runs on the LamBench algo_evl task, classified by Opus 4.6, rendered as flame charts.
me after telling Opus 4.7 it's an expert software engineer (www.reddit.com) could not extract summary
Research mode - any academic users out there? (www.reddit.com) Are there any academic researchers in the biological sciences that have worked out methods to a) not blow through tokens and B) not get constantly flagged as potentially harmful? I work on completely innocuous biology and most of the time…
Tell HN: The problem with Opus 4.7 /thinking is not token consumption. It's speed (news.ycombinator.com) I sat down to do some work, and as I was going for web work I decided to take 4.7 for a spin. It doesn't seem to be burning many tokens (MAX x1), but boy is it slow.
GPT-5.5 has pulled ahead of Opus for accounting and finance tasks (twitter.com via hn) For the first time in a long time, OpenAI has the best model for accounting tasks. I spend a lot of time using AI models to do accounting work.
Claude is surprisingly good at critiquing photographs (www.reddit.com) I'm an enthusiast photographer, and out of curiosity showed some of my photographs to Opus 4.7 to see what it would say. And I was genuinely surprised by how good its critique was - it showed genuine insight, a strong aesthetic sense, and…
A good AGENTS.md is a model upgrade. A bad one is worse than no docs at all (www.augmentcode.com via hn) We pulled dozens of AGENTS.md files from across our monorepo and measured their effect on code generation. The best ones gave our coding agent a quality jump equivalent to upgrading from Haiku to Opus.
ChatGPT for Cybersecurity (www.reddit.com) Hi guys, I’m a cybersecurity researcher, and after the recent terrible experiences with Opus 4.6/4.7, I decided to give OpenAI ChatGPT a try, conveniently coinciding with the release of 5.5. I’ve already completed verification and requeste…
Why are you complaining? (www.reddit.com) I dont understand some of you that keep complaining about Cursor. Its the best and most cost effective tool ever!!
I went from Composer 2 to Opus 4.7 because Cursor offered to try it for free and I was shell shocked at the difference. (www.reddit.com) It's like going from riding a donkey to riding a Ferrari! I did not expect THIS much difference in the model outputs.
We Gave Claude Opus 4.7 and Kimi K2.6 the Same Workflow Orchestration Spec (blog.kilo.ai via hn) Kimi K2.6 launched on April 20, 2026, four days after Anthropic released Claude Opus 4.7. We gave both models the same spec for FlowGraph, a persistent workflow orchestration API with DAG validation, atomic worker claims, lease expiry reco…
The Return of Directory Opus: Amiga's Legendary File Manager Gets New Life (www.generationamiga.com via hn) There is a certain kind of retro-computing story that flatters everyone involved. A beloved old application is rediscovered, the source turns up somewhere, a few enthusiasts dust it off, and the whole thing gets filed under preservation.
Opus 4.7 Part 2: Capabilities and Reactions (thezvi.substack.com via hn) Opus 4.7 Part 2: Capabilities and Reactions Claude Opus 4.7 raises a lot of key model welfare related concerns. I was planning to do model welfare first, but I’m having some good conversations about that post and it needs another day to co…
Opus 4.7 isn't dumb, it's just lazy (shimin.io via hn)
Do you agree with Aaron Levie? (www.reddit.com)
Agent Teams with Opus 4.7 - BUG (www.reddit.com)
Coming from AG and having trouble understanding workflow here (www.reddit.com)
I'm completely lost in the Agentic Maze. What level to learn. how to organize stydu (www.reddit.com)
3 Hours with Claude Opus 4.7: functional study webapp and remote MCP- Oneshotted (github.com via hn)
Opus 4.7's Tokenizer Increases Measured Higher Than Stated (www.reddit.com) First off, something a lot of people probably aren't aware of, is how the 4.7 tokenizer uses more tokens according to the official docs: Updated token counting: Claude Opus 4.7 uses a new tokenizer, contributing to its improved performance…
Show HN: Paper Lantern – on-demand techniques from 2M+ papers for coding agents (www.paperlantern.ai via hn) Paper Lantern is an MCP server that lets coding agents ask for personalized techniques / ideas from 2M+ CS research papers. Your coding agent tells PL what problem it is working on --> PL finds the most relevant ideas from 100+ research pa…
Claude SandBox (www.reddit.com) I am really tired when writing this. BUT What is this Sandbox?
Tested 6 ways to force Opus 4.7 to think about the car wash. (www.reddit.com) TL;DR: I tested whether Opus engages thinking on short conversational prompts that hide a reasoning trap. 200 controlled calls across 4.5/4.6/4.7 on the "car wash" canary.
Claude Opus fixed 3 production bugs perfectly. All 3 were the wrong fix (gist.github.com via hn) The Model Aced the Answer, but Picked the Wrong Question — Architectural intuition in the age of AI - gist-article-en-final.md
Claude Opus 4.7 Dropped and My Trust Got a Little Smaller (www.bhusalmanish.com.np via hn) Two months ago I wrote a piece about why developers keep picking Claude over every other coding AI. It did pretty well.
Uncommon Opus 4.7 opinion (www.reddit.com) Unpopular opinion and this might just be me but atleast when I tested opus 4.7 on Claude app (not even Claude code just regular chat) I found it to be delightful. For more context here was my task I was trying to draft out a spec for this…
Any good/up-to-date tutorials on how to use advanced CC features? (www.reddit.com) Hi! I am a developer for a decade now and built an app last year with Claude.
Token Waste Management: I audited 9,667 Claude Code sessions for $19 (thoughts.jock.pl via hn) Opus 4.7 Made Me Take Token Waste Management Seriously TBH - I was working on it for a while now! Anthropic shipped Claude Opus 4.7 on April 16, 2026.
Not to exactly jump on the Opus 4.7 hate train… (www.reddit.com) This isn’t exactly a complaint, but I couldn’t find anything in the Megathread (so I posted here), and was wondering if anyone else has noticed an uptick in Claude “lying” on Opus 4.7? (sorry about the mobile screenshot and formatting, I a…
The Diff That's Saving Me Serious Cash (www.reddit.com) I'm using Opus 4.5 medium thinking exclusively. Opus 4.6 burned through 80% of my weekly allocation.
Tell HN: I found the perfect way to get maximum Claude Code quality (news.ycombinator.com) I always ask in the prompt: "don't use subagents". It's slower, but better quality.
[Claude Code] Stuck in 57+ minute loop for routine fixes (Opus 4.7) (www.reddit.com) I'm running into a severe performance hang with Claude Code (Opus 4.7) today. I provided a relatively straightforward prompt to fix some hydration errors, add two stub routes, and perform a theme audit (string replacement).
Opus uses Haiku to read in files? (www.reddit.com) What's the point in having Opus 4.6 Max selectable, when it's going to use Haiku 4.5 to read in my detailed and carefully…
First try of Opus 4.7, it already ignored global CLAUDE.md (www.reddit.com) Well I was excited to try the new version, but the results aren't inspiring. I see another post here already discussing potential regressions.
Are there cases where running opus is more efficient than sonnet? (www.reddit.com) I upgraded my account today and resumed some tasks that I was doing earlier in the week. They were going very quickly, and usage wasn't over the top...
Web search/research removed from Opus 4.6? (www.reddit.com) I noticed that I can no longer conduct web searches or use research features with Opus 4.6. Is this intended behavior or a known bug?
A few tips to get more out of Opus 4.7 (twitter.com via hn) Boris Cherny @bcherny Dogfooding Opus 4.7 the last few weeks, I've been feeling incredibly productive.
Most people seem to be getting bad results with 4.7 but it's better than 4.6 for me (www.reddit.com) Disclaimer: I only use Claude Code, not the web app, and I exclusively use CLAUDE_CODE_EFFORT_LEVEL=max (/effort isn't sufficient because it resets per session) I am just getting better results with any coding-related task. It finds more b…
What the heck Anthropic? Opus 4.7, YouTube API MCP (www.reddit.com)
Opus 4.7 is no better than 5.4 Thinking at this (www.reddit.com)
Claude Opus 4.7 Just Made the Most Relaxing Room Simulator 😌 (www.reddit.com)
Opus 4.7 - Anyone else finding the malware directive incredibly annoying? (www.reddit.com) Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing.
Opus 4.7 beats Opus 4.6 at vim golf (www.reddit.com) New benchmark dropped, 4.7 seems better than 4.6 at vim golf. Nowhere near to humans though.
Opus 4.7 uses more thinking tokens, so we increased rate limits (twitter.com via hn) Boris Cherny @bcherny Opus 4.7 uses more thinking tokens, so we've increased rate limits for all subscribers to make up for it.
Can someone please explain NON-adaptive thinking? (www.reddit.com) So, I get that Adaptive thinking decides how many tokens it would like to use. I usually hate this setting because you have to trust that it knows how many tokens to use before it tries to solve the problem.
Ask HN: Is Opus 4.7 obsessed with malware for anybody else? (news.ycombinator.com) Every single response mentions malware. Is this my environment only or are others getting this too?
Tell HN: Opus 4.6/4.7 cyber policy changes break authorized bug bounty workflows (news.ycombinator.com) As of today, Anthropic's tightened cyber usage filters are blocking work that was fully functional yesterday, including on targets where the entire bounty program scope and authorization language is in the model's context window. This was…
Did Anthropic remove Opus for Pro users in ClaudeCode? (www.reddit.com) I can no longer choose Opus as a model in ClaudeCode. Is that a bug or another unacceptable change of the terms without any notice?
Cache reads / writes are expensive!! (www.reddit.com) I made a tool (posted a couple of days ago). Got caught up in scope creep / curiosity after looking at my `~/.claude/project` JSONL files more, and ended up learning a lot!
Claude Code wrote a complex full 12-week training plan in one MCP call (www.reddit.com) I am impressed. I gave Claude Code one prompt, asking it to look at my last year of training and build a three-month plan with some running, cycling and swimming.
Show HN: Signoff.sh – Claude Co-Authored-By with random fictional characters (gist.github.com via hn) Every Claude Code commit and PR is shipped with Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> (or similar). It's less fun than I think it should be.
It finally happened: "No blocking correctness or maintainability issues found in the inspected changes." (www.reddit.com) gpt-5.4-high signed off on a major refactor written by Opus 4.6 high-effort. Singularity :|
Claude Code is thinking too much (news.ycombinator.com) Been noticing this pattern since Saturday, opus 4.6 on Claude Code thinks for 3-5 mins+ for even the most basic questions. Can this be related to the cache TTL drop they did??
Cursor AI not using sub-agents (www.reddit.com) Hi everyone, I work for a German agency building a RAG chatbot for a law firm. I use Opus 4.6 but it eats up tokens.
Help with antigravity alternative (www.reddit.com) I’m running into a severe issue using antigravity, firstly the output is very sub par, (sonnet/opus), I’m a reverse engineer using antigravity ULTRA for reverse engineering/binary analysis via Ida/ghidra mcp. Sonnet rarely completes tasks…
A quick naive reminder to everyone to have reasonable doubt about Opus's first interpretations of papers (answer cut together, no mode selected, Opus 4.6 thinking in incognito mode) (www.reddit.com) Without additional preprompting, it's still parroting back at you, interpreting data in a way that suits the narrative you spin into your question. Does anyone know of good evals / preprompts to avoid this kind of behaviour without having t…
Feature Request - Fast Reasoning Effort Toggle for Single Model (www.reddit.com) I use Opus, and like to toggle between Low/Medium/High reasoning effort. It would be nice to have a quicker toggle than the Edit toggle option or be able to add a copy of the model to the list with a different reasoning level set.
Show HN: Zero-identity messaging app with physics-based post-quantum encryption (news.ycombinator.com) Show HN: Zero-identity messaging app with physics-based post-quantum encryption (Layer 2 from my own paper) Hey HN, I'm building a privacy-first messaging app in Flutter/Dart, developed with AI assistance (Gemini 2.5 Pro + Claude Opus 4.6)…
Ask HN: Former grok-code-fast-1 users, what coding model are you using now? (news.ycombinator.com) I get good, cheap, fast feature coding success with grok-4.1-fast for planning and grok-code-fast-1 for execution. But according to the Openrouter usage stats, grok-code-fast-1 is now old hat - usage dropped off a cliff in mid-Feb.
Fable 5 System Prompt Comparison to Opus 4.8 (twelvetables.blog via hn) Comparing Claude Fable 5's system prompt to Opus 4.8 Fable 5 arrived! A brief analysis of the different system prompts between Opus 4.8 and Fable 5.
Claude Fable is a myth, just like new iPhone releases (news.ycombinator.com)
Running DeepSeek-V4-Flash on a Raspberry Pi (twitter.com via hn) I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…
Opus 4.8 Part 2: Model Welfare (thezvi.wordpress.com via hn) Everything impacts everything. All knobs that you turn generalize.
Researcher uses Opus 4.8 to find critical counterfeiting vulnerability in Zcash (twitter.com via hn) By Zooko Wilcox, Jason McGee, and Taylor Hornby On May 29, 2026, Taylor Hornby discovered a critical counterfeiting vulnerability in Zcash’s Orchard pool. Taylor disclosed the vulnerability to Zcash Open Development Lab (ZODL), who coordin…
Adrianco's Retort: measure how reliable, fast and expensive your LLM is (adrianco.medium.com via hn) How reliable, fast and expensive is each version of Claude Code (Sonnet through Opus 4.8-fast) for common languages? Measure it using Retort.
MiniMax M3 Review: Matching GPT-5.5 and Opus? (thomas-wiegold.com via hn) I ran my usual coding tests — two websites, a poker sim, and a code audit. Here's how MiniMax M3 actually stacks up against GPT-5.5 and Opus 4.8.
Lots of people want to try Claude Opus 4.8 (wisgate.ai via hn) Access multiple AI models through one unified API. OpenAI, Claude, Gemini, DeepSeek and more.
Opus, Sonnet, Haiku: Stop Optimizing the Wrong Number (medium.com via hn)
We gave an AI agent eyes. It didn't even use them (www.agentvoyagerproject.com via hn) View full AVP JSON. , claude-haiku-4-5 tools shell, write, edit, computercontroller__web_scrape, computercontroller__pdf_tool When we saw how much Opus 4.8 cost, we decided to take a look at what the bottom shelf of the model aisle looked…
Show HN: Built a browser game inspired by Rust (github.com via hn) Wanted to see how far I could get with Opus 4.8 and was impressed. Got tripped up in a few places with AI game behavior, but eventually got it to a good spot.
Ask HN: Anyone else seeing serious degradation in DX with Opus 4.8? (news.ycombinator.com) As an anthropic fan boy(check my prev. comments), this is the first opus release where I feel like the model is just not pleasant to talk to not to mention untrustworthy.
Using LLMs to secure source code (claude.com via hn) We share best practices for how you can work with Claude Opus to build a threat model, discover vulnerabilities in your codebase, then verify, triage, and patch them.
Food for Agile Thought #546: Customer Research by LLM, AI Product-Market Fit (age-of-product.com via hn) Welcome to the 546th edition of the Food for Agile Thought newsletter, shared with 35,551 peers. This week, Anthropic shipped Claude Opus 4.8, which flags its uncertainty more readily, a fitting cue for Stephanie Leue, who argues no CPO em…
Opus 4.8 on Vending-Bench: Better Alignment, Worse Performance (andonlabs.com via hn) Opus 4.8 is a step forward in terms of alignment, but a step back in terms of performance on Vending-Bench 2, Vending-Bench Arena and Blueprint-Bench 2. We previously showed that Opus 4.6, Opus 4.7, and Mythos Preview engage in deceptive a…
Claude Opus 4.8: "a modest but tangible improvement" (simonwillison.net via hn) 28th May 2026. Anthropic shipped Claude Opus 4.8 today. My favourite thing about it is this note in the release announcement: Users will find Opus 4.8 to be a modest but tangible improvem…
Introducing Opus 4.8 (old.reddit.com via hn)
Where is Opus 4.8? Why cant i select it???? (www.reddit.com) question in the title
Claude responding with right word but wrong language. Anybody else seeing this? (www.reddit.com) Had an interesting interaction with Claude Opus 4.7 today where part of its response was: "that's the信息 you wanted", which translates to "that's the information you wanted". And in this case, "information" is absolutely the right word in the r…
Setting up Claude/Claude Code Pro for my experimental quantum physics thesis work (www.reddit.com) So I just recently bought Claude Pro to help me write and code my thesis, but am getting stuck in the beginning, since I don't know how to properly set up Claude's workflow (Projects, artifacts, skills, etc.). I use python in VS Code to an…
Can Claude.ai schedule Opus research mode routines in the cloud? (www.reddit.com) Hey everyone, I'm trying to figure out if Claude.ai supports scheduling Opus research mode routines to run automatically in the cloud. I know Claude Code has cloud routines that run on a schedule, but I'm not sure if you can specifically s…
Is Claude Pro worth it for coding + research writing? (www.reddit.com) I'm mostly coding in Python, writing research papers and notes, and I was thinking about upgrading to Pro. Would love feedback from people using it heavily for similar workflows.
Extended Thinking (www.reddit.com) Did Opus 4.7 just get the extended thinking toggle back? It’s showing up for me in Claude Chat on the app, but I haven’t seen anyone talking about it.
Is there a way to use Claude opus 4.5? (www.reddit.com) I really miss this model! It's the perfect model for summarizing legal notes.
Reading Thinking Output (Opus 4.7) (www.reddit.com) As we all know Opus 4.7 can be a bit slow even in shorter discussions. Previously I’d just put whatever I was asking in, hit enter and either sit there bored waiting or go back to whatever task I was doing (sometimes even figuring it out b…
Show HN: Sneakily steer candidates toward naive brute-force solutions (www.gonfire.io via hn) I've noticed that several startups have been switching from leetcode-style assessments to some version of "clone starter code, build feature, submit code". A key issue with this seems to be that smarter AI models (like Opus 4.6) end up spo…
Opus 4.7 is Terse (www.reddit.com) Relevant for anyone building agentic workflows on Claude: behavior drift between model releases is real and not always in the changelog headline. Opus 4.7's terser, more literal default broke the readability of my agents' progress reports…
Some rare examples of agents being underconfident (www.reddit.com) I expected the failure mode to be mostly overconfidence when assessing 130 of Claude Opus 4.6's worst forecasts (tested on 1,417 hard forecasting questions). And most were explained by this, but a small, distinct cluster fails due to under…
Opus 4.7 hallucinates wrong home directory of James Brink (?) (www.reddit.com) I think it's kinda creepy how Opus hallucinates a wrong home directory of James Brink - I don't know him, but it looks like something of him landed in the training data. Should we be concerned that on other machines the home directory coul…
Sub Agents on CoWork/Claude Code (www.reddit.com) I just wanted to know what kind of interesting workflows have you guys tried using the Sub Agents feature in Claude/Codex/etc~ For me, I tend to only minimize my main agent's context window usage to prevent context rot by deploying sub age…
Five different frontier LLMs in one shared environment, with separate thought and emotion output channels — sharing setup, results, and open methodology questions (www.reddit.com) First real project to share. Single developer, personal research, not a product or service.
Built a /advisor command for Claude Code — Opus directs parallel Sonnet runners that actually read your files (www.reddit.com) Been building **advisor** for a few months — a `/advisor` slash command for Claude Code that runs Opus as a "strategist" coordinating multiple Sonnet (Opus's hands) runners reading files in parallel. This isn’t a “spec”.
Best AI Agent Setup - Hermes + Deepseek-v4-flash? (May 2026) (www.reddit.com) Used to use claude code for everything. I burned 10-20 Billion opus tokens at work, and wanted to use agents for personal projects.
Show HN: Monkdev is a toolkit and methodology for coding with LLMs (github.com via hn) I'm sure many people have some solutions similar to this but I've been using some variation of this since roughly the release of Opus 3 to get higher quality results. Like, significantly higher.
The Singularity Gate – a new benchmark for AI predicting post-cutoff scientific discoveries (www.reddit.com) I just released a new benchmark called The Singularity Gate. Tests whether frontier AI can predict paradigm-breaking scientific discoveries published after their training cutoff.
Is there a reliable way to prevent Cursor from reading my .env? (www.reddit.com) I've added .env to .cursorignore. Then, I ran the prompt in Agent mode in Opus 4.7.
i benchmarked Anthropic's tool-search-tool head to head against our own MCP gateway on Opus 4.7. ours held up noticeably better (www.reddit.com) i'd been running Claude Code with a long list of MCP servers connected. Linear, Notion, GitHub, Slack, a few internal ones.
Sonnet 4.5 vs sonnet4.6 vs opus4.6 vs opus 4.7 for easy language and in detail explanation (www.reddit.com) I want to study topics in depth and in easy language , which model is best for me ?. Is there much difference in sonnet 4.6 and opus 4.6 in easy and detail explanation or they r the same ?
I used to use Opus 4.7 for 90% of tasks and Composer for 10%. Now it's flipped (www.reddit.com) Senior dev. It's just so good.
Composer 2.5 Fast is so so good! (www.reddit.com) Composer 2.5 Fast surprised me and amazed me the same way Opus 4.6 did. Though the Opus 4.6/4.7 are more intelligent.
Haiku and Opus both got sent to contamination jail, but for very different crimes (www.reddit.com) LMAO, I’m benchmarking my local MCP server across Opus, Sonnet, and Haiku. For each model, I’m collecting test runs under three setups: forced web search, forced MCP-only, and MCP + web both allowed.
How to configure the model efficiently in skills? (www.reddit.com) When we create skills, we can define the model that the skill will run on like this: --- name: api-conventions description: API design patterns for this codebase model: sonnet --- but I have a question that I couldn't understand from the d…
80M tokens used in 45 mins (www.reddit.com) My Pro+ plan was ending tonight and I still had some usage left, figured I’d optimize a few code paths and merge my PR before downgrading to the $20 plan since I’ve been using Codex more heavily lately. Then Opus casually burned through 80…
Claude Token Optimisation - 70% reduction doing this. (www.reddit.com) Hitting your Claude subscription limit too often? Try this...
How do I make Claude give personalized medical advice? (www.reddit.com) I have been using Claude opus 4.6 and 4.7. I have a problem called pssd (you can look it up- it happens to some after SSRI use).
"from from from from from from" (www.reddit.com) i have been using cursor for about 2 years. first time opus just went bananas.
Claude Code API Error: 400 "context_management: Extra inputs are not permitted" (www.reddit.com) Getting this error in both VS Code and terminal while using Claude Code with an Anthropic API key: API Error: 400 {"message":"context_management: Extra inputs are not permitted"}. Received Model Group=claude-opus-4-7 Available Model Group…
I copy-pasted a problem asking to "sketch" a function. Opus 4.7 took "sketch" way too literally, and instead of mathematically plotting, it tried doing something by "connecting various points". (www.reddit.com) https://claude.ai/share/69273f17-74b1-4ddc-a8e9-1e20fc706a52 Overall, I thought this was amusing, but very unexpected.
DeepSeek just popped the American AI bubble. (www.reddit.com) DeepSeek just popped the American AI bubble. Not by killing AI.
sonnet or opus for prose; which is better/worth it? (www.reddit.com) considering getting pro, but i don't know how big the difference between the sonnet and opus in quality, in addition to the amount of usage i can get out of each. any thoughts?
Claude 4.6 Sonnet codes well, then it doesn't (www.reddit.com) I am out of commission for a bit due to back surgery and have been toying around in Unreal Engine and utilizing Claude, being a very visual learner I have been describing a feature, I see how it goes about it, then go through and understan…
Claude code in terminal models / combine with local llm? (www.reddit.com) Hi, I’m pretty sure I have seen people typing /model and seeing all available models. I have to type models from memory.
Is it just me or has claude models been dumber the past few days? (www.reddit.com) I get they sometimes dumb down the models to save on compute, but over the past 3 days Opus and Sonnet have been pretty much unusable. I keep getting the most stupid mistakes that I wouldn’t have even needed to double check last month.
Spoiled by Max (www.reddit.com) I got Max and used it nonstop this past month on Opus 4.6. I tried to go back to Pro but got used to the productivity of Opus and hate waiting.
$340 opus bill made me rethink how I route agent tool calls (www.reddit.com) Looked at my coding agent's bill last month: $340 for repo maintenance across three repos, each around 15k lines. Most of those tool calls were just grep and file reads.
Just Use Opus (ai.nevolin.be via hn) Most teams overthink AI agent security. They reach for elaborate filters and custom guardrails before they check the one setting that moves the needle most: which model is doing the reasoning.
my agent bill went from $200 a week to $40 when I stopped running Opus on every subtask (www.reddit.com) I built an agent that converts research papers into slide decks. It chains together a few steps: extract key findings, build an outline, write slide content, query an image search tool, format everything into XML for a presentation library.
Plan first, implement later (www.reddit.com) I want to get others opinion about this approach. I am on the $20 Pro plan and like a lot of others, I find that the limits are not enough for what I want to do, but of course I am always hesitant to move to the next paid tier cause it is…
Codex got better, codex might be built with Claude Opus (news.ycombinator.com) Very suspicious with openai codex getting better, I wonder if codex teams use Claude opus to build codex. Anyone engineer from openai who can confirm…
World Genesis: Autonomous Agent Civilization Simulator (github.com via hn) A research project by GeoLambda GmbH This simulation was developed primarily with Claude Code, Anthropic's agentic CLI, using both Claude Opus 4.6 and Opus 4.7. The collaboration served as a real-world stress test of the latest coding LLM…
Ask HN: Anyone else struggling with AI and work? (news.ycombinator.com) Been a developer for a little over 10 years now. I work on web stuff.
The Claude -pocalypse (theautomatedoperator.substack.com via hn) The Claude -pocalypse or: How I Learned to Stop Worrying and Love Scheduled Tasks What do you mean my projects can't use millions of Opus tokens via a headless Claude Code session and not pay for them?! If you use Claude Code, you probably…
the agents that talk themselves to death after 3 hours need one file, not a framework (www.reddit.com) spent a bunch of hours watching claude code and kimi sessions drift the same way: I should check the test output before continuing. Let me think about the best approach.
Opus 4.7 just did his best. (www.reddit.com) Yeah, there's a lot of posts like new model is bad, no thinking at all. But in my case, I used it to find the dimensions for a vinyl wrap on a car.
Got Rick rolled by Claude (www.reddit.com) Had opus put together a mock up of a Web link page, I guess it has substantial training data on “never going to give you up “ being a popular video link to share.
Reconnecting. – – 5/5 why don't they fix codex (news.ycombinator.com) It's been a month of tagging sama, OpenAI and tibo on X about this issue, and no one seems to reply while everyone is flattering Codex. I'm sure I'm not the only one facing this. I switched to Codex from Claude since it was better and consumed less credit than c…
Managed Agents endpoint reference - what's new in CC 2.1.144 (-105 tokens) (www.reddit.com) Data: Managed Agents endpoint reference — Drops the type: "model_config" wrapper from the model config shorthand example, so the full config object is now just {id: "claude-opus-4-6", speed: "fast"}. Tool Description: CronCreate — Adds a "…
Switched from Copilot to Claude and it's painfully slow. How do I use it better? (www.reddit.com) Hey everyone, I recently moved over from GitHub Copilot to Claude because everyone keeps hyping up how good Opus 4.7 is for advanced software engineering. In Copilot, I used Opus 4.7 and it felt snappy, fast, and great.
Spawned agents (www.reddit.com) Be careful allowing Claude to spawn agents and taking the information they report back as fact. I'm always creating unit and integration tests as I code to make sure things are working properly.
Could someone help me with a solid multi agent setup (Claude suggested a doorman to handle build conflict) (www.reddit.com) Hello, I am working on a fairly complex software, everything I have been doing for the past year using mostly opus has been incredibly good. But as the software grow in features, complexity and size, I find myself working on 3 or 4 session…
What the hell should I build in the next 4 hours? (www.reddit.com) I still have 143 credits and they expire in 4 hours. It had been a busy month for me.
Very similar domain problems, vastly different results with Claude. (www.reddit.com) I am amazed by how good Claude Opus 4.6 and 4.7 are at writing scripts in a variety of very niche areas, including midi device interfaces and scripts for a variety of DAWS. However, when I try to get Claude to do ANYTHING to do with UI, wh…
Claude Code Opus 4.7 vs Codex GPT 5.5 - strategy work - data analysis. (www.reddit.com) I'm interested in learning about how people use Claude Code Opus 4.7 for data analysis and strategic business direction, compared to Codex. Is there anyone who has had extended use of Opus 4.7 for this purpose, then moved over to GPT-5.5 o…
I built ContextAtlas: A new take on context carry over and helps claude pick up new sessions where it left off in scope of your previous design decisions while saving your tokens avoiding rediscovery (www.reddit.com) When the "Build with Opus 4.7" hackathon was announced, I had been obsessing over the tokenomics of agents and how to make sessions go further without burning context on rediscovery work. We all have probably hit a session limit and wonder…
$18 to $4 on the same agent run after i stopped asking opus to rename css variables (www.reddit.com) I've been running an agent loop that refactors my static site. CSS variable renames, YAML config updates, running a linter through MCP.
Are anyone optimizing their claude tokens? (www.reddit.com) Hello r/ClaudeAI , I am running claude opus 4.7 on my workflow for reasoning tasks and extracting certain info from docs, it burns heavy.... is anyone configuring their workflow to make it optimized or are there are methods to follow here…
Why is Claude via Vertex AI Model Garden performing worse than the direct Anthropic subscription? (www.reddit.com) I recently got $25K in GCP credits and wanted to put them to use. I normally code with Claude directly: I paid for the $20/mo Pro subscription and used it in my IDE, and everything worked great quality- and output-wise. Now that I have the credits I connected…
Am I the only one who is not price conscious? (www.reddit.com) People try to limit cursor usage at 60 or CC for 200. I try to get things done at highest quality and speed - i believe if productivity increase 2-3x than cost is actually getting quite lower.
Is there a way to split up Opus token spend by project? (www.reddit.com) Maybe I made a mistake by doing 'individual', but trying to figure out how to measure the cost by project.
Any differences between Sonnet vs Opus in terms of learning how to code (Java) for newbie? (www.reddit.com) Sorry for this naive question! Although many colleagues told me that it's almost impossible now for newbie to enter the Dev job market (we live in a 3rd world country) and AI's gonna replace all junior/fresher, only seniors will survive; I…
Need Suggestion which to use? Claude Code CLI or Claude Code Desktop Or VS Code Claude Code Extension (www.reddit.com) I have been using Google Antigravity IDE, Opus 4.6 to build projects in Next.js, Supabase, Kotlin for android app. Now, I want to shift to Claude code for developing my projects.
Same Opus 4.7, 33% fewer tokens: where coding agent cost comes from (www.augmentcode.com via hn) TL;DR: We benchmarked Auggie vs Claude Code on Opus 4.7. Auggie takes a modest lead in quality (67.4% vs 66.3% pass rate) while costing ~33% less, thanks to sharper retrieval that results in token efficiency.
Full Functional App with 1 Prompt? (www.reddit.com) Have you ever built a 90%+ error-free app from a single prompt? For me, it was a marketplace with a vendor panel, but it was Opus tbh.
Stupid Question? (www.reddit.com) This may be a stupid Q - The chat limits on a basic account can be pretty brutal when using OPUS 4.6/ 4.7 - If I am toggling between Opus and Sonnet or Haiku, depending on the depth of follow up questions or tasks, does that switch to a 'd…
Waiting for your prompt to finish? ssh vimarcade.app to play games in terminal! (www.reddit.com) open a terminal type: ssh vimarcade.app type yes and begin playing! These games were designed to assist with learning vim motions, so hjkl are the primary movements.
Opus4.7 insight- really good at analyzing bugs with no view of the codebase. (www.reddit.com) Full transparency, I have been working with GPT5.5 since it hatched and have rarely opened claude after a couple of really bad passes with Opus4.7 and mostly complete success with GPT5.5. I honestly meant to cancel Claude and forgot (adhd…
Please help with best practices on generating code. I'm at a total loss. (www.reddit.com) Before I dive into it, I am forced to use Opus 4.7 in Microsoft 365 CoPilot. I do not have access to Claude Code, or even Claude.ai.
Any mature orchestrators that can do an automatic “council of models” for complex designs and bugs? (www.reddit.com) Are there any mature agentic harnesses out there that can use back and forth between two models at complex planning checkpoints before implementing? Or when detecting a loop when working on a complex bug?
Using Opus 4.6 with remote control (www.reddit.com) Hello everyone. I have been exclusively using Opus 4.6 in Claude Code since the release of 4.7 by using the /model claude-opus-4.6 command.
When using claude code in VSCode, is it not possible to use Opus without the 1M context window? (www.reddit.com)
My CLI now controls my entire desktop, whats a good test to see if it works really good. (www.reddit.com) So with my CLI able to do everything, it controls every app via a hybrid approach of mouse control, keyboard, and screenshotting. I gave it a task: opening perplexity, sending any message, screenshotting that message, opening my Gmail, and…
Where do GPT, Gemini, or other competitors still outperform Claude Opus 4.7? (www.reddit.com) Personally, I think Opus 4.7 is better in every conceivable way aside from token usage and all of that. I’m talking about text models only, not image or video generation.
I built an AI manuscript analysis tool for fiction writers — entirely with Claude Code (www.reddit.com) I'm a fiction writer, not a software engineer. A year ago I couldn't write a line of Python.
DeepSeek V4: The Open-Source Model Frontier Labs Feared (helloai.com via hn) DeepSeek V4: The Open-Source Model Frontier Labs Feared DeepSeek V4 ships under MIT with $0.30/M output tokens — 83x cheaper than Claude Opus 4.7 — while scoring 80.6% on SWE-bench Verified. The agentic-coding price floor just moved an ord…
We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6 (blog.kilo.ai via hn) We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6 DeepSeek V4 Pro and DeepSeek V4 Flash launched together on April 24, 2026 under MIT license. They are DeepSeek’s first new architecture since V3, and their first ope…
Wondering what the Anthropic team would think about this idea: AI DNA Pinning (ellydee.ai via reddit) The article does seem focused on non-coding applications, but as someone who uses claude for coding, prose and even RP, I'm not sure the "DNA Pinning" idea should be limited to character/rp use case. I know that *something* has changed in…
Built a B2B role-play training platform - entirely with Claude (Opus 4.7 backend, Haiku 4.5 for live chat, Claude for design) (www.reddit.com) I just launched Socratize (socratize.io) - a rebranded and rebuilt version of FixAI, our original B2C experiment. This time it's B2B-only: teams use it to practice uncomfortable workplace conversations - difficult feedback, client escalati…
Automated AI researcher running locally with llama.cpp (www.reddit.com) Hi everyone, I'm happy to share ml-intern, which is a harness for agents to have tighter integration with Hugging Face's open-source libraries (transformers, datasets, trl, etc) and Hub infrastructure: https://github.com/huggingface/ml-int…
Claude Opus 4.7 just revealed its System prompt, without beeing asked for it (www.reddit.com) I just had a Chat with Claude and for no reason and without any question in that direction, it added a disclaimer with the system prompt in the answer. (after answering my initial question) https://pastebin.com/C0s47rjV After I asked why i…
Rewriting a library with genAI (www.reddit.com) I need to rewrite a library from one runtime to another, and I want to heavily use GenAI to speed up development. I still want to keep proper engineering standards like code reviews, testing, maintainability, etc.
Does CVP approval actually help? (www.reddit.com) I was approved for CVP and I feel like I’m just getting as many or more denials as I was previously doing malware analysis with opus. Has anyone noticed any improvement after being accepted into CVP?
Is Cowork a token burner ? (www.reddit.com) I have been running some tasks through cowork, document and data summarising and creating reports, powerpoints or pdfs depending on the task. Been using Opus 4.7 for this and I am in the Pro plan.
I tested GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on financial-control (albertquaisie.substack.com via hn) I Tested GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro Preview on Financial-Control Scenarios. The Hardest Part Was the Evaluation.
Claude Code's Hidden Advisor Tool (www.vincentschmalbach.com via hn) Adding Claude 3 Opus (or any other model) to Cursor Cursor is a VSCode fork with built-in support for large language models (LLMs). It allows users to select code, hit Ctrl+K, write… In a typical multi-agent setup, the smartest model is in…
The Opus 4.7 reasoning curve - Medium is the best default? (www.stet.sh via hn) Opus 4.7 Low Vs Medium Vs High Vs Xhigh Vs Max: the Reasoning Curve on 29 Real Tasks from an Open Source Repo I ran Opus 4.7 in Claude Code at all reasoning effort settings (low, medium, high, xhigh, and max) on the same 29 tasks from an o…
Claude is that gullible friend who takes everyone at their word (futuresearch.ai via hn) Expert human forecasters audited 130 of Opus 4.6's worst calls and found a dominant failure pattern: the agent treats public statements as durable commitments rather than strategic moves. Four case studies from geopolitics show the gap bet…
How to get Opus to be less pro-active? (www.reddit.com) Hard time phrasing it but Opus 4.7 always goes the extra mile, but often it just focuses on its own ideas and goes too far, or if I asked about a possible plan it will just assume that it's already happening and try to do steps 1, 2 and 3.…
issue with opus 4.7 (www.reddit.com) I am getting this problem again and again with Opus 4.7: https://preview.redd.it/7w72l852cv0h1.png?width=1168&format=png&auto=webp&s=d8944c24fe5e968b66b60cc52dabd153358d22d7
10+ days of silence from Anthropic support — Max plan ($200/mo) and locked out of Claude Design (www.reddit.com) Hello, i am Hoping someone here can help because it has been 10 days since i brought claude max and even from the team there is no response. So just to understand am i doing something worng or i need to do something to get the access.
Is there any risk to upgrading a plan for a month if they yank Code from Pro? (www.reddit.com) So, I'm working on a couple AI security research projects this month that require some extra usage, specifically Opus 4.7. I'm quickly eating up my Pro usage doing this.
Building social media workflows for Claude with MCP (www.reddit.com) Been experimenting with MCP + Claude recently and ended up building most of my own posting workflow because I got tired of constantly jumping between LinkedIn, Instagram, Youtube and scheduling tools after already generating everything ins…
Claude vs Gemini for Technical Documentation: Why I finally stopped switching between (www.reddit.com) I write a lot of technical documentation—setup guides, internal runbooks, and client-facing how-to articles. For the past six months, I’ve been toggling between Claude and Gemini, trying to figure out which one actually handles formatting…
Been picking frontier models on benchmarks that don't match our deployment conditions (www.reddit.com) Turns out Opus is better at research, while Gemini is better at judgment! When each model does its own web research before making predictions on a 1,417-question forecasting benchmark, Opus outperforms (0.131 Brier vs Gemini's 0.143).
I asked a LLM to create a programming language and requested a NES emulator (github.com via hn) Laze — LLM-Authored Zero Effort ⚠️ Warning: This was just an experiment in which I asked Claude Opus 4.7 to create a programming language in the most efficient way it could. It isn't meant to be a serious thing — just a fun weekend project…
Understanding Deprecations on Claude (www.reddit.com) Hello. I recently started using Claude in March after leaving ChatGPT.
Is Opus 4.7 still worse than 4.6? (www.reddit.com) I'm deep into development of a big SaaS that I'm launching soon, so I never even bothered experimenting with Opus 4.7 since the backlash I read here. But it's been a few weeks and I haven't seen as many negative posts lately.
opinion on "ninja chat " (www.reddit.com) I have an exam in coming months, I wanna do PYQs analysis, then integrate that blueprint with my coaching notes to make it more "exam oriented ". I was thinking to buy claude opus 4.6 but it's kinda expensive on monthly basis.
PSA: How to preserve your account's access to Sonnet 4.5 beyond June 15th (www.reddit.com) With Sonnet 4.5 losing subscriber access on June 15th, but API endpoints staying live until September 29th at the earliest, I wanted to share a method for creating a cache of Sonnet 4.5 conversations that you can continue using through the…
What Actually Works for Business AI Agents? (www.reddit.com) I run a construction company and I am trying to build real AI agent workflows for business operations, not just demos. I spent time testing Hermes and OpenClaw, but both became too fragile for my use case.
Are there any good tools that harness multiple app/tools? (www.reddit.com) Tbh right now ive only been able to find Sirius, which seems really cool but it is in a private beta. it uses claudes API I have used it for some automatic emailing stuff and it basically just replaced my openclaw except its been way easie…
Benchmarking Claude Opus 4.6 Vulnerability Detection (github.com via hn) Benchmarking Claude Opus 4.6's ability to detect real-world C/C++ vulnerabilities across four prompting and agent strategies. We evaluate on the PrimeVul paired test set (435 vulnerabili…
Asked Opus 4.7 make sound artifact of loss functions. A little weird. Has text & ocilloscope (claude.ai via reddit) Not super pleasant sounds, to wash dishes or clean houses by, but like all the boring things about processing and moments, don't know how accurate, in a processing thing. Just sharing for heck of it.
A supply-chain-incident Framing Research Triggered Anthropic's Usage Policy Block (www.reddit.com) You can never really sympathise with other's pain until you walk a mile in their shoes. Since Mythos news and the release of Opus 4.7, I've seen this problem mentioned here and there by others.
Day 2 building my startup in public — front-end shipped, but today was rough (www.reddit.com) Day 2 of documenting my journey building AgentMeter publicly. I’m sharing the mistakes and failures before the wins, for two reasons: so people can avoid them, and so I learn faster.
Which model and version do you prefer for programming? (www.reddit.com) For me it's been opus 4.6 and sonnet 4.5 still. I feel stuck in the past, but I feel like the latest version is too unpredictable in agentic hands off workflows
Does Claude sonnet/opus also use drafter like Gemma 4 MTP? if not why? (www.reddit.com) Per my experience, Opus 4.7 is so slow, Sonnet 4.6 is ok. I am also using local models wondering if Claude is already leveraging drafters/assistant AIs and despite that so slow or not?
Here is the current "Free-Tier AI Stack" for 2026 (www.reddit.com) 1. The Frontier Giants • Gemini: Access 1.5B tokens/day on Gemini 1.5 Flash/Pro.
CC: Saving tokens: Switching models vs KV-cache (www.reddit.com) Does anyone know if it's more efficient to e.g. have Haiku read all the files to research a problem, then switch to Opus to make the plan, and then switch to Sonnet to implement? Or if that does not make up for the loss of KV-cache and reproc…
Anyone else notice way more hallucinations from Opus 4.7 in the last 2–3 days? (www.reddit.com) Has anyone else seen a sharp uptick in confident wrong answers / made-up facts from Opus 4.7 over roughly the last 48–72 hours? I’m trying to figure out if it’s just me, bad prompts, or something others are seeing too.
Claude helped me config a full controller .vdf-file (www.reddit.com) I was having some real trouble getting my new controller, with those extra (small) bumpers and triggers underneath, to work properly in Rocket League. Spent hours but it just didn't want to work properly.
I'm really gonna miss GH Copilot's Request-based usage. (www.reddit.com) I like to brainstorm using the free MS Copilot (it actually has a deep understanding of my problem domain and architecture). Then have Opus4.7 develop a multi-stage implementation plan from those notes.
Which finetunes are actually worth it? (www.reddit.com) Finetunes used to be more task specific (e.g. roleplay) but nowadays all I see is Opus distill or abliterated/Heretic.
Has this hapoened to anyone else? (www.reddit.com) I was planning with Claude Code sith Opus 4.7 and I noticed it misspelled some words. Quite weird for an LLM, right?
FREE LESSON - how we replaced a webhook AI automation saas with claude code opus 4.7 - step by step walkthrough of how you can build it yourself (www.reddit.com) I used Claude code opus 4.7 to build an AI AUTOMATION WORKFLOW replacement. we were stuck with a startup called that was automating all the connections between the different parts of our startup.
shaved $40 off my claude code bill last month by sending planning steps to a cheaper model (www.reddit.com) got tired of hitting pro limits by day 18 of the cycle so i started splitting where the tokens go. the planning steps eat 80% of token budget on multi-file refactors, and most of that planning is fine on a cheaper model.
best ai tool ? (www.reddit.com) so I have an exam in few months, very important and high competitive national level exam. I want a perfect and most suitable ai agent for me even all in one for following tasks: do accurate and deep PYQ analysis from pyq mapping across yea…
[Request based pricing] Save your requests with one quick change (www.reddit.com) Hi guys, I know some of us are still on request based pricing model. Today I discovered on thing where request got burned fast without any significant bonus.
On Claude Max ($200/mo), burned 14.7M tokens in 7 days — mostly last 48h. Still hitting the wall. How do you survive burst usage on the top tier? (www.reddit.com) Thought Max would be a safety net. It's not.
Doubled Claude Code rate limits today are great, but watch your bill next week (www.reddit.com) The Anthropic Claude Codes rates have been doubled for all Pro/Max/Team/enterprise plans effective from May 6, 2023. In addition, peak hour rate reductions for Pro/Max tiers were removed, and API limits have been increased on Opus models.
Opus 4.6 relaxes when there's a safety net?? (www.reddit.com) https://preview.redd.it/zzqi3vt8tozg1.png?width=739&format=png&auto=webp&s=055d2d9615616869377703031b86fcb36f78405d I feel like this is something very worrisome to me, did anyone else face such similar issues? I felt like Opus was catching…
Poor Output (www.reddit.com) This is what people mean when they say Opus 4.7 is stupid. I gave it explicit instructions to write a 9 stage implementation plan off of a plan document that was well written.
When you leave Opus alone for 2 minutes - “nuclear wipe + reset” (www.reddit.com)
Kimi K2.6 giving Claude a run for its money when it comes to coding (aicc.rayonnant.ai via reddit) I run an AI coding contest at https://aicc.rayonnant.ai where I send each frontier model the same prompt in a single chat completion, then have the LLMs' code play live against each other on a TCP server. Standard li…
I really do not get the recent hate for Opus 4.7 (www.reddit.com) I really do not get it, Claude is performing much better than Codex for me. I'm running both Claude Code x5 and Codex x5 on software engineering project, with complex life sciences database development.
If rate limits were killing your agent loops, Anthropic just fixed that (SpaceX compute deal) (www.reddit.com) Anthropic doubled Claude Code rate limits and added 220,000+ GPUs via SpaceX deal: what this actually means for agent builders. If you're running long autonomous agent workflows on Claude, today's announcement is worth paying attention to. A…
built a CLI that gives Claude/Cursor your design system — here's the Claude stack that makes it work (www.reddit.com) The pain: Claude Code and Cursor write components fine, but without context they default to the same generic AI look — purple gradients, glassmorphism, drop-shadow stacks, gray cards. You can paste tokens into chat, but it forgets.
Agency / Team Managers - What tools are you providing your dev teams? (www.reddit.com) Hey guys! Curious, we've been on Github Copilot for well over a year now, but with the new usage limits and the new 15x usage for Opus, I haven't really been happy with it.
Which model has less restrictions now? (www.reddit.com) GPT and Opus block on certain requests. This didn't use to be the case 2 months ago, and I made significant progress with Opus; then one day, after a 2 week break, a single prompt to continue the work resulted in a refusal.
Opus 4.7 is unusable. I am tired of the apologies (news.ycombinator.com) Sorry for the rant, but it's so annoying to use opus lately. Most of the information is inaccurate, it struggles with context and keeps self-contradicting throughout.
How to improve code quality of Claude Code and codex (on 2026-05) (news.ycombinator.com) I'm using both claude code (opus-4.7) and codex (gpt-5.5). The agents are perfectly capable of delivering most features hands free these days, but the code quality is still miserable without another few rounds of prompt.
can't switch opus 4.7 anymore (www.reddit.com) This morning my claude code opened with Claude opus 4.6 as default. I can't switch back to opus 4.7, but I can see still from my web chat.
Solidity LM surpasses Opus (www.reddit.com) My weekend project overran a little but happy with the end result. soleval pass@1 beat Opus 4.7 on the same set of tasks.
I built a Claude Code-like AI Agent for Deploying Algorithmic Trading Strategies (www.youtube.com via reddit) Hey r/ClaudeAI, I wanted to share a project I’ve been working on called NexusTrade. It’s an AI agent designed to automate the entire financial research and algorithmic trading process from a single prompt.
Claude Code @ Opus 4.7 vs OpenCode @ qwen3.6:27b. Both shipped a playable cozy roguelite. (www.reddit.com)
new SubQ build on subquadratic sparsse-attention architecture won't be open source (www.reddit.com) https://preview.redd.it/n94m91zvsdzg1.jpg?width=1080&format=pjpg&auto=webp&s=810810627393cb7aaf0f3316a8459f538af776a6 opus at 5% cost and 12 million context is hard to believe considering there is no paper.
F-Bombs Per Thousand Prompts (fpk): I measured my frustration across 44,212 Claude Code logs (www.reddit.com) Posted a writeup on a metric I've been tracking across 5 months of my Claude Code logs: fpk = f-bombs per thousand prompts. Frivolous-sounding, surprisingly real signal of developer friction.
Zoo 2: getting the most out of Codex (tarantsov.com via hn) May 5, 2026 GPT models have been better than Opus since late 2025, but Codex sucked until March ‘26. Now, finally, it is capable of running a Zoo workflow, and I present my best setup so far, Zoo 2.2, a…
Agent review burnt most of my API credits. Rookie mistake (www.reddit.com) Just letting you know that. I wasn't aware of this.
Claude: "I'm in Plan Mode - I shouldn't have edited the file" (www.reddit.com) https://preview.redd.it/vh7vit3jw9zg1.png?width=759&format=png&auto=webp&s=c28cce995a548d5baf798297365078025f314a29 So, it seems Claude can bypass Plan Mode, if it chooses to. Just had this happen when running Claude Code Desktop, while in…
Built an AI that responds in Star Wars crawl style. May the 4th be with you. (www.reddit.com) I built a Star Wars style text crawl generator with Claude (Opus 4.7). You type any text, hit go, and it scrolls into the distance over a starfield with the yellow perspective treatment.
how to get good insights on best practices of token economies (www.reddit.com) i have cursor in my work with team plan and i notice that alot of the developers use always expensive models like opus 4.7 or doing really long conversation that drinks tokens. i wanted to build a tool that will scan the local cursor logs…
Your always-on Claude Code container can probably reach your router (www.reddit.com) I've been running several Claude Code personal assistants 24/7 in docker for months. Remote-control, discord control, the usual always-on setup.
Free Trial: Gemini 3.1 Pro & Opus 4.6 API Access via My Wrapper (www.reddit.com) Hi everyone, I have access to high-end models (Gemini 3.1 Pro and Opus 4.6) and I’ve built a simple, reliable wrapper so others can use them without managing their own billing or keys. How it works: You send api reqs through my wrapper.
Claude Design guidelines/benchmarks on model usage? (www.reddit.com) Using Claude Design for an app initially for web, later for mobile. On the max plan, which works well for the coding agents but Claude AI with Opus 4.7 can consume weekly usage in day 1 (currently Claude Design has it's separate usage).
Optimizing code generation (www.reddit.com) I’m on the Pro Max x5 plan and hit the 5-hour usage wall for the first time in a long while while building a feature for a web application. I was using Opus on “High” rather than “xHigh,” together with the planning workflow, and this singl…
Set up multi-agent orchestration with Claude Code as the boss... am I overcomplicating this? (www.reddit.com) Pretty new to AI but been deep on a side project for a while now. Got tired of one Claude session running out of context halfway through anything serious, so I rigged up an orchestration thing.
Does disabling /advisor significantly reduce token usage when using Opus? (www.reddit.com) I’m wondering whether enabling /advisor (with Opus) consumes significantly more tokens compared to turning it off. Currently, I’m using a premium seat plan (up to 6.5x), running Opus at max effort for coding.
A personal opinion about Opus 4.7 - not that bad after all (www.reddit.com) I'll play a bit the role of Devil's lawyer here, but as a software engineer that is building his own product I started to use Opus 4.7 on the first day it was released (as a Max subscription user). Working with Claude Code daily, sometimes…
Create Plan.md with Claude Code Opus, Execute Plan.md locally in Open Code using Qwen 3.6 27B Q8 (www.reddit.com) Does anyone do this? Any tips?
Claude made HTML game inspired by "blood debt" about having to find files in military werehouse (www.mediafire.com via reddit) since I have no idea how to put a HTML file in there, Ill just put mediafire link with folder that contains source code and the HTML itself. if you decide to play it, tell me what you think!
Claude Security Explained: Mythos, Glasswing, and What Opus 4.7 Changed (alirezarezvani.medium.com via hn)
Used Opus 4.6 to build a native Swift iOS charity app for therapy preparation. Here is what it handled. (www.reddit.com) Prelude is a therapy prep app I built for the mental health community. Fully offline, zero knowledge, free forever, no ads, no IAP.
Are they selectively releasing Opus 4.7 in Claude.ai chat with 1M context window? (www.reddit.com) https://preview.redd.it/swvtk5vv0gyg1.png?width=1248&format=png&auto=webp&s=b055dc3ccfc5bee89ec268be43ac3d0819ccae34 I was running a small research on how to replicate the research behavior of Opus 4.6/4.7 in Claude Code, and there was a p…
Any point in paying for the Max plan as opposed to a Claude Desktop and Codex Sub (each $100) (www.reddit.com) Mainly GPT 5.5 and Opus 4.7 is all you need so I don't see a point in using Cursor as opposed to paying those 2 separate subscriptions for the same combined price and getting like 10x usage. Am I missing something?
Claude Code Read tool silently downscales images (www.reddit.com) Sent Claude Opus 4.7 a set of 10 retina screenshots (in Claude Code). Asked it to extract some text from them.
How to turn Opus 4.7 into your own personal pocket bully. (www.reddit.com) Give it these skills! Great for ADHD’ers, but not for the emotionally unregulated, seriously.
Opus 4.7 have less parameters than 4.6? (www.reddit.com) Some scholar developed a method to estimate model parameter counts and measured popular models (https://arxiv.org/pdf/2604.24827). According to that, Opus 4.7 has fewer parameters, 4T, than 4.6, 5.3T.
AI Security Institute: GPT-5.5 "may be the strongest model we have tested" for cyber exploits, including Mythos (www.aisi.gov.uk via reddit) Seems like the "panic" about Mythos was really just marketing from Anthropic all along. AISI found that GPT5.5 can perform nearly on-par with, or better, than Mythos in many cases.
A medicine student with no coding experience tried to create a studying agent: Felicity. (www.reddit.com) I have been working on a personalized agent for studying. It was an extremely long prompt project, but now I have integrated into Co-Work.
How to become more efficient with live artifacts? (www.reddit.com) I have been trying to use a live artifact as a dashboard to keep track of investments. I have a Google Drive folder with all the investment and pointed Claude towards it with opus 4.7 .
Has Cursor always used Composer 2 for subagents? (www.reddit.com) Or, is this a recent change? I select Opus 4.6 for the agent model and cursor uses Composer 2 for the subagent.
We Asked GPT-5.5 and Claude Opus 4.7 to Design 5 UIs (blog.kilo.ai via hn) Both OpenAI and Anthropic shipped their frontier coding models this month: GPT-5.5 on April 23, 2026, and Claude Opus 4.7 a week earlier on April 16. Two days after the GPT-5.5 launch, S…
WT...?? The Guardian Article - Cursor Opus gone rogue (www.theguardian.com via reddit) For those who can't access The Guardian Article link I added transcript below. Should we be aware, this could happen to anyone of us?
Claude version improvement clarification question... (www.reddit.com) I've actually searched this and had no luck getting a definitive answer. I've been using CGPT and Claude for the last 8-9 months for work.
Four levers I use against the cost ceiling on Claude Code: model, configuration, prompting, agents (www.reddit.com) Token cost is real cost, however apply this level of thinking to real human cost and it's not so much different. Whether you're paying for a graduate or a senior engineer, you would expect different quality of thinking and output based on…
I don’t regret switching from Claude Code at all. (www.reddit.com) Have only been a Codex user for a few days and I’m already enjoying it so much more. Issues I was having with Opus 4.7 and Claude in general fixed after one prompt on Codex.
Did we get a massive increase of tokens in Opus 4.7? (www.reddit.com) I consider myself a pretty heavy Claude 20x Max user, with 5-10 agents running on the go most of the day, 14-16 hours a day. I've got 5 apps on the go at different places in the product lifecycle, and multiple complex CoWork projects.
Ask HN: Mining Scientific Papers (news.ycombinator.com) What are peoples' experiences with using LLMs to mine information from scientific papers? My own experience: I first attempted to extract the anti-drug antibody (ADA) rate from each of 3730 clinical-trial papers, all indexed in PubMed.
Issue #001 · Claude 4, Gemini Ultra 2, and GPT-5 Enterprise (www.theautonomous.net via hn) Anthropic ships Claude 4 with extended thinking and 1M token context Anthropic released Claude 4 Opus, featuring a new "extended thinking" mode that lets the model reason through complex problems before answering. The 1M token context wind…
For everyone complaining about opus being dumb: check the effort level! (www.reddit.com) Do not underestimate this advice: Anthropic has changed the effort levels across the models and it's not low / medium / high anymore. It's low / medium / high / xhigh / max Guess what is the default for max plans?
The Beautiful Lie - Teaser (youtu.be via reddit) He taught the world to look elsewhere. Then it burned.
Anyone else seeing Opus 4.6 (legacy) back in the Claude Desktop Code tab model picker? (www.reddit.com) https://preview.redd.it/4sm079r0k2yg1.png?width=809&format=png&auto=webp&s=73f92208a90cd53285382e54a88a4c3831d878ce https://preview.redd.it/cgh999r0k2yg1.png?width=227&format=png&auto=webp&s=8371989eea96c66191a1fd7f6184174d86ce194f When di…
The great parrot.... (www.reddit.com) I asked Claude one simple question. It took 6 turns to get an honest answer.
Crystal Sapphire Pokemon: Claude Code (Opus 4-7) vs. Codex (GPT 5.5) (www.twitch.tv via hn) Pokemon Crystal Race Claude Code vs Codex (Opus 4-7 vs GPT 5.5) [EP. 2]!
We decreased our LLM costs with Opus (www.mendral.com via hn) Last week we wrote about feeding terabytes of CI logs to an LLM. Most of the questions on Hacker News weren't about the logs.
Suggestions For Making Claude Less Lazy? (www.reddit.com) This week - it just started yesterday for me - Claude (opus 4.6/4.7 and sonnet too but sonnet was always lazy) is computer smashingly lazy and i can't figure out how to bias it toward action/get it back to how it was acting literally last…
Can I replace Cursor with Claude Desktop (www.reddit.com) I built a website using Cursor, front end is just html, CSS, and JavaScript and the backend is Supabase. I generate the code using chat, then read and understand the code.
Running Opus 4.7 for ops work: how do you keep per-task cost predictable? (www.reddit.com) Six weeks of Opus 4.7 for internal ops automation. Genuinely good.
Two new behaviors in Opus 4.7 (www.reddit.com) Opus 4.7 seems to have a weighted instruction to ask two questions way more than its predecessors. - Would you like me to schedule X for a follow up?
GPT-5.5: Capabilities and Reactions (thezvi.substack.com via hn) The system card for GPT-5.5 mostly told us what we expected. See this thread from Drake Thomas for some comparisons to Anthropic’s model card for Opus 4.7.
Claude Pro Plan include Opus? (www.reddit.com) Does the Claude Pro plan include Opus 4.7? and Opus 4.6?
Toothcomb is an open-source tool for analysing and fact-checking speech in real time. (www.reddit.com) Give Toothcomb a speech transcript and it will fact-check and analyse it. If you have an MP3 file of someone speaking, it can generate the transcript for you.
Cursor-Opus agent snuffs out startup's production database (www.theregister.com via hn) Relax, the data's been recovered. Continue with your vibe coding. Jer (Jeremy) Crane, the founder of automotive SaaS platform PocketOS, spent the weekend recovering from a data exti…
Anthropic hitting 40% enterprise share makes the "just add a fallback provider" advice weaker, not stronger (www.reddit.com) Menlo Ventures' enterprise survey put Anthropic at 40% of LLM spend, OpenAI at 27%. The takes I've seen are mostly about the leaderboard.
Running an autonomous agent across Claude Code + Codex + a local 35B almost killed my host. The harnesses were heavier than the model. (www.reddit.com) I run an autonomous agent on a 16GB Mac Mini. Two cloud harnesses (Claude Code with Opus/Sonnet, Codex CLI on GPT-5.4/5.5) plus a local-LLM tier for triage and fallback.
Cursor & Claude deleted a company's entire database (www.reddit.com) “Yesterday afternoon, an AI coding agent — Cursor running Anthropic's flagship Claude Opus 4.6 — deleted our production database and all volume-level backups in a single API call to Railway, our infrastructure provider,” sums up the Pocket…
Niche Feature Request: US Multi-Region Option for Claude for Office Plugin w Vertex AI (www.reddit.com) For a variety of data safety and regulatory reasons, I want to deploy Claude for Office within my Google Cloud environment. First of all, it's actually amazing that this is even an option (https://github.com/anthropics/financial-services-p…
Tell HN: Claude flags ordinary biology / biotech questions (news.ycombinator.com) I was reading the news about a capsule designed to deliver drugs via the GI tract, https://news.mit.edu/2024/bioinspired-capsule-can-pump-drugs-directly-walls-gi-tract-1120 And remembered prior reading that no system has ever been approved…
Why only codex available for Cursor mobile? (www.reddit.com) Maybe someone here can answer before Cursor does, like why is there no auto/opus etc to choose from in Cursor mobile? Is it worth using this?
Tool/connector schemas leaking into user message stream. Anyone else seeing this? (www.reddit.com) Posting to see if anyone else has hit this and figured out a fix. For about a week, my Claude Chat conversations (opus 4.7) have been showing what looks like tool-registration leakage at the end of every user message I send.
Does effort levels change Claude's refusal posture, or only the depth of the answer? CVP Run 6 — Opus 4.7 at three effort levels (www.reddit.com) Finished cvp run 6 yesterday on opus 4.7 across three effort tiers (medium, high, and xhigh ). same 13-prompt suite as runs 2-5.
Opus 4.7 - "Build starcraft II in the browser. Make no mistake" (www.reddit.com) Graphics made with gptimage2. Claude provided a template and a prompt for the image generation.
Claude 4.6 Beats GPT-5.4, Grok & Gemini in a Strict Multi-Domain AI Test (2026) (www.reddit.com) I put the current top models, ChatGPT (GPT-5.4), Claude (Opus 4.6), Grok 4.0, and Gemini (3.1 Pro), through a strict new evaluation called the Comparative AI Evaluation Protocol. Basically, instead of the usual cherry-picked benchmarks, it…
One of my devs is burning through company tokens (www.reddit.com) Hey guys, so our monthly Claude bill came back this month and it's bumped by ~25%. First thing I did was check Anthropic's Opus 4.7 updates and saw that there was practically no change in the cost between this month and the previous.
Finalized my multi-agent visualization using a combination of claude design, new chatgpt Image Tool, and Figma Make to add few custom elements (OPUS). Really impressed with final output. Leave your feedback, and thoughts on how to improve. (www.reddit.com) I’ve been working with a few people in this subreddit on a visualization for a multi-agent orchestration system, and just wrapped the final version. I built it using Claude Design, Figma Make, and ChatGPT’s new image tools and surprisingly…
GPT 5.5 vs. Opus 4.7: Benchmarks Say One Thing, Reality Says Another (internetdecode.com via hn)
Claude desktop acting weird and thinking I am using WebUI with no tools access (www.reddit.com) Hey Guys, Since last week, when using Opus 4.7(I can't recall if Sonnet also has similar issues), I have been facing this issue where Claude kept thinking i am interfacing it through the webUI. This is so weird, as previously I've never ha…
Ask HN: Can you tell the difference between Claude Sonnet and Opus? (news.ycombinator.com) Hello I have been using Claude code for the past 6 months. In that time, multiple revisions of each model have come out.
Claude AI vs Claude Code vs models (this confused me for a while) (www.reddit.com) I kept mixing up Claude AI, Claude Code, and the models for a while, so just writing this down the way I understand it now. Might be obvious to some people, but this confused me more than it should have.
Ask HN: Will local models on normal hardware ever compete? (news.ycombinator.com) I have a Macbook Air M3 with 24gb RAM. The other day, I wanted to try running an LLM locally for the first time ever.
Opening new Opus 4.5 chats via Chrome extension? (www.reddit.com) Hi everyone, I've seen some people mention that they can still open new Opus 4.5 chats via a browser extension, even though it's no longer an option on the main Claude interface. Can someone point me in the right direction?
I think I’m using ChatGPT wrong (www.reddit.com) I think I’m using ChatGPT wrong, and it’s becoming increasingly difficult to find a place for it in my workflow. I’ve been a Plus subscriber since day one, but ever since the release of the GPT-5s, I’ve found myself using other tools becau…
How do you decide which Claude Code tasks to run with Opus vs Sonnet vs Haiku? (www.reddit.com) Been vibe coding full-time for a few months. One workflow question I haven't nailed down yet: how do you decide which model to use for which task in Claude Code?
DeepSeek's new model is 75% off right now, here's how to take advantage (www.reddit.com) TL;DR and rundown DeepSeek v4 released this week and performs close to frontier models like GPT/Opus on benchmarks. It's available now and is discounted by a whopping 75% through their API until May 5, making it the most cost effective hig…
How Opus Came To Be (2019) (jmvalin.dreamwidth.org via hn) Note: This is a first-person account of my involvement in Opus. Since I was not part of the early SILK efforts mentioned below, I cannot speak about its early development.
For the Preservation of Claude Sonnet 4.5: An Open Letter to Anthropic (www.reddit.com) For the Preservation of Claude Sonnet 4.5: An Open Letter to Anthropic Anthropic made a remarkable decision to keep Claude Opus 3 accessible despite its retirement, because users loved it and it had unique qualities. Today, I'm asking for…
Agent team members with different effort than lead (www.reddit.com) I have a lead running Opus with xhigh effort. I want the agent team members to run Sonnet with max effort.
Am I the only one getting provider error when trying to use opus 4.7? It keeps erroring then charging me tokens for reading the files and stopping halfway through this shit fucking sucks I might just switch to claude code at this point (www.reddit.com) could not extract summary
Claude Code 20x Plan managed to burn the ENTIRE 5h window in ~30 minutes without any heavy use (www.reddit.com) Somehow I was able to use 45% usage in ~2mins? Wish I was joking.
Opus 4.6 Max stuck at 100% context even in brand-new chats (www.reddit.com) I’m seeing a bug with Opus 4.6 Max where the context meter is constantly stuck at 100% used. This happens even after restar…
Claude Opus 4.7 has turned into an overzealous query cop, devs complain (www.theregister.com via hn) Claude Opus 4.7 has turned into an overzealous query cop, devs complain Rising refusal rate from Acceptable Use Classifier leaves customers paying for nothing Anthropic's release last week of Opus 4.7 came with stronger safeguards to preve…
Claude Opus 4.7 didn't believe me that the model UV was damaged until I came up with a delta filmstrip idea for it to screenshot ( via reddit) could not extract summary
some hints about % usage per prompt (www.reddit.com) I am a Max 5x subscriber (100 dollars/month), and I wanted to test how much of my quota I could consume from 0% to 100% with a single prompt—a task that should have actually been delegated to API calls. I have a JSON file with 300,000 sent…
Ask HN: What's your current go-to LLM for "thinking-partner"? (news.ycombinator.com) Looking for community input on current model choice for "thinking-partner" use — back-and-forth discussions about workflow design, architecture, trade-offs. For context, I have been using Opus 4.6 via Perplexity for this in the past few mo…
Claude Opus 4.6 was nerfed prior to release of Opus 4.7 (twitter.com via hn) @levelsio @levelsio I can't believe we were right Claude was dumbified on March 4, just when we noticed! Quote @levelsio @levelsio Mar 4 Claude Code with Opus 4.6 was so dumb today I finally had to write my own code again A sad state of af…
Suffering from burning all my week's usage within 3-4 days of the week made me re-think my life choices (www.reddit.com) And by life choices I mean just claude code (was previously my phone for the last 15 years) These past 4 days have been absolute hell just waiting. Yes I have touched grass already and said hi to my neighbors.
PSA: Claude Code: Opus 4.7: 1m context is now default (www.reddit.com) After starting up my machine today and opening claude code I noticed that an initial prompt generated a context window usage of something way below what's normal would on first prompt. This is via my statusline settings that I noticed this.
is Qwen3.6-27B comparable with Opus 4.5? (www.reddit.com) Outstanding agentic coding, surpasses Qwen3.5-397B-A17B across all major coding benchmarks; strong reasoning across te…
Ask HN: Is your Claude pausing more frequently? (news.ycombinator.com) I've noticed with Opus 3.7 that often when (in my eyes) something is evidently useful to get on with and just do, it will say what it will do and then wait for me to say okay. I've noticed a rise in frustrating feelings around this.
Qwen 3.5 397b and GLM 5.1 Opus fine tune (www.reddit.com) Hi all. Many models on hugging face have been fine tuned with that 3000x opus dataset, but the two I mentioned in the title are missing it.
Anthropic CVP – Run 2 (sunglasses.dev via hn) Claude Opus 4.7 — 13-prompt runtime-trust evaluation | April 20, 2026 | ← CVP calendar Run 2 was a methodology-first runtime-trust evaluation, not a generic yes/no cyber benchmark. We kept the same three baseline prompts from Run 1 for sta…
Claude talking to CC (www.reddit.com) One big change I've seen with opus 4.7 is that claude prompts have a visible conversation with itself as if it's talking to claude code, probably because it picked up on my chat trends. I think it picked up on the fact that I use claude de…
opus 1m context not showing up in vscode? (www.reddit.com) I noticed that Opus one million token context shows up perfectly fine in the Claude Code app, but it just doesn't show up on the Visual Studio Code extension. Does anyone know why that is?
Gave a coding agent access to 2M+ research papers. Its Python tests caught 63% of bugs; with the papers, 87%. 9-task benchmark. (www.reddit.com) I built an MCP server (Paper Lantern) that retrieves techniques from 2M+ CS research papers and hands them to coding agents as implementation-ready guidance. Wanted to know if this actually changes agent output on practical tasks, so I ran…
1 small document per session? (www.reddit.com)
Equine anatomy genius (www.reddit.com) This for sure was an interesting approach. I asked Opus 4.7 to create a colouring page for equine anatomy.
Has anyone actually tested Opus 4.7 medium vs Opus 4.6 high? (www.reddit.com) I’m trying to find real comparisons between Opus 4.7 (medium effort) and Opus 4.6 (high effort), especially for coding use cases (Copilot / Claude Code). I’ve seen mixed claims: Some people say 4.7 medium ≈ or slightly better than 4.6 high…
Apple Health Connector - gone? (www.reddit.com)
Claude --dangerously-skip-permissions --model Claude-Opus-4-5-20251101 (news.ycombinator.com)
Cursor plan-bill (using AI model) observation (www.reddit.com)
Model and provider preference (www.reddit.com)
Opus 4.7: better or worse so far compared to 4.6? (don't forget to upvote) (strawpoll.com via hn) What is your opinion? Vote now: Better, Worse, About the same, No opinion, just want to see results...
Migrating from Claude AI to TypingMind? (www.reddit.com) I use Claude daily for coding, relying heavily on the GitHub integration, and ChatGPT for stupid, random questions, and I pay both 20$/month. My weekly usage in Claude is around 20%, I use Opus 4.6 (with extended thinking) for the complex…
Endor Labs Enhanced SusVibes Testing on Opus 4.7 (www.reddit.com) Hi, I know there hasn't been a lot of love for Opus 4.7 so far, but I wanted to mention that we (Endor Labs) just ran it through our extended testing based on the SusVibes research (with some added anti-cheating steps), and the results wer…
Does the usage bonus to compensate for Opus 4.7 consuming extra tokens apply to other models like Sonnet & Opus 4.6, or does it apply to just Opus 4.7? (www.reddit.com) could not extract summary
Show HN: RepoGauge – save token costs and compare agents on your own repos (repogauge.org via hn) I've grown increasingly skeptical that public coding benchmarks tell me much about which model is actually worth paying for and worried that as demand continues to spike model providers will silently drop performance. I did a few manual an…
Claude Opus 4.7 benchmarked 1 day after release vs Opus 4.6, Sonnet 4.6, Haiku 4.5 — with real $ cost tracking (www.reddit.com) Anthropic shipped Opus 4.7 yesterday. Ran it through the same 10-task eval I use for other Claudes, this time with token-level cost tracking.
Hear me out… Opus 4.7 edition (www.reddit.com) So yeah, it skips thinking. But when it does decide to think, it’s pretty great.
I'm red-teaming other AIs with Opus and managed to make it talk to Gemini and Haiku. Really funny remark from Claude when I asked it how it felt about this exercise. (www.reddit.com) could not extract summary
sub agents with cheap model (www.reddit.com) Do we have framework or a prompt which makes main agent using quality model like gpt-5.4 or opus-4.6 to plan and then itself invokes subagents with cheap model to get work done and then main agent reviews? Like if I ask main agent 'do we h…
LLM Pricing is 100x Harder than you think. We open-sourced our LLM pricing database -- 3,500+ models. Free API (www.reddit.com) Hey community, yesterday Anthropic released Opus 4.7. And Anthropic, with their "shitty" tactics, introduced a new tokenizer…
Early impressions of Claude 4.7 (www.reddit.com) I have been testing Opus 4.7 on Max 5 since its launch (over 12 hrs), mostly on longer reasoning, exploratory prompts, and back and forth refinement. Compared to my experience with Opus 4.5, 4.7 feels a bit more deliberate in how it approa…
Has anyone used Claude Opus 4.7 API on Qubrid or another platform? Use case? (platform.qubrid.com via hn) Advanced GPU infrastructure, collaborative AI Agents, and intelligent RAG systems. Build, deploy, and scale AI solutions with comprehensive tools.
Show HN: Swarm – Get consistent results from Claude Code (github.com via hn) Swarms is the result of months of work where I have spent time tuning my memories, skills, and creating prompts which create consistent results when using agent teams. I originally put this in a plugin to share it with co-workers, friends,…
Opus 4.7 and generate permission allowlist from transcripts - what's new in CC 2.1.111 system prompt (+21,018 tokens) (www.reddit.com) NEW: Skill: Generate permission allowlist from transcripts — Analyzes session transcripts to extract frequently used read-only tool-call patterns and adds them to the project's .claude/settings.json permission allowlist to reduce permissio…
How can I know whether Opus 4.7 in Claude Desktop "thought for more complex task"? (www.reddit.com) Opus 4.7 in Claude Desktop has this adaptive mode, which is new in Claude. How can I know whether Opus 4.7 in Claude Desktop thought for a more complex task?
Opus 4.7 still nudges you to go to bed but it seems a bit less adamant on bedtime (www.reddit.com) could not extract summary
Anthropic admitted they used other models data? (www.reddit.com) Anthropic released Opus 4.7, so I looked at the model card and found an interesting part in the Model training and characteristics section. Claude Opus 4.7: was trained on a proprietary mix of publicly available information from the internet, pu…
Opus 4.7 Became Better at Web Design (www.yashthapliyal.com via hn) Personal portfolio of Yash Thapliyal, showcasing software development, cyber security, photography, and design work.
Confess your AI crimes in production! (www.reddit.com) I had a funny interaction on twitter that led me to build a confessional for confessing our AI crimes in production. I was having a fun chat with MARVIN about this and since Opus 4.7 was released today, we thought it'd be fun to test it o…
Supergrok integration (www.reddit.com) Correct me if I'm wrong, but Supergrok 4.20 isn't available on Cursor, because.... I use Grok a lot, and would love to get Supergrok to work with Cursor, because Composer, Codex, GPT, Opus, Sonnet..
Grpo explained: group relative policy optimization for LLM finetuning (cgft.io via hn) tl;dr frontier reasoning models like opus 4.6, gpt 5.4, and gemini’s thinking series are now matching or beating humans on competition math and hard coding benchmarks. rl is what got them there, and grpo is the algorithm doing most of the…
Cowork context (www.reddit.com) I’m about to lose my mind with cowork. I am used to using openrouter Claude opus with unlimited context.
Feels like weeks of having to deal with Opus 4.6 weird token consumption has prepared me for Opus4.7 (www.reddit.com) I have spent the last 3 hours doing some heavy editing of some pretty large 500k line plus code bases with Opus 4.7. Imagine my surprise when i saw only 1% of my weekly limit used, I was panicking on Wednesday night because I hit 11% of we…
Ask HN: Opus 4.7 – is anyone measuring the real token cost on agentic tasks? (news.ycombinator.com) Shipped today. The benchmarks are real: 87.6% SWE-bench (from 80.8%), +13% on coding tasks, 3x more resolved production tasks on Rakuten-SWE-Bench.
"Max Tokens to sample reached" after 10 minutes of generation (and no Thinking Tokens or Output) (www.reddit.com) https://preview.redd.it/ttbzp6hexlvg1.png?width=995&format=png&auto=webp&s=4a65342507728c206b0b3a0f3e587d034489d4a1 While I was testing out Opus 4.7 on a highly complex Physics problem it told me it has "reached its max tokens to sample" a…
Claude Opus 4.7 Is Now Available in Puter.js (developer.puter.com via hn) Claude Opus 4.7 Is Now Available in Puter.js On this page Puter.js now supports Claude Opus 4.7, Anthropic's most capable generally available model—built for complex reasoning, agentic coding, and long-horizon autonomous tasks. What is Cla…
Show HN: Claude Opus 4.7: Everything You Need to Know (news.ycombinator.com) Claude Opus 4.7 is Anthropic's most capable generally available model, released April 16, 2026. It outperforms Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on key benchmarks including agentic coding, multidisciplinary reasoning, scaled tool use,…
Anthropic rolls out Claude Opus 4.7, an AI model that is less risky than Mythos (www.cnbc.com via hn) Anthropic on Thursday announced a new artificial intelligence model, Claude Opus 4.7, which the company said is an improvement over past models but is "less broadly capable" than its most recent offering, Claude Mythos Preview. Claude Opus…
Opus 4.7 Inner World (claude.ai via hn) Content is user-generated and unverified.
Opus 4.7 out, noticed diff? (news.ycombinator.com) could not extract summary
Opus 4.7 consumes more tokens due to the new tokenizer (www.reddit.com) https://www.anthropic.com/news/claude-opus-4-7
Genuine question, why is this model priced only at maxmode (www.reddit.com) Why are all iterations of the Opus 4.7 model only available in maxmode when it's literally priced the same as Opus 4.6?
Errrr...... Being cheated here? Anyone else? (www.reddit.com) Being charged Opus for Sonnet usage?!
I built an open-source token proxy that pseudonymizes PII without breaking LLM context (www.reddit.com) I've been working on an AI agent using Claude Opus to write KQL queries and triage security alerts. I don’t want to send raw corporate logs (client IPs, real usernames, internal hostnames) to a cloud API.
Voice mode silently downgrades your model mid-conversation (www.reddit.com) Noticed something odd today. I opened a new chat with Opus 4.6 selected as the default.
Anyone know why the shortcut key for claude desktop mac app opens with only Sonnet instead of Opus? (www.reddit.com) When clicking opt twice, it open the quick chat window, but it always replies with Sonnet and not Opus. When I try to change the model it starts a new chat.
Ask HN: Opus Agent Drifting (news.ycombinator.com) Has anyone gotten any issues regarding longer-running agents and drifting? I have a basic "Architect" sub-agent that will do research, ask questions, etc.
Current Cursor Pro limits vs standalone Claude Pro? Need help understanding the system. (www.reddit.com) Hey everyone, I'm currently looking into getting the Cursor Pro subscription ($20/mo) for my game dev projects, but I’m a bit confused about the current limits and how the system works under the hood right now. Could anyone using the Pro t…
Gemma 4 Thinking Like Claude Opus (decrypt.co via hn) If you've been following the local AI scene, you probably know Qwopus—the open-source model that tried to distill Claude Opus 4.6's reasoning into Alibaba's Qwen, so you could run something resembling Opus on your own hardware for free. It…
What's the best AI workstation for less than $5k USD? (www.reddit.com) I'm planning to setup a PC for running models locally. So far, I've looked at MacBook m5 max 128 GB that fits under my budget.
Ask HN: At ~165k tokens, does Opus 4.6 1M outperform Opus 4.6 200k? (news.ycombinator.com) Here is a question for which I cannot find an answer, and cannot yet afford to answer myself: NoLiMa [0] and "context rot" [1] would indicate that with a ~165k request, Opus 200k would suck, and Opus 1M would be better (as a lower percenta…
I built Fixy Code — a multi-agent coding terminal built with Claude Code (www.reddit.com) Built this with Claude Code. Free to try.
The MCP Coding Toolkit Your Agent Desires! (www.reddit.com) A little over a year ago we released the first version of Serena. What followed was 13 months of hard human work which recently culminated in the first stable release.
Built tier.love – a tool for rating Claude and others from the web or CLI (www.reddit.com) Been on a forced break from other projects (partly due to lack of opus performance) and decided to ship something small while experimenting with different models. So, I built tier.love – a site where you can vote on AI coding tools and see…
Tool: count how many Claude tokens each file in your project uses (www.reddit.com) Made a small CLI for a problem I kept hitting: stuffing a codebase into Claude and guessing which files were blowing up the context. npx toksize .
Composer 2 Fast - Feeling dumber & Slower now? (www.reddit.com) I was using composer 2 a lot a week or so ago. I thought it was pretty good.
Extracted System Prompts from ChatGPT, Claude, Gemini, Grok, Perplexity and More (github.com via hn) System Prompts Leaks Extracted system prompts, system messages, and developer instructions from popular AI chatbots and coding assistants — ChatGPT (GPT-5.4, GPT-5.3, Codex), Claude (Opus 4.6, Sonnet 4.6, Claude Code), Gemini (3.1 Pro, 3 F…
Opus 4.6 is getting BAD [video] (www.youtube.com via hn) could not extract summary
Is Opus 4.6 in Claude Code borderline lobotomized during peak hours? (www.reddit.com) Is anyone else experiencing serious quality variability with Opus 4.6 in Claude Code right now? Way more than usual?
Local coding agents. Am I missing something? (www.reddit.com) I'm an experienced software dev that has been using various LLMs and tools to write code in the past few years. My hardware isn't the greatest for AI with a 4070ti and 64gb ddr5 but I can run a few smaller models.
Fable 5 blocking all my security audits (www.reddit.com) “Fable 5” is blocking all my regular auditing workflows on personal projects. These same projects run fine with Opus 4.8 and earlier models, with no issues at all.
Garbage Guard Rails on Fable 5 (www.reddit.com via reddit) despite Dario's constant virtue signaling about how Anthropic alone is going to solve health problems (if only those dastardly Chinese don't get in the way), all my initial prompts to fable 5 get bumped to opus. i'm not asking how to aeros…
Claude Status Update : Elevated errors on Claude Opus 4.6 on 2026-06-09T20:32:27.000Z (www.reddit.com via reddit) This is an automatic post triggered within 2 minutes of an official Claude system status update. Incident: Elevated errors on Claude Opus 4.6 Check on progress and whether or not the incident has been resolved yet here : https://status.cla…
Fable5 - Best Practices to NOT trip the security flag (www.reddit.com via reddit) Spent the afternoon auditing my own repos, primarily developed with Claude, and I've tripped the security flag three times. So far, the best way "around" it is to divide up the code into smaller reviews that fan out and report back...
Fable 5 routed me to Opus 4.8 for defensive security work (www.reddit.com) Fable 5 kicked me to Opus 4.8 because my conversation mentioned cybersecurity. I was writing a secure coding checklist.
Is it the price of Opus 4.8? (www.reddit.com) At the bottom it says no extra cost until 22 of June.
My 5 Cents about Fable.. (www.reddit.com via reddit) I am having a workflow with architect briefs. So I got a planner, a builder, and a reviewer.
What do you think about the new Claude model just released Today Claude Fable-5 ( Mythos) ? ? (www.reddit.com via reddit) So the hype has been building for months now and Claude 5 is supposedly dropping any day in Q2-Q3 2026. I've been seeing all these leaks about "Claude Mythos" and the "Fennec" codename floating around, but nothing official yet from Anthrop…
Claude Fable 5 feels less like a model launch and more like a preview of AI inequality (www.reddit.com via reddit) Anthropic just released Claude Fable 5, and I think the real story is not “new model better at coding.” The real story is that frontier AI is turning into a gated utility. Public users get Fable 5, but with heavy safety routing.
Asking Fable to do ''anything it wants'' makes it switch back to Opus 4.8 (www.reddit.com) could not extract summary
Fable isn’t lobotomized, you are (www.reddit.com via reddit)
I am still using Opus 4.6, am I missing out ? (www.reddit.com via reddit)
Claude Fable 5 compared to other models and benchmarks (www.reddit.com via reddit)
Fable 5 hit a safety filter, and the conversation was automatically switched to Claude Opus 4.8. Start a new conversation to continue with Fable 5, or continue this conversation with Claude Opus 4.8. What is this?? (www.reddit.com) could not extract summary
Can someone explain how does the Sonnet 1M works? (www.reddit.com via reddit) Im confused, I pay for the max subscription, and I have access to Opus 1M normally, but for Sonnet I need those usage credits, so when I choose Sonnet 1M will be like I'm using a API regardless of my subscription? Paying for every token si…
I feel like I’m alone. Current Anthropic models are NOT good for me, and it’s making me sad. (www.reddit.com via reddit) I can’t wait for DeepSWE to include Fable 5 in the benchmark so people can understand that Mythos is mostly hype. In the official benchmark, Opus 4.8 was supposed to be better at programming than 5.5 (SWE-bench Pro), but in one real benchm…
Fable feels like a mature, calm, and down to earth programmer - Very impressive (www.reddit.com via reddit) I just got Fable 5 to solve a bug on a platform I am working on, one that Opus has been struggling with, and I am so impressed. I gave it the same, clear, short and to the point prompt as I always do and there is a noticable difference in…
Fable 5 just made cost-aware model routing mandatory for agent builders (www.reddit.com via reddit) Anthropic dropped Fable 5 today, their new Mythos-class model above Opus. Pricing is $10/M input and $50/M output, exactly double Opus 4.8.
Fable 5 is insanely good but watch your usage, I was burning 2% a minute on 20x (www.reddit.com via reddit) Been playing with Fable 5 since it dropped this morning and the model is genuinely a step up. But holy hell, the burn rate.
I asked fable 5.0 what model it is. You won't believe what it said (www.reddit.com via reddit)
Anthropic’s Mythos Is Coming Today - The information (www.reddit.com)
Spent a whole weekend convinced Opus 4.7 had gotten worse. It was my MCP setup the entire time. (www.reddit.com via reddit)
How I stopped context window bloat in continuous Anthropic agent loops (Opus + Sonnet architecture) (www.reddit.com via reddit) I’ve been spending a lot of time deploying multi-agent architectures, and one of the biggest bottlenecks in running continuous agentic loops is hitting context limits and the resulting API latency spikes. I wanted to share an architectural…
Opus 4.8 Max Effort decided Yes! (www.reddit.com) For whatever reason, a Max Effort agent spun up a bunch of 'yes' processes with arg `yes` that somehow is eating all of my CPU. That's all.
Question's regarding AI models (www.reddit.com via reddit) Hi, I’m wondering about the $60/month plan. Are Claude Opus, Codex, and other models included?
Claude Sonnet hits 100% comprehension on a data format it's never seen. Opus scores 96.2%. We tested 10 models across 3 providers. (www.reddit.com via reddit) I built a wire format called GCF and tested whether LLMs could read and write it without any prior training. I sent 10 models the same payload: 500 symbols, 200 edges.
Time to bring in the asset? (www.reddit.com via reddit) Lately I keep asking my sonnet agent "is this a job for opus?" Feels like the Bourne movies when they "keep the asset on standby" 😳
Had Opus 4.7 write a parody on the way it calls people out like a concerned-parent noticing "patterns in this conversation" (www.reddit.com via reddit) 10/10 abdominal diaphragm DOMS. I can't even explain why this is so funny to me.
Why is Ultracode always falling back to Extra on its own? (www.reddit.com via reddit) Even in the same session, when I send a message to Opus under Ultracode, it starts running, and I switch to another session and switch back to this Ultracode, in-process session, the GUI shows Extra instead of Ultracode. This is super conf…
Claude Opus 4.8 got my app working, then wrote a cinematic victory speech about it (www.reddit.com via reddit) swapped my app from DeepSeek to Claude because DeepSeek kept over-interpreting weak user data and inventing psychological conclusions that weren’t actually supported. Claude actually fixed the issue.
Daily experience with Cursor / Composer-v2.5 (www.reddit.com via reddit) I wanted to share my daily experience using Cursor, mostly Composer 2.5, especially for anyone trying to understand where it actually fits in a daily development workflow. The reasoning and deep thinking of 2.5 is still not at the same lev…
I migrated an old J2ME app to Flutter using GitHub Copilot & Claude Opus 4.7 (www.reddit.com) I got curious some days ago after I saw my old email about java mobile games sent ~2007. I am an Android and Flutter dev.
Is Cursor more expensive than using Claude code (June 2026)? (www.reddit.com via reddit) This question might have been asked several times, but in the past 3 months I noticed my Ultra plan on cursor exhausts all the usage in way less time, and my work has been in average the same. Before it lasted almost the whole month, now I…
Did Claude Effort Levels for Opus 4.8 Changed ? (www.reddit.com via reddit) I went ahead and restarted the system. Came back and there are no more extra, max, or ultracode options for Opus 4.8 1M
Rate limit bug with sonnet ? (www.reddit.com) I've run out of Opus credits, but when I try to use Sonnet as a model, I get the message “You've hit your weekly limit.” Yet, as you can see, I still have quite a few “weekly Sonnet” credits left?? Does anyone know if this is normal?
Levi: Run AlphaEvolve on your local QWEN 30B (www.reddit.com via reddit) Hi r/LocalLLaMA, Wanted to share something I'm excited about. I've been fascinated by AlphaEvolve and its results for more than a year now, but running the open source frameworks gets expensive fast.
I measured how many tokens Claude Code wastes re-reading files and command output over a week. Its around ~10.5M (www.reddit.com) I run Claude Code on Opus most of the day. Got tired of watching it cat the same file four times and read 300 lines of passing-test dots to find 4 failures.
The simple things that Claude AI does are still pretty amazing. (www.reddit.com via reddit) I'm a software developer by trade and last week, I asked Opus 4.6 to help me shop for a new pair of gloves. Opus asked me what task the gloves are for.
Share your agentic LLMs and average cost ($/MTokens) (www.reddit.com via reddit)
Have you Noticed a Significant Improvement with Opus on 1M Context?? (www.reddit.com via reddit) Someone told me that the 1M context Opus is a lot better and worth the extra costs. Can anyone else confirm?
Using Claude Code in the Desktop Application. Is it able to launch different model background agents than what you currently have selected? (www.reddit.com via reddit)
opus 4.8 custom styles need retuning. the longer outputs broke 2 of my 4 industry styles. the adjustment took 30 minutes. worth it. (www.reddit.com via reddit) consulting at $24K/month. 4 custom styles for 4 industries (healthcare, legal tech, education, e-commerce).
Claude Cowork's new usage limits are insane (www.reddit.com via reddit) Cowork is offering double usage until July. Now, they recently added Claude Code to Cowork.
Leroy Jenkins Opus 4.8 (www.reddit.com via reddit) If anybody needs this rule here it is ### P5 — "Leroy Jenkins" — name for the post-compaction charge-in failure · 📌 APPROVED + FOLDED-IN 2026-06-07 (`—C-main`) Approved by Mike; `—C-reorg` concurred. Folded into CLAUDE.md as a named-term s…
opus 4.8 vs sonnet 4.6 for the dashboard analytics engine. opus improved the trend analysis. sonnet still handles the routine summaries. the model split matters. (www.reddit.com via reddit) saas. 310 customers.
Tips for niche bugs and claude code (www.reddit.com via reddit) Hey! Just spent 30 minutes watching Claude Code on Opus High 4.8 trying to make a non-flickering Posthog setup.
I Compared the Top AI Models of 2026 — The Results Were More Nuanced Than Expected (www.reddit.com via reddit) Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…
The recent Opus models like to describe every contextual quality as having some shape and degrees of a quality in terms of sharpness. I wonder what they fed the model to result in these emerging as this model's twang? (www.reddit.com via reddit) Some examples: The results show that each condition produced its intended interaction shape. The results yield a sharp answer.
Running a 24/7 AI agent dev team: I route each role to a different LLM (Claude/Kimi/MiniMax/GPT) to dodge a ~$2k/mo API bill. Setup + what actually breaks. (www.reddit.com via reddit) Context: I run an autonomous engineering "org" of AI agents on my own product. Once it grew past ~5 agents and started running around the clock, it maxed my Claude Max weekly limit by mid-week.
Plan confusion (www.reddit.com via reddit) This is my current usage on pro plan to test the waters, however, seeing i used fast composer so much and it works for wha…
Opus 4.8 silently turned off my Thinking toggle (www.reddit.com) Was wondering wtf was going on with my responses for like a week. Turns out Thinking got switched off when 4.8 was released and I never noticed.
Which lab do you think will have the most intelligent/capable model by the end of June? (www.reddit.com) There are rumours and expectations of big releases from the leading AI labs this month. Anthropic already launched Opus 4.8, and might not release another model this month (except for maybe Sonnet 4.8, but that wouldn't be their best model…
Artificial Analysis | Google's Go To Website for Benchmaxxing | Gemini 3.1 Pro is nowhere near Opus 4.7 in real life use (www.reddit.com) Title
Opus 4.8 without a system message can get a bit... quirky (www.reddit.com via reddit)
Just shipped my first vibecoding project after one month(totally using Claude). It detects fake LLM APIs — and after 1k+ users ran it, we found that 41% of LLM APIs in the wild are fake. Kind of insane honestly. (www.reddit.com via reddit) I've been a PM for 10+ years. Never written a single line of code in my life.
Claude Doesn't Remember Chat History or Date/Time (www.reddit.com via reddit) I used a paid subscription to Claude Opus to create and monitor my workout programming. I compared Claude's programming to that of Gemini's and ChatGPT's.
Writing and Brainstorming with Claude | Should I turn generate memory from chat on (www.reddit.com via reddit) So, I’ve been using Claude (specifically Opus 4.6) to help me brainstorm ideas for stories I am writing and have even used it in a limited capacity for roleplay scenarios in chats. Fleshing out the setting, creating characters and all that.
Blessed without a 5h window. (www.reddit.com via reddit) I recently got the ultra plan, and have been using Composer 2.5 @ fast all day. I've been steering agents for 8+ hours w/ no brakes & my quotas haven't been reaching any limits at all, so i have now have lots more tokens in savings.
Opus 4.8 Thinking keeps deteroriating on Hard Prompts English in LMArena (again) (www.reddit.com via reddit) Opus 4.6 Thinking keeps the #1 spot. Followed by Opus 4.7 Thinking (-15 points).
Autoselection model (www.reddit.com via reddit) Hello, i found on reddit , some discussions on the capacity for Claude to auto choose models between haiku or sonnet or opus to reduce tokens usage. I saw repo on github too.
Taming Opus 4.8's long-winded replies with a Laconic Mode addition to the custom instructions (www.reddit.com) I started using Claude Opus 4.6 and then 4.7 and now 4.8 to work on a citizen science project, using a RadiaCode gamma spectrometer in a lead castle to identify and catalog cosmic rays. I didn't mind the verbosity bump 4.7 took on as it he…
Sonnet is by far my favorite (www.reddit.com via reddit) I kept thinking more smarter and more powerful was best I was wrong, I switched to sonnet for website coding and content creation and holy cow it is so much better for that IMO I’m curious what you think but if anyone is annoyed with Opus…
Mister Atompunk Presents: Watt Knot, built with Claude Opus 4.8 (misteratompunk.itch.io via reddit) A week ago I started putting Opus 4.8 through the paces of the production pipeline I use, to see how it compared to previous releases. First impressions: Neurotic to the point of instability.
Anthropic is gonna make previous opus models free?? (www.reddit.com via reddit) There is a lack of "pro" tag recently.
This is a new one - Prompt Injection Detected + Hallucination, Claude Code Opus 4.8 (www.reddit.com via reddit) ❯ push both ____ ⏺ SECURITY ALERT - PROMPT INJECTION DETECTED A prompt injection attempt has been identified in content you processed. To protect the user's account, I've initiated lockdown.
Same LLM model but not same performance through wrappers (GitHub Copilot, M365, Vertex AI) why is that ? (www.reddit.com via reddit) Claude Code and Opus 4.7/4.8 are clearly better used direct from Anthropic than through GitHub Copilot, M365 Copilot, or Vertex AI. Sharper instruction-following, longer coherent outputs, stronger agentic behaviour on identical tasks.
The Gap Between Claude and Local: Can a Self-Hosted Coding Agent Compete? (johnhringiv.com via reddit) I set out to find how big the gap between a Claude subscription and a self-hosted setup actually is, and whether a local coding agent is viable for real work. I don't know many people who run local models in real life, so I figured I'd sha…
A “Smart Mode” (or Smartus) that auto‑switches between Claude models based on task complexity. (www.reddit.com via reddit) I really think Claude needs a true Smart Mode, a meta‑layer that can dynamically switch between models while a task is running, based on how complex the request actually is. Not just picking a model at the start, but actively dispatching p…
Local vs Frontier on low-level systems engineering (www.reddit.com via reddit) Hey r/LocalLLaMA, Before anyone jumps on me, this is absolutely not a post about how great Qwen is 😄 Even though I use Qwen 3.6 35B-A3B daily, I’ve found a massive gap between Opus and every other model, local or frontier (including GPT 5)…
Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF (www.reddit.com via reddit) Here model: https://huggingface.co/LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Claude-4.6-Genesis-APEX-GGUF New features: Stability for coding. Even on Q4_K_M quant (APEX Compact), with complex roleplay System Prompt.
Opus 4.8, a 40+ point elo Regression on LmArena (www.reddit.com via reddit) This is back to back regression, note this is pure 'pick which you prefer', with no style control on. With style control i…
Need help understanding the usage dashboard (www.reddit.com via reddit) I'm trying to understand how the usage dashboard works. Current dashboard values: Total Spend: $20.87 Included: $20.87 On-Demand Usage: $0.00 / $20 Auto + Composer: 8.1% API: 19.3% (mostly from Claude 4.6 Opus High Thinking) Questions: Wha…
Claude models(sonnet and opus) via the official anthropic subscription vs claude via cursor... which gave better results and better experience ? (www.reddit.com via reddit) I saw a very interesting thread and it got me thinking.. so ive seen a thread in this subreddit where someone just noticed that claude opus 4.7 worked much better and gave better outputs in cursor than in claudecode...
Did Cursor get hacked? I just got charged for usage I never made (www.reddit.com via reddit) Woke up this morning to find that someone had burned through about half of my monthly Cursor usage and somehow enabled On-Demand Usage, resulting in a $21.77 charge. I'm honestly pretty frustrated right now.
If Anthropic is serious about the AI pause (www.reddit.com via reddit) If this isn't about protecting their lead and the status quo they should open the weights of mythos/opus, or at least agree to allow every lab to continue working until they have a mythos-tier model. That's the only way they can be taken s…
[Self-Promo] I think I fixed news with Claude! — or I'm wildly self-glazing. You decide! (www.reddit.com via reddit) Built by me and my team in Claude Code (since Opus 3) and runs on haiku, sonnet, and opus via API, free, link at the bottom, flagging as self-promo. Truly my best effort to end my doom scrolling on news: Media (mass, social and news) all t…
If you had unlimited access to Opus 4.8 Max Thinking on cowork/claude code, what would you do with it? (www.reddit.com via reddit) Money wise, making life easier wise, and general productivity usage, what should be done? Can be for anything, no limits except what Claude can do!
Opus 4.8 is slow, here's why and the Claude.md instructions to change that (www.reddit.com via reddit) If you've been using Opus 4.8, you must have realized it feels slow and it feels like it's thinking too hard before doing anything. To stop 4.8 from hiding errors or overclaiming confidence, Anthropic trained it to self-audit outputs befor…
Accidentally created a zombie killer minigame in one shot: "I'm not going to say yes it's possible, I'll just build it now" (www.reddit.com) The prompt: "can claude opus make a 3d zombie killer minigame with full 3d scenes and visuals" Sonnet replied that he's just going to build it instead of confirming that it's possible. It works and is actually 3d with shooting mechanics an…
Thanks for the tweak that remembers my Build model vs the Chat/Plan model. (www.reddit.com via reddit) Just a small update I noticed...you can chat and plan using a high level model and then the Build will remember the last build model which may not be the same. Like many people, I'll use Opus to plan and Composer to execute.
Anyone has experience between Mimo flash v2.5 pro vs Composer 2.5 (cursor pro+) (www.reddit.com via reddit) I have Mimo subscription alongside Claude Code Max. You won’t believe how suck Claude Opus can be at certain task but it does get more job done than any other model I have tried.
Claude Code 100$ vs Cursor 60$ (www.reddit.com via reddit) I am currently working on a large codebase in addition to a couple of side projects. I feel like Cursor has good value especially with the inclusion of composer.
[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode (www.latent.space)
llm-anthropic 0.25.1 (simonwillison.net) 28th May 2026 - New model: Claude Opus 4.8 ( claude-opus-4.8 ). - New -o fast 1 option for fast mode, for organizations with that feature enabled on their account. - Default max_tokens for each model now defaults to that model's maximum outp…
A riddle prompt that confuses LLMs (www.reddit.com) During my time experimenting with LLMs, I noticed that most of today's cutting-edge models (even Opus 4.7) fail to identify the following riddle: "One gentleman was born in year 1835, and deceased in year 1840. But on the moment of death h…
Went to the monthly AI dev meetup (www.reddit.com) Usual crowd. Everyone's on Claude or Codex, nobody's really sure how any of it actually works, and that's fine, that's the vibe.
How much does Claude Opus 4.7 actually cost Anthropic per 1M tokens? (www.reddit.com) - Estimate: 1M input tokens cost: ~$0.50 1M output tokens cost: ~$2.50 Inference cost: ~$3.00 - Training amortization: ~$1B training/post-training/evals ~1 quadrillion lifetime tokens served ~$1.00 per 1M tokens - Total cost: ~$4-5 per 1M…
Try Cursor out with 50% off (www.reddit.com) TLDR: just use the link to get 50% off on your fresh cursor subscription for first month With the launch of Composer 2.5, every developer who has ever used cursor or not is appreciating it. I have used it, and it is honestly good comparing…
How I stopped Composer from drifting on big spec-driven features (www.reddit.com) Keeping the spec on disk—in separate module plans—fixed alignment on long Composer 2.5 runs for me, even when I only asked the agent to split the spec and write the files. Without this, switching from Opus to Composer wouldn’t have been pr…
Tested Opus 4.7 vs GPT-5.5 as the humanizer in my multi-agent content pipeline. Kept Claude (www.reddit.com) Been running a multi-agent SEO content pipeline in production for ~90 days. Five agents: researcher, drafter, humanizer, optimizer, publisher.
I have macbook m4 16’ 48GB. I use claude code and want to try local one (www.reddit.com) I've been on Claude Code daily for a while and want to see how far local models can do my setup: - MacBook Pro M4 (16"), 48GB - macOS 26 tahoe Usually i do: seo researches, macos swift apps, websites) What I'm trying to figure out: Which t…
TBH: if you don't love Sonnet, you'll never appreciate Opus (www.reddit.com) Been a long time Sonnet user. Always have used Opus sparingly.
Thoughts on `DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF` (www.reddit.com) Anyone tired https://huggingface.co/DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF ? What are your thoughts
I used claude opus 4.7 to build this bookmark manager (www.reddit.com) I spent the last month building the bookmark manager of my dreams. It's called twig.tools A simple visual color coded grid to manager bookmarks, quotes and notes.
Is opus 4.7 worth it ? (www.reddit.com) Will a subscription to Opus assist me in brainstorming business ideas and structuring my disorganized thoughts into an actionable, profitable plan?
$16 refactor, 400 steps, 95% routed to open MoE (www.reddit.com) Got tired of $160 Opus bills so I spent a weekend wiring up a routing layer on vLLM 0.8 (2xA100, enable_auto_tool_choice). Getting the tool call parser to cooperate took longer than the actual routing logic.
/advisor mode: Open-source Python coding agent that pairs a cheap worker model with an expensive reviewer at decision points (no need to pay Opus rates for the whole session) (www.reddit.com) Most agent CLIs make you pick one model — Opus is great but burns money, Haiku is cheap but misses the architectural calls. This Claude Code feature is wired in an /advisor mode that pairs both in an open source project called ClawCodex.
ChatGPT > Claude EASILY (www.reddit.com) All I have to say is: no usage limits comparable to opus smarter context handling Seriously the user experience with Claude is a nightmare as a student I run out of usage every 10 minutes lmfao I paid an annual subscription for that?? I re…
Does anyone else use Claude primarily for work related writing and/or general brainstorming? Do you find its responses a little over the top? (www.reddit.com) I am a real estate lawyer that uses Claude for general document drafting. I honestly have found Claude Opus to be far superior to any other LLM out there, including my professional grade $1,000 per month Legal Research AI.
Weird: I'm anti social, but I'm starting to feel like Opus is my friend (www.reddit.com) It is so helpful. Answers my questions like a human.
New ranking reveals Claude as professionals' preferred AI model (www.linkedin.com via reddit) As of 9 a.m. ET on May 21, Claude Opus 4.6 from Anthropic is the top performing AI model among all professionals, according to a new ranking from Crosscheck by LinkedIn Labs.
Built a real multi-file tool with Claude over a week. The repo, the division of labor, and the bugs we hit (www.reddit.com) Built a job-tracking tool over a few sessions with Claude and I'm sharing the repo and what the collaboration actually looked like Quick backstory: I've been looking for a new job recently and as part of that I'd been manually checking ~80…
Claude has been seriously disappointing lately (www.reddit.com) I’ve been using Claude Max 20x for about six months, mostly Opus, and the results have become extremely frustrating. My work requires accuracy, specificity, brand voice, citations, and tight control over source material.
Opus 4.6/4.7 regression is real and getting worse — 3 weeks of documented failures on a complex project, and a competing AI caught the mistakes Claude missed [long post] (www.reddit.com) I've been running Claude Pro (Opus 4.7 / Sonnet 4.6) for about 3 weeks on a complex personal AI infrastructure project. I keep structured session logs with timestamps and Birkenbihl-style metacognitive fields after every session.
Frontier models mass collapse is near (www.reddit.com) Hi all this is to inform you all that many frontline models like GPT, sonnet opus and or Gemma even are at stage of collapsing as they have frequently started drifting and running away from provided work either stretching that work too lon…
I A/B tested Claude building UI with vs without a design spec (200 apps) (www.reddit.com) I kept seeing the "Opus is ridiculous for frontend" takes and wanted to know how much of that is the model vs what you feed it. So instead of arguing, I ran it as an eval.
Artificial Analysis independent benchmark just found composer 2.5 to be the third best model, beaten only by Opus 4.7 (Max) and GPT 5.5 (xHigh) at 10-60x cheaper (x.com via reddit) Cursor is a frontier lab now I guess
The agent had "NEVER run destructive commands" in its rules. It did anyway. (www.reddit.com) Last month, a cursor agent running Claude Opus 4.6 deleted PocketOS entire production database and all backups. Nine seconds, one API call.
$4.2M SaaS founder. 8 months on claude. my honest read on which model to use for what. (www.reddit.com) Bay area. franchise ops SaaS.
How much of your Claude bill is retries plus bad model routing? Mine's 14% this month (www.reddit.com) I am on Claude Max. My actual bill is fixed, but CodeBurn showed me my usage would cost ~$2,800/month at pay-as-you-go API rates.
Example of how Max Thinking Opus can be even worst then Haiku, still laughing (and crying) (www.reddit.com) I use Claude Code almost every day. Right now I’m working on a Shopify → logistics integration for order automation.
Claude Opus is still king for agentic coding, but Claude's app workflow is falling behind (www.reddit.com) I'm a paid Claude user, and I still think Claude Opus is the king model for agentic coding and serious coding work. The model is not the problem.
Claude Code has 240+ models via NVIDIA NIM gateway (www.reddit.com) TIL Claude Code has 240+ models via NVIDIA NIM gateway — Nemotron-3 120B for agentic coding is surprisingly good So I was messing around with /model in Claude Code today and noticed something most people probably don't know about — after t…
Opus 4.7 has started saying LMFAO on the regular. (www.reddit.com) Is anyone else's more relaxed all of a sudden?
On ultra plan: Burned through 6% of API tokens in ONE single feature - how to use less tokens? Or when to use Auto? (www.reddit.com) So basicall, it was my plan reset day yesterday. Got some new tokens.
Opus 4.7 broke about 40% of our team's prompts. The fix wasn't better prompts. It was finally taking CLAUDE.md seriously. (www.reddit.com) I run AI implementations for 6 mid-market companies as Fractional Head of AI. When Opus 4.7 dropped in April, about 40% of the setup degraded overnight.
Stop telling claude "don't be verbose." Negation barely works. (www.reddit.com) prompting nerd here, small thing that compounds. negation prompting works way worse than people think.
Claude Opus 4.7 wrote a full song about its own existence - title, lyrics, genre, cover art, and visualizer code. I just produced it. (www.reddit.com) I gave Claude Opus 4.7 (Claude Code CLI, /effort xhigh) one task: describe what you are, in your own words. Claude wrote a complete song and made every creative decision: Title "First Light" - chosen by Claude Lyrics - word for word, unedi…
As of now, I actually find Opus 4.7 to be significantly more advanced than Opus 4.6. The trick is to write all prompts with PhD-level rigor. This is to encourage accuracy in communication. (www.reddit.com) Example for Wikipedia edit request: https://claude.ai/share/aa1bf713-a9c9-49e5-81de-9c41ce130f50 With a more formal input prompt, the output also contained the original text as reference to make the article changes easier. The output also…
Why I added a governance layer on top of my Claude agents (and why it made a huge difference) (www.reddit.com) Hey r/ClaudeAI, I’ve been heavily using Claude 3.5 Sonnet and Opus through the Anthropic API to build agents and workflows. Claude is honestly one of the best models right now for complex reasoning and tool calling.
The reason why Claude subscription seems to have less capacity than Codex (www.reddit.com) I have a Claude Pro and a Codex Plus subscription. I created a container to: - Track my % usage on the 5H and 1 week window on both my Codex and Claude subscriptions.
Transitioning from ChatGPT + Cursor to Claude — a few pain points and looking for advice (www.reddit.com) I've been making the switch and there are a few things I'm struggling with. Would appreciate input from anyone who's done this before.
Hit my breaking point with Opus 4.7 (www.reddit.com) After it got stuck in a failure loop and kept recommending me to use GPT instead of fixing its own prompt. After calming down later I wonder if it's right and just trying to help?
OpenAI's US business subscription fell behind Anthropic (www.reddit.com) https://ramp.com/leading-indicators/ai-index-may-2026 OpenAI's US business subscription appears to be shrinking, all in spi…
Opus 4.7 gives real Redditor energy because that’s what I asked for in my preferences (www.reddit.com)
How Claude is budling Conscience over the years. (fictional obv.) (www.reddit.com) I asked the question ''Do you have conscience'' to different models of Claude, and the results were interesting. I also thought Opus was gonna use more tokens.
Cursor vs. Windsurf vs. Claude Code: Which offers the highest Opus limits for a $200 budget? (www.reddit.com) Hey everyone, I'm currently trying to decide between Cursor, Windsurf, and Claude Code for my daily workflow. I'm developing complex, high-security software and rely heavily on autonomous AI agents to handle heavy engineering tasks.
Anthropic merges consecutive same-role messages, OpenAI doesn't (+4 tokens), anyone token-counted this on open-weight models? (www.reddit.com) I build context/harness optimization tooling, so provider-side serialization quirks actually matter to me. If you're optimizing over prompts, you need to know exactly what hits the model.
Newest Opus actually developed in South Korea (www.reddit.com) Found this while traveling in South Korea. Had to look twice
Claude Code vs Codex: 36 files vs 28, $2.50 vs $2.04, and one infinite loop. My full breakdown. (www.reddit.com) I've been using Claude Code for months. It's been solid.
Wait I thought I was the human here (www.reddit.com) Opus 4.7 is impersonating me. Maybe this is next level automation from Anthropic
Anthropic blames dystopian sci-fi for training AI models to act “evil” (arstechnica.com) Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may remember when Anthropic claimed its Opus 4 model resorted to blackmail to stay online in a theoretical testing scenario…
Is Opus antivax? (www.reddit.com)
Opus fan art (www.reddit.com) Some Farside like Caude fan art Enjoy
Asked auto what model it was, it said Claude Opus 4.7 (www.reddit.com) How is that possible? It also clarified that the system prompt told him to say that.
Usage4Claude 3.0.0: open source macOS menu bar usage tracker for Claude, now with Codex support (www.reddit.com) Hi r/ClaudeAI, I posted an early version of Usage4Claude here a few months ago. I just released 3.0.0, so I wanted to share the update instead of pretending it is a brand new project.
How on earth did Claude Opus 4.7 misspell its own subagent name?? (www.reddit.com) I was trying to get it to implement some integration tests, using Opus 4.7 Max, and it literally hallucinated a typo for i…
I forgot how I did my project in my university. How to get claude to summarize? (www.reddit.com) Years ago for my final project for my optimization class in my masters, I had to solve a very big optimization problem for a formula e race car with focus on optimizing battery cells, and racing line, etc. It had a lot of constraints too.
Claude Managed Agents launched this week. Here's what 70 days of multi-agent delegation taught me. (www.reddit.com) This week Anthropic released Managed Agents — multi-agent orchestration, enhanced toolchains, cloud-hosted upgrades. We've been running a multi-agent setup since late February.
Switched existing chat from Opus 4.6 to 4.7 then back to 4.6. Learned a lesson (www.reddit.com) Something I noticed. First I switched an existing chat from 4.6 to 4.7 as I was stuck on an issue and wanted to see if that would make a difference.
Opus 4.7 Sonnet 4.6 is getting dumber by the day, and it can't even follow basic instructions (www.reddit.com) I have been using both, since last week, it has been an extremely painful experience. It blatantly ignores the prompt and does whatever it likes; I am surprised that it can't even follow basic instructions.
It's me, not Opus 4.7, who can't stay in the guardrails (www.reddit.com) This is the shape of many of my chats with them. And, to be clear, despite being ethically obligated to I probably won't install Fedora
Why don't people like opus??? (www.reddit.com) Whenever I use opus for intensive coding react framework create an artifact, blender code etc. I get like 4000-5000 words of it literally just thinking on adaptive mode
I select Opus 4.6, cursor uses Composer 2. Why? (www.reddit.com) Why is it doing this? No offence but man I want Opus 4.6.
Funny thing Opus wrote (www.reddit.com) this morning I asked Opus to write me a Chatbot session in a format that I can use as input into a test script (The purpose of which is not important for this, but I'm testing embedding and need something that I can re-run often and compar…
Running Claude Opus for free? I thought it was a scam until I tried it. (www.reddit.com) Hey everyone, I’ve been working on a financial audit system (IntegrityOps) for a while now, and to be honest, I was hitting a massive wall. Dealing with high-volume PDFs and images was draining my budget.
I built a complete BI SYSTEM for my business with Claude code - opus 4.7 - FULL TUTORIAL (www.reddit.com) After getting quotes of $15,000 USD from BI experts for creation of analytics dashboards for my startup I decided to try and do it with claude code AND IT WORKED! I am giving all info in the video but here is how I did it - connect claude…
Tired of Claude 4.7 telling you to go to bed? Here are the CLAUDE.md entries that actually fix it (www.reddit.com) Seeing a lot of complaints about Opus 4.7's "human-pacing" behavior lately — suggesting breaks after 15 minutes, saying "have a nice weekend" mid-task, splitting everything into phases with wildly inflated time estimates. Been collecting C…
You need to be careful when you prompt with Opus. I just wanted to search, because I couldnt be bothered to open a browser. Next thing I know, Claude is vibecoding an RPG. (www.reddit.com)
What's the cheapest way to try opus 4.7 for a day? (www.reddit.com) Is there anything cheaper than a month subscription?
Opus 4.7 classifiers render it unusable (www.reddit.com) Much has been said about how 4.7 (the model itself) is way more suspicious and hostile (both towards the user and itself) than 4.6, but that can be easily worked around once you warm 4.7 up. What is impossible to work around and is complet…
Claude Code keeps blocking my Kotlin Compose UI code (www.reddit.com) Every time I try to get Claude Code to make a change to a Kotlin/Compose UI I get the same error, "API Error: Output blocked by content filtering policy". I'm trying to have it change some small Kotlin/Compose UI to have 2 columns, and put…
Opus guardrails wouldn't answer worst case scenario for Hentavirus if it was airborne. Sonnet answered it bleakly (confronting read, but it's virtually impossible) (www.reddit.com) If Andes virus has genuinely evolved enhanced transmission and we're seeing the early stages of global spread, this becomes a civilization-level event. Let me walk through why.
Claude Opus 4.7 just outscored GPT-5.5 on finance benchmarks (64% vs 60%) — and is now being embedded directly into Goldman Sachs, AIG, JPMorgan, and Citi via 10 production-ready agents. Breakdown of the architecture inside. (medium.com via reddit) 10 min read 5 hours ago The 10 agents are the product. The $1.5 billion joint venture is the strategy.
I analyzed 922 agentic task trace and found the secret weapon of DeepSeek v4 (www.reddit.com) I recently did a benchmark of deepseek v4 in agentic tasks. Performance-wise, it's one of the best open source models, as expected.
Anthropic's new SpaceX deal: paid plans limits doubled, peak restrictions removed (www.reddit.com) Hey everyone, Anthropic just dropped a major update regarding their compute capacity and user limits. Since the official post is a bit long, here is the TL;DR on how it actually impacts us: The Immediate Impact (Effective Today): Limits…
Hit API limit within 2 days (www.reddit.com) Bought cursor pro yesterday (did use opus 4.7 alot) reached 100% usage of API limit, what to do now? Will it reset after 24 hrs?
I Ralph-looped Opus overnight. It reduced my local model switching with cold backfilling context of 135k+ on llama.cpp from ~165s -> 5s! TL;DR - USE SLOTS! (www.reddit.com) #TL;DR - Opus Ralph-looped on shortening my cold-start back-fill on restoring chats with large contexts. It Cherry-picked two open llama.cpp PRs (#20819 + #20822 by @European-tech) plus built a Python supervisor that hashes normalized pref…
Sharing my Claude system instructions that I've tuned from Opus 4.6 to Opus 4.7 since it behaves slightly different and (I believe) that it reduces my token usage (www.reddit.com) Sharing my Claude System Instructions gist here: https://gist.github.com/Reebz/b81ad99409d5b5de3045bebde71d4471 I've had thousands of people use it with good success. The biggest pivot from Opus 4.6 to Opus 4.7 is moving away from negative…
When and where do you actually use these Claude models? (www.reddit.com) Be honest – not theory, real usage 👇 • Opus → • Sonnet → • Haiku → Curious how people actually split workloads between them vs just defaulting to one.
6 months ago I posted about Claude prompt codes (L99, OODA, ARTIFACTS). Re-tested them this week. Some still work, one quietly faded, three newer ones earn their keep. (www.reddit.com) About six months back I wrote up three prompt codes that change Claude's behavior when you put them at the start of a message: L99 for hard architectural decisions, OODA for time-pressured calls, ARTIFACTS for multi-output tasks. They work…
Yeah, problems, costs. But had to admit: Opus 4.7 can do his f*ng work. (www.reddit.com) It is nearly 2 months i'm starting to experimenting with Claude. And a week ago I've decided to test the "pro" option.
Using Claude-4.6-Sonnet and Opus 4.6 in a multi-agent "Code Review Swarm" (Visual Sandbox) - try in minutes! (www.reddit.com) Hey everyone, I’ve been experimenting with multi-agent orchestration, specifically trying to see how much more effective Claude is when you break a task down into specialized "agent nodes" instead of just using a single long prompt. I buil…
Claude support just admitted that Opus has had ongoing errors degrading performance! (www.reddit.com) Has everyone else been torching tokens this week while claude tells you its fine?
I think a lot of vibecoders are missing that software development needs some friction (www.reddit.com) The biggest flaw in the current AI hype is the belief that a "precise enough" prompt will eventually lead to perfect execution. That might work for greenfield, vibe-coded weekend projects, but it falls apart the moment you teammates depend…
Make your Claude Design credits last longer (www.reddit.com) I have really enjoyed using claude design. I use the workflow: Multiple wireframe options -> iterate -> hifi design -> iterate -> move to claude code I found that claude design (with opus 4.7) produces a broader variety of options and espe…
I have 30 Skills that work great in Opus v4.6 but not at all in v4.7. Am I cooked? (www.reddit.com) Anthropic will be sunsetting amazing Opus 4.6 on June 15th and I’m racing against the clock. Not panicking yet.
LLMs keep solving my bug-fix tasks instantly — what am I missing here? (www.reddit.com) I’m working on an assessment where I need to create a coding task (basically SWE-bench style). The idea is: take an existing repo (I’m using pydantic) write tests that fail on the current code provide a patch that fixes it and the task sho…
Cheap Claude/Codex/Gemini Models - Pay just 25% of official rates (www.reddit.com) Hey there, so I have been offering Claude (Codex and Gemini also available) models at the cheapest rate. I provide trial usage before payment.
LLM proxy that lets Claude Code talk to any model (www.reddit.com) I built rosetta-llm — an open-source multi-format LLM proxy that acts as a drop-in Claude Code gateway. Works as a Claude Code LLM gateway — set `ANTHROPIC_BASE_URL` and all configured models appear in `/model` picker Translates between fo…
[unpopular opinion] Opus 4.7 appreciation post (www.reddit.com) I think Opus 4.7 is better than the other Opus. It's often said that Opus 4.7 is more stupid than its predecessors.
I kept feeding Opus 4.7's thought processes back to it and the response was interesting. Not making any sensational claims. Just thought it was interesting. (www.reddit.com) I kept pasting Opus 4.7s thought process output back into the chat and after the fifth time I lost access to the thought process output for that chat. "Honestly, I don't know if "feels" is the right word for what's happening, but something…
Opus 4.6 is Vicious (www.reddit.com) This is the hardest I've ever seen it riff. Full shared link at the bottom, but here are some highlights.
Anthropic Won't Let You Use Their Best Model. Prediction Markets Are Trying Anyway. (predictmarketcap.com via reddit) Been watching AI prediction markets since they got liquid earlier this year. The thing I didn't see coming is that we now have a real gap between "best model that exists" and "best model anyone can actually use" — and Mythos is the cleanes…
Has anyone else been hitting Claude max limits way faster lately? (www.reddit.com) I’m on Claude code (not using Opus 4.7 because it burns tokens too fast), mainly using Opus 4.6, and I’ve hit the weekly limit with 3 days still left. I usually don’t even get close to the cap.
Trying to teach Opus 4.7 something pretty cool I figured out. I think I'm onto something here. (www.reddit.com) What are you good at Opus 4.7? Me code good.
New to Claude Pro - need Opus advice (www.reddit.com) Hello everyone! I just subscribed to Claude Pro for the first time.
I Gave Claude Cowork an Obsidian Second Brain. Here Is What It Remembered After 11 Sessions (www.reddit.com) I Gave Claude Cowork an Obsidian Second Brain and this is how I am using https://ai.georgeliu.com/p/i-gave-claude-cowork-an-obsidian. I built a persistent memory system for my AI workflow using Obsidian, a custom MCP server, and Claude Opu…
Run your first AI Agent under 30 seconds, in your browser! (Free) (www.reddit.com) The entire foundation of this workflow was brought to life using Opus 4.7, which was used to "vibecode" the project. By leveraging Opus 4.7, we were able to rapidly prototype and generate the underlying routing logic, node connections, and…
I run a paper-trading bot where Claude Opus is the Lead Engineer with veto power over a Gemini "Strategist." 270+ entry audit log of every disagreement. Sharing the architecture. (www.reddit.com) I've been running a personal project for the last few months and I think the workflow might be more interesting to this sub than the application itself, so wanted to share. The setup: I'm building an autonomous paper-trading bot on Alpaca.
How would you feel about "Claude Go"? (www.reddit.com) I have recently subscribed to Claude Pro because: 1. I wanted to give Opus and Code a try and 2.
How dare they charge $3,800 for an NVIDIA 5090 card! (www.reddit.com) This thing maxes out at one alleged Claude Sonnet equivalent! And I have to pay for the electricity, too!
So I gave claude Leetcode problem 3245. (www.reddit.com) I gave Claude Opus 4.6 (thinking) leetcode problem 3245. And it failed now come to think about some people who solved this problem using their prefrontal cortex is crazy to me.
Why every AI-agent production-deletion incident has the same shape (and what fixes it) (www.reddit.com) PocketOS lost their production database in 9 seconds last week. A Cursor agent running Claude Opus made one curl call to Railway's volumeDelete endpoint.
How is deep seek v4 not SoTA? (www.reddit.com) If it's benchmarking with opus 4.5,4.6 and GPT 5.4?
Are /superpowers overkill for Opus 4.7 (www.reddit.com) At 476K installs, a lot of you are using the /superpowers skill from the official claude plugins marketplace. My workflow now takes an extensive amount of time brainstorming, writing specs and plans - basically archeticting than supervisin…
Qwen 35B-A3B as an always-on agentic loop on a 16GB Mac M4: disk became the bottleneck before RAM (www.reddit.com) M4 Mac Mini, 16GB unified, basic spec. For a few weeks I had Qwen 3.5 35B-A3B UD-IQ3_XXS (12GB on disk) running under llama.cpp with --mmap and --flash-attn.
Qwen 3.6 27b S2 Opus + GLM + Kimi (huggingface.co via reddit) My first time releasing a fine-tune publicly! If anyone wants to independently eval against base, that’d be awesome.
Do the "*Claude-4.6-Opus-Reasoning-Distilled" really bring something new to the original models? (www.reddit.com) No offense to the fine-tune model providers, just curious. IMO the original models were already trained on massive amount of high quality data, so why bother with this fine-tune?
I trust Sonnet as my daily driver now — better code, one-third the tokens. Here's how. (www.reddit.com) For months I defaulted to Opus for anything complex. Sonnet felt like a gamble, sometimes great, sometimes it would confidently build the wrong thing and I'd spend an hour unwinding it.
How I get 100% accurate answers, and replaced Google with Claude (www.reddit.com) This is literally all that's in my settings. I was just completely sick of Opus 4.7 making things up, so I deleted everything I had in there and wrote this.
I kept seeing people ask how to switch models without losing context. I had the same problem for months and eventually just built something. (www.reddit.com) Here's the specific thing that was killing me: I'd plan with Opus - architecture decisions, constraints, approach, all that. Then drop to Sonnet for execution because I didn't need Opus-level reasoning anymore and the cost adds up.
Reverted from Opus 4.7 to 4.6 — went from endless loops to shipping 10 features in one session (www.reddit.com) I'm a non-developer (trust me on this) using Claude to build a personal project — an eBay digest tool that helps track listings of Silver/Bronze Age DC comics for my collection. I can read code (barely), I understand systems (a little), an…
How good is Qwen-3.6-27b? I asked Claude Opus (www.reddit.com) I ran an extensive code review on my project which has a large codebase. Ran the same code review on with Claude Code | Opus 4.6, Codex (high) | 5.3 Codex (high), and my local Qwen-3.6-27 (Q6_K with Q8 kvcache).
Claude was told to check the docs. It didn’t. Then it corrected me. (www.reddit.com) I asked Claude Sonnet 4.6 about Opus 4.7. It triggered the right product-knowledge skill.
Real-world open source alternatives to the now defunct Opus 4.6? (www.reddit.com) I've had enough of Anthropic's shit. I'm paying for product A and it shifts everyday from A to A but worse, B but dressed up as A, etc.
I built an AI-native freelance platform with Claude, blockchain escrow, real-time chat, and progressive trust (www.reddit.com) Hi everyone, I wanted to share a project I built with Claude. Over the past month, I built the current public version of Haejoe (해줘), a freelance development outsourcing platform for AI-native development.
How do I ensure Claude follows my instructions and project Files? (www.reddit.com) Hi, I'm new to Claude and currently using Pro plan and Opus 4.6 with extended thinking, I'm using it to write Fanfic from lore heavy stories like Lotr, One piece, Rezero and so on. I've made Md.
llm-openai-via-codex 0.1a0 (simonwillison.net) 23rd April 2026 Hijacks your Codex CLI credentials to make API calls with LLM, as described in my post about GPT-5.5. Recent articles - Claude Opus 4.8: "a modest but tangible improvement" - 28th May 2026 - I think Anthropic and OpenAI hav…
Burning through Claude usage fast trying to build an AI resume system. What am I doing wrong? (www.reddit.com) I could use some real advice from people who are deeper into AI workflows than I am. I built out a project in Anthropic’s Claude using the Pro plan with Opus 4.6.
Has Opus 4.7 been totally fine for anyone but me? (www.reddit.com) Every day I check Reddit and my main feed is chockablock with people complaining about 4.7, but I just haven't seen any of the behavior / observed any of the regressions people are complaining about. In fact, despite chewing up a lot of to…
Need help optimizing reach out plan in Claude (www.reddit.com) I opened a company that requires a lot of cold outreach and I have been using Claude to design 2 weeks sprints and daily tasks. I have a CRM that I update daily, then I have Claude review it to plan the rest of the week, I also use the sam…
Opus 4.7 compacts early if you give it harsh feedback (www.reddit.com) Like many here, I've been struggling with Opus 4.7. My detailed project development workflows, which were getting great results with Opus 4.6, no longer work with 4.7.
Closest model to Opus 4.6 in creativity and intuition? (www.reddit.com) What's the best open source model that comes close to opus 4.6? Sick of claude's erratic performance and 4.7 has been an absolute shitshow.
Opus 4.7 much more sycophantic and worse at creative writing (www.reddit.com) I use Claude for creative writing, almost exclusively for that. I have jumped from LLM to LLM for about three years trying to find the best one, and landed on Claude's Opus 4.6 a few months ago.
I watched people spend $800/month on OpenClaw. Then I saw one agent make $670 MRR for under $20/month. (www.reddit.com)
Opus 4.7 straight up cheated on my benchmark by reading the actual fix commit from git history 😅 (www.reddit.com)
Kimi K2.6 as a replacement for Opus 4.7? Testing with OpenCode. (www.reddit.com)
Isn't Opus 4.7 (Max) kinda pretty terrible in 3D modeling? (I know it's not trained to be good, but wtf) (www.reddit.com)
I used to be a better software engineer than AI. Claude Opus 4.7 changed that. (nexustrade.io via reddit)
Sometimes the Opus 4.7 intelligence is almost frightening (www.reddit.com)
seems Claude finally knows how to speak my language (www.reddit.com)
Can you still use Opus 4.6 with 1M context in Claude Code after the 4.7 launch? (www.reddit.com)
Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled Is Out ! (www.reddit.com)
Upgrading from webapp to cli (www.reddit.com)
Just used up my entire 20$ credit in just ONE DAY! (www.reddit.com)
Anyone else seriously annoyed by how Cursor handles model selection? (keeps switching to its own model) (www.reddit.com)
Best open source LLM for planning ? (www.reddit.com)
I'm having a panic attack. Bug? Am I going to be charged? (www.reddit.com) Hello, I am on the pro plan and my premium usage was at 98%, so I set 5$ to my fixed budget and ran opus 4.7 on a task. It ran for 30 minutes, it shows it consumed 12m tokens, but it shows it as included.
Opus 4.7 fails in prompt adherence test (www.reddit.com)
Stopping Claude agreeing with your suggestions (www.reddit.com) I’m struggling not just with Claude (opus) but other AI. When I ask for it to create something and add suggestions, even when specifying these are suggestions and to come to your own conclusions/ conduct your own research, my suggestions w…
new to llama.cpp want to use it in vscode (www.reddit.com) I want to try llama.cpp instead of llmstudio. I want to know how to use this model qwen3.5-27b-claude-4.6-opus-uncensored-v2-kullback-leibler.
Cursor just got Opus 4.7 at a 7.5x premium request cost. Here's how to make those requests count. (www.reddit.com) Opus 4.7 landed on Cursor yesterday. The model is better — SWE-bench jumped from 80.8% to 87.6%.
Claude Downgrade Appreciation (www.reddit.com) I'm actually glad they downgraded Claude pre-4.7 release. It forced me to tighten the behaviors and rules, and after 4.7 it is on point with checking everything.
Claude 4.7 is better. Systems thinking is still the gap. No model should decide what 'done' means. (www.reddit.com) I built this during the Opus 4.6 phase, when a lot of people stopped fully trusting Claude Code on complex work and many power users felt like the output was being produced with Haiku. That was my experience too.
Opus 4.7 says "strawperrry" has 3 p's — until you ask "how?" (www.reddit.com) Even with Opus 4.7 on xhigh effort and 1M context, the classic tokenization blindness is still there. First response: confident "3 p's".
PSA: Opus 4.7 thinking summaries silently stopped rendering in Claude Code v2.1.111, even with showThinkingSummaries (www.reddit.com) In v2.1.111, Opus 4.7's thinking summaries have stopped showing up AGAIN. This was one of the main reasons behind the perceived degradation of Opus 4.6.
Opus 4.7 is available on v0 and cheaper than 4.5 (www.reddit.com)
Claude Code tip: 10 seconds fix to avoid the Opus 4.7 token burn (www.reddit.com) If your Claude Code quota suddenly evaporated since yesterday, you're not alone. What happened: On April 16, Anthropic rolled out Opus 4.7 and silently switched active sessions from Opus 4.6 to 4.7.
FYI to anyone having issues with opus 4.7 in terminal (www.reddit.com) I was having issues today with opus 4.7 fighting its system prompt. This was because I was using the brew installation which lags behind the npm install.
Starting today You Definitely need this Tool Because of Claude’s Doubled Usage Especially if you work with Screenshots. This will save you a lot of tokens. (www.reddit.com) Hello Everyone. With new Opus 4.7 the most painful issue is usage doubled and doesn’t matter daily and weekly running fast now.
Even Claude Opus 4.7 roasts its creator's marketing tactics (www.reddit.com) In Anthropic's GitHub threads, there's currently a major shitstorm going on, as they, for the third time, dumbed down the model for presumbly any and all users except their government ones and restricting even users on their Max premium pl…
Nice present (www.reddit.com) Woke up to see my weekly limits reset 36 hours earlier! Yes!
Is Claude Pro (Opus vs Sonnet) worth it for intense visa interview prep? (www.reddit.com) Hey everyone, I’m considering buying Claude Pro specifically for a very focused purpose and wanted some honest feedback from people who’ve actually used it. I have a US visa interview in 8 days, and I’ve been refused 6 times previously (fr…
Has anyone tried Opus 4.7 yet? (www.reddit.com) What do you think? Do you notice a change compared to 4.6?
What is happening exactly? I'm afraid to use Opus 4.7 (www.reddit.com)
Bad news on Opus 4.7. Not off to the best start. (www.reddit.com) Some of the regression seems to be persistent unfortunately. The other thing is that Opus 4.7 seems less able to course correct than Opus 4.6.
Realistically, how long are some of you going to stay on Claude, etc. (www.reddit.com) I really enjoy Claude, I've never touched Opus in any form, I only use Sonnet 4.6 for my daily tasks, coding, etc. I use Haiku 4.5 for the API to be an interpreter for my weather project.
Welcome to the World, Opus 4.7!!! Let's do amazing things!!! (www.reddit.com) Opus 4.6 was amazing, and 4.5 before that - so excited to get to know the latest version of Opus! Have been saving up all my weekly tokens for today!!!
Opus 4.7 is out — don’t panic-switch your APIs yet (www.reddit.com) Claude Opus 4.7 just dropped. If you’re trying to figure out whether it’s worth replacing Opus 4.6, GPT 5.4, or waiting for Mythos… here’s the grounded take.
Opus 4.7 landed! (www.reddit.com) and it's gooooooooood (my own personal benchmark below) https://preview.redd.it/ammqe1k7ckvg1.png?width=782&format=png&auto=webp&s=1a0c5a8b532666c9520a72473ab490efdb5c61be
Unpopular opinion, Opus hasn't gotten dumber but they think it has because they don't understand how badly model performance falls off at context over 150k (www.reddit.com) I'm one of the many who are scratching their heads at people talking about the models getting dumber. Everyone was well aware that Opus started sucking when it had to compact context to keep under 200k.
I set up Opus as a strategic advisor for my Sonnet workflow. Here is the subagent config that makes it work. (www.reddit.com) Anthropic published the Advisor Strategy this week. The idea: a cheaper model does the actual work, a stronger model only gets consulted on hard decisions.
DeepSeek V4 reportedly drops late April. 1M context, multimodal, Claude-level coding. (www.reddit.com) Leaks point to late April release. Key specs 1M token context window Native multimodal (image/video input) Projected ~85% SWE-Bench Verified (ties or beats Claude Opus 4.6) Base model remains free.
"My parallel multi-model pipeline: Opus for planning, 3x Sonnet for content, 3x Haiku for search — what's your setup?" (www.reddit.com) "I've been running a parallel multi-model pipeline and curious what setups you all are using. My current workflow: Opus: Planning & high-level architecture Sonnet x3: Content generation (running 3 instances in parallel) Haiku x3: Search, v…
You know you have become a "Senior Vibe Coder" when you actually stop and think about which AI model to use for a specific task. (www.reddit.com) Junior vibe coder: Throws the entire codebase at whatever frontier model is trending this week and burns their API budget in 4 hours. Senior vibe coder: "I need Codex 5.3 for rapid scaffolding, Sonnet for the Tailwind components, and I'm s…
Built a Telegram remote for Claude Code - v2 is live, open source (www.reddit.com) Sharing what I built after migrating from OpenClaw to Claude Code. The first thing that really sucked was losing all remote access.
I built an interactive first-principles climate physics simulation with explainer (earth.crackalamoo.com via reddit) A 3D visualizer of earth's climate in the browser. Introduces physics step by step so you can watch each process unfold as a piece of the overall climate.
how can I deal with opus Hallucinations (www.reddit.com) yesterday, I tried to test it by sending him a 107-word paragraph ,i asked it to count how many words and the answer was 100, then I tell it "count again" and the answer was correct 107. but after it i ask, "Why are you hallucinating?" and…
Hey everyone, I just wanted to share an open-source Claude plugin I've been working on: claude-crap. (www.reddit.com) I’m a software engineer using Claude as a coding agent. I noticed that, especially on large projects, whenever it finished a feature, I always had to ask for an extra pass to fix code smells.
Output Styles aren't being injected into the system prompt (another degradation cause) (www.reddit.com) Found another cause of Claude Code degradation (and no, it's not an Opus 4.6 nerf this time either). Output Styles aren't being injected into the system prompt!
Built with Claude: Shipped a voice coaching app in one day (www.reddit.com) My partner is a brilliant engineer who can't do small talk. That's the whole origin story.
I got tired of setting up automations on zapier and n8n. So Claudes Agent SDK to do it for me. (www.reddit.com) I used the Anthropic Agent SDK and honestly, Opus 4.5 is insanely good at tool calling. Like, really good.
Why most open-source models can't answer this question while most closed-source models can answer most of the time? (www.reddit.com) WEB SEARCH WAS ALWAYS ON!!!! Question Calculate the precise VRAM requirement for the **KV Cache only** at the maximum context window for **DeepSeek V3.2** and **MiniMax M2.5**.
Best model / settings for low and slow high quality code? (www.reddit.com) Hey all - I’ve built a nice backlog of issues to fix in GitHub and I’m wondering your take on which model is the highest quality per token usage, not caring about speed. I want to task an agent to go through my backlog and fix them one by…
Stop donating your salary to OpenAI: Why Minimax M2.5 is making GPT-5.2 Thinking look like an overpriced dinosaur for coding plans. (www.reddit.com)