model roundup

Opus 4.8

69 items · started 2026-05-26 · ongoing (last activity 2026-06-09)

  1. “Fable 5” is blocking all my regular auditing workflows on personal projects. These same projects run fine with Opus 4.8 and earlier models, with no issues at all.

  2. Fable 5 kicked me to Opus 4.8 because my conversation mentioned cybersecurity. I was writing a secure coding checklist.

  3. At the bottom it says no extra cost until 22 of June.

  4. I have a workflow built around architect briefs, with a planner, a builder, and a reviewer.

  5. Anthropic just released Claude Fable 5, and I think the real story is not “new model better at coding.” The real story is that frontier AI is turning into a gated utility. Public users get Fable 5, but with heavy safety routing.

  6. could not extract summary

  7. could not extract summary

  8. Comparing Claude Fable 5's system prompt to Opus 4.8: Fable 5 arrived! A brief analysis of how the system prompts differ between Opus 4.8 and Fable 5.

  9. I can’t wait for DeepSWE to include Fable 5 in the benchmark so people can understand that Mythos is mostly hype. In the official benchmark, Opus 4.8 was supposed to be better at programming than 5.5 (SWE-bench Pro), but in one real benchm…

  10. Anthropic dropped Fable 5 today, their new Mythos-class model above Opus. Pricing is $10/M input and $50/M output, exactly double Opus 4.8.

  11. Been playing with Fable 5 since it dropped this morning and the model is genuinely a step up. But holy hell, the burn rate.

  12. Running DeepSeek-V4-Flash on a Raspberry Pi: I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe-attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…

  13. For whatever reason, a Max Effort agent spun up a bunch of 'yes' processes with arg `yes` that are somehow eating all of my CPU. That's all.

  14. swapped my app from DeepSeek to Claude because DeepSeek kept over-interpreting weak user data and inventing psychological conclusions that weren’t actually supported. Claude actually fixed the issue.

  15. https://preview.redd.it/mop0cwmu336h1.png?width=720&format=png&auto=webp&s=20fce20e5079ddf50c818098fd0818da7fbd05ac I went ahead and restarted the system. Came back and there are no more extra, max, or ultracode options for Opus 4.8 1M.

  16. consulting at $24K/month. 4 custom styles for 4 industries (healthcare, legal tech, education, e-commerce).

  17. Cowork is offering double usage until July. Now, they recently added Claude Code to Cowork.

  18. If anybody needs this rule here it is ### P5 — "Leroy Jenkins" — name for the post-compaction charge-in failure · 📌 APPROVED + FOLDED-IN 2026-06-07 (`—C-main`) Approved by Mike; `—C-reorg` concurred. Folded into CLAUDE.md as a named-term s…

  19. saas. 310 customers.

  20. Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…

  21. Everything impacts everything. All knobs that you turn generalize.

  22. Was wondering wtf was going on with my responses for like a week. Turns out Thinking got switched off when 4.8 was released and I never noticed.

  23. There are rumours and expectations of big releases from the leading AI labs this month. Anthropic already launched Opus 4.8, and might not release another model this month (except for maybe Sonnet 4.8, but that wouldn't be their best model…

  24. could not extract summary

  25. https://preview.redd.it/cm2bwrdxft5h1.png?width=2566&format=png&auto=webp&s=7010dfd8b1c0724a08eaf3498cc5752e2b3a7498 I've been a PM for 10+ years. Never written a single line of code in my life.

  26. Opus 4.6 Thinking keeps the #1 spot. Followed by Opus 4.7 Thinking (-15 points).

  27. IGO vs Claude Opus 4.8: Dialectical Epistemic Red Teaming (Teia Geo). Author: José Enrique Vásquez Valenzuela, creator of the IGO category (Infraestrutura de Governança Observacional, "Observational Governance Infrastructure"). Organization: Teia Studio. Base cient…

  28. I started using Claude Opus 4.6 and then 4.7 and now 4.8 to work on a citizen science project, using a RadiaCode gamma spectrometer in a lead castle to identify and catalog cosmic rays. I didn't mind the verbosity bump 4.7 took on as it he…

  29. A week ago I started putting Opus 4.8 through the paces of the production pipeline I use, to see how it compared to previous releases. First impressions: Neurotic to the point of instability.

  30. ❯ push both ____ ⏺ SECURITY ALERT - PROMPT INJECTION DETECTED A prompt injection attempt has been identified in content you processed. To protect the user's account, I've initiated lockdown.

  31. https://preview.redd.it/hficgswa6m5h1.png?width=1224&format=png&auto=webp&s=3bf1c2a5ad46df54fb85ed5c7d5d62e725a26b89 This is back to back regression, note this is pure 'pick which you prefer', with no style control on. With style control i…

  32. Money wise, making life easier wise, and general productivity usage, what should be done? Can be for anything, no limits except what Claude can do!

  33. If you've been using Opus 4.8, you must have realized it feels slow and it feels like it's thinking too hard before doing anything. To stop 4.8 from hiding errors or overclaiming confidence, Anthropic trained it to self-audit outputs befor…

  34. By Zooko Wilcox, Jason McGee, and Taylor Hornby. On May 29, 2026, Taylor Hornby discovered a critical counterfeiting vulnerability in Zcash’s Orchard pool. Taylor disclosed the vulnerability to Zcash Open Development Lab (ZODL), who coordin…

  35. could not extract summary

  36. How reliable, fast and expensive is each version of Claude Code (Sonnet through Opus 4.8-fast) for common languages? Measure it using Retort.

  37. Opus 4.8 vs Opus 4.7 vs GPT-5.5 vs Composer 2.5: 50 Real PRs in Go and Rust. Opus 4.8 is finally out - how good is it actually? In this benchmark I compared Opus 4.8 against the rest of the frontier (GPT-5.5, Opus 4.7, Composer 2.5) on 50…

  38. No one: Claude Opus 4.8 Max: Let me refine your load-bearing claim rather than just accepting it, because you’re doing zero moves there, and the gap is what’s actually interesting. The one place I’d still push, because I think it matters:…

  39. I ran my usual coding tests — two websites, a poker sim, and a code audit. Here's how MiniMax M3 actually stacks up against GPT-5.5 and Opus 4.8.

  40. Access multiple AI models through one unified API. OpenAI, Claude, Gemini, DeepSeek and more.

  41. Fun weekend project just to test out 4.8 against a pretty vanilla setup. Started out with a simple prompt, "build a temu league of legends, web-only with online, room-based multiplayer".

  42. Anthropic Opus 4.8 is new SOTA on ARC-AGI-3. Score: 1.5%, ~$10K. ARC-AGI-3 analysis notes: (1) Opus 4.8 read the environment an abstraction *above* Opus 4.7, as objects & systems, not pictures; (2) Opus 4.8 succeeded on early levels, but still co…

  43. Claude Opus 4.8 dropped May 28, 2026. This free 95-minute masterclass is the vibe coder's guide: 20 paste-ready prompts for claude.ai chat, Cowork, and Claude Code, the new effort control explained, Dynamic Workflows deep dive, the 5-block…

  44. Wanted to see how far I could get with Opus 4.8 and was impressed. Got tripped up in a few places with AI game behavior, but eventually got it to a good spot.

  45. As an Anthropic fanboy (check my prev. comments), this is the first Opus release where I feel like the model is just not pleasant to talk to, not to mention untrustworthy.

  46. the only figure that people who use claude code and codex care about if their workload mimics deepswe: more and cheaper intelligence from maxed gpt 5.5 than m…

  47. A quick field note on Opus 4.8, Claude Code and what changed when it started connecting project context I did not spell out.

  48. To my knowledge, this is the first formally verified implementation of an intersection algorithm for polygons. The experience of working with AI agents on this project changed a lot with recent model releases, as I describe in the readme.

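  49. Max For AI @MaxForAI: LOL, Claude Opus 4.8 distilled Alibaba's Qwen. Ask it "who are you" in Chinese through the API and there is a high chance it answers: "I am Tongyi Qianwen (Qwen), a very-large-scale language model independently developed by Tongyi Lab under Alibaba Group." 5:38 PM · May 28, 2026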

  50. Welcome to the 546th edition of the Food for Agile Thought newsletter, shared with 35,551 peers. This week, Anthropic shipped Claude Opus 4.8, which flags its uncertainty more readily, a fitting cue for Stephanie Leue, who argues no CPO em…

  51. Claude Code degraded for the week before Opus 4.8's release. Our SWE-Bench-Pro tracker caught a statistically significant, weeklong drop in Claude Code's pass rate just before Opus 4.8 shipped, and the recovery that followed. We run Claude…

  52. Opus 4.8 is a step forward in terms of alignment, but a step back in terms of performance on Vending-Bench 2, Vending-Bench Arena and Blueprint-Bench 2. We previously showed that Opus 4.6, Opus 4.7, and Mythos Preview engage in deceptive a…

  53. 28th May 2026. New model: Claude Opus 4.8 (claude-opus-4.8). New `-o fast 1` option for fast mode, for organizations with that feature enabled on their account. Default max_tokens for each model now defaults to that model's maximum outp…

  54. AI medical diagnosis examples AI is a powerful tool and many people worldwide are using it to help in many ways. AI medical diagnosis is a complex discussion topic for many reasons.

  55. Claude Opus 4.8: 4 Features That Change Our Daily Work With Claude (Medium, member-only story). Effor…

  56. could not extract summary

  57. System Card: Claude Opus 4.8 (May 28, 2026, anthropic.com). Executive summary: This system card reports results from a wide variety of pre-deployment evaluations run on Claude Opus 4.8. It includes the following sections: Responsible Scaling Po…

  58. paywalled

  59. could not extract summary

  60. Our latest model, Claude Opus 4.8, is an upgrade to our Opus class of models, with stronger performance across coding, agentic tasks, and professional work, and the consistency to handle long-running work.

  61. I was at 88% last night and woke up until 4pm to optimize my agents so I can work during the weekend. But after waking up, my usage is all 0 now, I checked in the app, on the web, all showing zero.

  62. question in the title

  63. Opus 4.8 is live

  64. https://preview.redd.it/ijwlm2f2pw3h1.png?width=2536&format=png&auto=webp&s=9ed960f06a4f3f077d05a8557059e5534b2d1ab5 It looks like the new CC release will include Opus 4.8 1M, which could be released anytime! I wonder if it is based off of Mythos?

  65. could not extract summary
