I wrote a hopefully humanistic and warm and gentle prompt (johndoeisntfound.github.io via reddit)
I wrote in the title that I wrote a prompt, technically it's a set of custom instructions, but you see "a set of custom instructions" is too long to fit into a readable title, so :p Anyway- yeah! I wrote those instructions in hope for let…
Claude Fable is relentlessly proactive (simonwillison.net)
11th June 2026 After two days of experience with Claude Fable 5 I think the best way to describe it is relentlessly proactive. It knows a whole lot of tricks and it will deploy pretty much any of them…
Joining the Claude partner network? (www.reddit.com via reddit)
Hi, I tried to join but was rejected as I'm building with Claude rather than delivering projects to clients. Has anyone successfully joined who isn't selling services to clients?
- Claude Partner Network Questions (www.reddit.com)
The Role of Feedback Alignment in Self-Distillation (arxiv.org) discussed ↗
built a tool that yells at me for underusing my claude max plan (www.reddit.com via reddit)
turns out i was using ~22% of my weekly cap on average. paying for 5x, behaving like 1x.
Access OpenAI models and Codex through your Oracle cloud commitment | OpenAI
Use your existing Oracle cloud commitment to give teams access to OpenAI’s most advanced models and Codex, without creating a new purchasing path.
132 items
model roundup
Opus 4.8
Claude AI has released Opus 4.8, an upgrade to their Opus class of models available in version 2.1.154 of their software on March 16, 2023, which includes enhanced coding and professional task capabilities along with improved judgment and honesty. Users are reporting usage resets following the update.
360 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 1h tested Claude Fable 5 and Opus 4.8 across 917 coding-agent scenarios. Fable won by 0.9 points.
- 2h Show HN: We're inviting Anthropic to put the real Mythos 5 on our open benchmark
- 3h Anthropic Mythos: Modelling Bank Strategies
- 7h Canceled my sub over the silent-sabotage guardrail, renewed when they walked it back
- 10h Fable 5 added to the Artificial Analysis Coding Agent Index... barely 1 point ahead of GPT-5.5 ???
Superficial Beliefs in LLM Decision-Making (arxiv.org) discussed ↗
Investing in multi-agent AI safety research (deepmind.google)
Using Cloudflare's Agentic Interface to (Mostly) Seamlessly Launch a Website (theautomatedoperator.substack.com via hn)
A small task, but a nice peek into how things may look when our agents are taking care of tedious tasks in the background. I recently had to set up a website.
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis (arxiv.org) discussed ↗
Local-first runtime governance layer for AI systems
Guardian Runtime: A Zero-Latency FinOps & Security Firewall for AI Applications. Intercept every prompt and response locally.
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity (arxiv.org) discussed ↗
datasette-agent 0.2a0 (simonwillison.net)
10th June 2026 Highlights from the release notes: - Tools can now ask the user questions mid-execution. Tools that declare a context parameter receive a ToolContext object, and await context.ask_user(...) can ask a yes/no, multiple-choice (o…
- datasette-agent 0.1a4 (simonwillison.net)
- Show HN: Datasette Agent (simonwillison.net via hn)
- datasette-agent 0.1a3 (simonwillison.net)
- datasette-agent 0.1a2 (simonwillison.net)
- datasette-agent 0.1a1 (simonwillison.net)
Mac App Crashing after upgrade to Tahoe 26.5.1 - how I resolve it (www.reddit.com via reddit)
Ever since I applied the latest update, Claude GUI has been crashing about 20 seconds after loading. I don't even have to do anything, just opening it and letting it sit there will crash.
76 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, including sizes up to 31B parameters and featuring Dense and Mixture-of-Experts architectures. Notable community highlights include the release of Gemma 4 12B as an encoder-free unified model for laptops, its availability via llama-server on an RTX 5070 Ti GPU, and detailed visual guides showcasing its capabilities.
51 items
event
DeepMind
Google DeepMind has released "Deep Research Max," advancing autonomous research agents, while also facing challenges and competition from other AI companies like Anthropic and Ineffable Intelligence. Meanwhile, DeepMind workers in the UK have voted to unionize, and former DeepMind architect Demis Hassabis is at the center of legal drama involving Elon Musk.
- 1d Google DeepMind is worried about what happens when millions of agents start to interact
- 1d Show HN: Magenta Real-Time Music Generation on iPhone, Without the GPU
- 2d The Great Reframing...
- 2d Show HN: VQAScore – open eval metric/reward model, now for text-to-video
- 7d Inside Google DeepMind: Reasoning, Omni, and Shipping Frontier AI
Steganography Without Modification: Hidden Communication via LLM Seeds (arxiv.org) discussed ↗
Initial impressions of Claude Fable 5 (simonwillison.net)
9th June 2026 I didn’t have early access to today’s Claude Fable 5 release, but I’ve spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast.
Claude AI system design (www.reddit.com)
I tried this, and there may be flaws in it; this is an open-ended question. Any suggestions or corrections are appreciated.
- Claude Design and Code (www.reddit.com)
- Claude design backfire? (www.reddit.com)
- Claude design to code (www.reddit.com)
- Claude design is down? (www.reddit.com)
- Thoughts on Claude Design (www.reddit.com)
- Editing with Claude Design (www.reddit.com)
- Claude design is saving me (www.reddit.com)
- Claude Design bug? (www.reddit.com)
- Claude Design is... clumsy (www.reddit.com)
- Claude Design Is Real Design (diverging.run via hn)
- Tips for Claude Design (www.reddit.com)
- Claude Design (www.reddit.com)
- Claude Design (www.reddit.com)
- Claude Design - How creative is it? (www.reddit.com)
- Claude Design is Incredible... (www.reddit.com)
- Claude Design System Prompt (gist.github.com via hn)
- Claude Design (claude.ai via hn)
- Claude Design (www.anthropic.com via hn)
- Claude's new System Reminder (www.reddit.com)
Breaking the Ice: Analyzing Cold Start Latency in vLLM (arxiv.org) discussed ↗
llm 0.32a3 (simonwillison.net)
9th June 2026 Almost entirely written by the new Claude Fable 5, see my write-up for more details.
Fable passes the "When A.I. Passes This Test, Look Out" test (www.reddit.com via reddit)
The New York Times ran an article in Jan 2025, "When A.I. Passes This Test, Look Out", and Claude Fable just passed it at 53%.
BeamWeaver
Build AI agents and durable LLM workflows in Elixir. BeamWeaver brings the practical parts of LangChain, LangGraph, and Deep Agents to the BEAM: agents, tools, graph workflows, streaming, memory, persistence, retrieval, provider…
Qwen-Image-Flash: Beyond Objective Design (arxiv.org) discussed ↗
vid-line
🎬 Watch videos in your Claude Code statusline while your agents grind. (Demo rendered programmatically with Remotion from the actual ANSI frames; see demo/.) vid-line converts any video into ANSI half-block pixel art and plays…