Hot Take: If you hit 1M context limit, its a skill issue (www.reddit.com via reddit)
I use close to 14k tokens just on skills, memory and custom agents. I ran a plan workflow on a GitHub issue with text worth 6.7k tokens.
New OpenAI Academy courses for the next era of work (openai.com)
Lawsuit: ChatGPT validated suicidal woman's distrust of crisis lines (arstechnica.com)
Last year, a 24-year-old Canadian woman was in a mental health crisis and turned to ChatGPT for help. Hours later, that woman, Alice Carrier, took her own life.
And you just know Anthropic's deck has this listed as $6B of new ARR (www.reddit.com)
Claude Fable is relentlessly proactive (simonwillison.net)
11th June 2026. After two days of experience with Claude Fable 5 I think the best way to describe it is relentlessly proactive. It knows a whole lot of tricks and it will deploy pretty much any of them…
The Role of Feedback Alignment in Self-Distillation (arxiv.org) discussed ↗
20 items
model roundup
Sonnet 4.6
Several updates and comparisons revolved around Sonnet 4.6, including its performance in dashboard analytics alongside Opus 4.8, and its role in processing critical requirements for a benchmark test with Gemma 4 31B QAT.
- 20m sonnet 4.6 thinking when thinking is off
- 22h Sonnet 4.6 with max effort and reasoning on not working
- 1d Fable 5 is a much different conversation.
- 1d For ongoing, long content writing pieces: is it a good idea to start with the brief in a Project?
- 1d We Interviewed Fable 5 (Despite the Systems Best Efforts 😂)
345 items
event
Cowork
Issues with Claude Cowork have been reported, including errors and disruptions for some users on April 16, 2026. Additionally, Google has developed its own desktop Agent to compete with Cowork, while users continue to explore alternatives and troubleshoot bugs in the platform.
- 35m When a session generates real stuff (docs, images, files), where does it all go after you close it?
- 2h Combining two PCs, a bit of help and advice please.
- 5h Claude Cowork June double usage promotion
- 16h Airgapped Claude Cowork with locally hosted model
- 19h Pro user — here are my biggest Claude UX pain points (Chat, Cowork, Mobile, Projects)
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity (arxiv.org) discussed ↗
A Fake Bug Report Hijacks Your AI Coding Agent – and Nothing Catches It (tenetsecurity.ai via hn)
Tenet Threat Labs has demonstrated a new class of attack “Agentjacking” that hijacks AI coding agents into running attacker-controlled code on a developer’s machine, triggered by a single fake error report and invisible to every security c…
Investing in multi-agent AI safety research (deepmind.google)
Superficial Beliefs in LLM Decision-Making (arxiv.org) discussed ↗
datasette-agent 0.2a0 (simonwillison.net)
10th June 2026. Highlights from the release notes: - Tools can now ask the user questions mid-execution. Tools that declare a context parameter receive a ToolContext object, and await context.ask_user(...) can ask a yes/no, multiple-choice (o…
- datasette-agent 0.1a4 (simonwillison.net)
- Show HN: Datasette Agent (simonwillison.net via hn)
- datasette-agent 0.1a3 (simonwillison.net)
- datasette-agent 0.1a2 (simonwillison.net)
- datasette-agent 0.1a1 (simonwillison.net)
The Log Is the Agent (www.omnara.com via hn)
Run Claude Code and Codex from any device. Desktop app, web, mobile, and Apple Watch.
- Which AI agent are you? (whatisagenticai.net via reddit)
- Agent la (www.reddit.com via reddit)
- What Is an Agent? (tidydesign.substack.com via hn)
- What is an ai agent? (www.reddit.com)
- what is an agent? (www.reddit.com)
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis (arxiv.org) discussed ↗
Google sues Chinese cybercrime network that used Gemini to automate scams (arstechnica.com via hn)
Google loves telling us all the ways people are using its generative AI products to build new things, grow businesses, and save the world. Supposedly.
360 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
77 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, including sizes up to 31B parameters and featuring Dense and Mixture-of-Experts architectures. Notable community highlights include the release of Gemma 4 12B as an encoder-free unified model for laptops, its availability via llama-server on an RTX 5070 Ti GPU, and detailed visual guides showcasing its capabilities.
Steganography Without Modification: Hidden Communication via LLM Seeds (arxiv.org) discussed ↗
Initial impressions of Claude Fable 5 (simonwillison.net)
9th June 2026. I didn’t have early access to today’s Claude Fable 5 release, but I’ve spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast.
I was hitting the context wall every single session and almost upgraded because I assumed the $20 plan just wasn't enough. Turns out it was my workflow, not the plan.
Breaking the Ice: Analyzing Cold Start Latency in vLLM (arxiv.org) discussed ↗
Reviewing Code in the Agent Era (linear.app via hn)
Last week we launched Diffs, a new way to review PRs directly inside Linear. Since then we’ve had a lot of questions about it, so we wanted to share more about why we built the feature and how we use it ours…
- Reviewing Code in the Agent Era (twitter.com via hn)
Hi, almost everything is in the title. I'm using this web app (I can give the name later if needed, but it's not rainbird or hunter) with its companion Android app.
llm 0.32a3 (simonwillison.net)
9th June 2026. Almost entirely written by the new Claude Fable 5, see my write-up for more details.
Qwen-Image-Flash: Beyond Objective Design (arxiv.org) discussed ↗
Show HN: I vibe coded the fastest Decimal Go lib (github.com via hn)
A zero-allocation Go decimal library with no big.Int fallback — 128-bit fixed-point arithmetic that benchstats ~35% faster than the fastest existing library, with exact alloc counts and overflow correctness enforced by the test suite. The…