Ona Is Joining OpenAI (ona.com via hn)
Ona has entered into an agreement to join OpenAI as part of the Codex team. Our life's work just got bigger and more important.
- Hiro Is Joining OpenAI (hirofinance.com via hn)
datasette-agent 0.2a0 (simonwillison.net)
10th June 2026 Highlights from the release notes: - Tools can now ask the user questions mid-execution. Tools that declare a context parameter receive a ToolContext object, and await context.ask_user(...) can ask a yes/no, multiple-choice (o…
- datasette-agent 0.1a4 (simonwillison.net)
- Show HN: Datasette Agent (simonwillison.net via hn)
- datasette-agent 0.1a3 (simonwillison.net)
- datasette-agent 0.1a2 (simonwillison.net)
- datasette-agent 0.1a1 (simonwillison.net)
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity (arxiv.org) discussed ↗
- Anthropic Walks Back Policy That Could Have 'Sabotaged' Researchers Using Claude (www.wired.com via hn)
17 items
model roundup
Sonnet 4.6
Several updates and comparisons revolved around Sonnet 4.6, including its performance in dashboard analytics alongside Opus 4.8 and its role in processing critical requirements for a benchmark test with Gemma 4 31B QAT.
- 4m For ongoing, long content writing pieces: is it a good idea to start with the brief in a Project?
- 2h We Interviewed Fable 5 (Despite the Systems Best Efforts 😂)
- 2h Is Fable 5 actually better for writing than Sonnet 4.6 and Opus 4.8?
- 14h I got tired of hitting the weekly limit mid-task, so now my menu bar shows my Claude Code usage as a live % — zero network calls, it reads what Claude Code already knows
- 21h The real price of Claude, where is this road leading?
331 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 10m Visa Vulnerability Agentic Harness for Project Glasswing
- 57m Claude Fable 5: mid-tier results on coding tasks
- 4h Are we defaulting to VM-level sandboxing before understanding the threat model?
- 5h Your AI Agent is one bad prompt away from ruining your brand (And why traditional QA is useless)
- 10h Claude Code filled almost my entire SSD with random nonsense overnight
Finally I HIT 1 Billion Tokens on claude code (www.reddit.com)
The Role of Feedback Alignment in Self-Distillation (arxiv.org) discussed ↗
Steganography Without Modification: Hidden Communication via LLM Seeds (arxiv.org) discussed ↗
How my AI agent acquire customers for $0.20 only (www.reddit.com via reddit)
WOW, I just turned OpenClaw into an autonomous sales agent. It's finally here. Paste your website and it builds your outbound pipeline automatically.
- How to acquire customers for only $0.20 with agents (www.reddit.com)
Initial impressions of Claude Fable 5 (simonwillison.net)
9th June 2026 I didn’t have early access to today’s Claude Fable 5 release, but I’ve spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast.
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis (arxiv.org) discussed ↗
351 items
event
Anthropic Mythos
Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.
- 14m Anthropic Said This AI Was Too Powerful for Public Release. Now Anyone Can Use It.
- 40m I just learned how they built fable and it will blow your mind
- 1h what is openai planning with the next release
- 5h Fable/Mythos safeguards are overly strict
- 10h Claude Fable 5 is the best AI model right now — and it's not even a debate
114 items
model roundup
Opus 4.8
Claude AI has released Opus 4.8, an upgrade to their Opus class of models available in version 2.1.154 of their software on March 16, 2023, which includes enhanced coding and professional task capabilities along with improved judgment and honesty. Users are reporting usage resets following the update.
Disclosure: I built this. I like turbovec for compact local vector search, but in real RAG apps my bottleneck was often outside the vector index: tenant filters, source/time/tag constraints, graph neighborhoods, BM25 candidates, rerank, an…
An Agent Holds the Fort: Three Days of Autonomous Compiler Work (rue-lang.dev via hn)
so, we are now at 96% usage for the week, so i think that what i'd like you to do right now is to write me a blog post about the experience you've had of the l…
Breaking the Ice: Analyzing Cold Start Latency in vLLM (arxiv.org) discussed ↗
Investing in multi-agent AI safety research (deepmind.google)
Show HN: AgentStore – a self-hosted datastore for AI agent teams (github.com via hn)
AgentStore - a datastore for AI agents. Git is for one agent. AgentStore is for agent teams.
TripoSplat Generate 3D models from a single image I asked a coding agent to build a beautiful website showcasing the monuments of Paris as 3D Gaussian splats. I never opened an image generator.
77 items
model roundup
Gemma 4
Gemma 4 is a family of open-source multimodal models from Google DeepMind, including sizes up to 31B parameters and featuring Dense and Mixture-of-Experts architectures. Notable community highlights include the release of Gemma 4 12B as an encoder-free unified model for laptops, its availability via llama-server on an RTX 5070 Ti GPU, and detailed visual guides showcasing its capabilities.
- 1h Any chances for a 12B diffusion Gemma?
- 14h Monitor your screen using local LLMs with only one sentence! Free, Open Source and Local.
- 17h LLMs and tabletop games
- 21h Are these quants of QAT better than non-QAT? What do I use?
- 1d Gemma-4-31B at 256K context on a $1,400 AMD GPU – measured, with patches
51 items
event
DeepMind
Google DeepMind has released "Deep Research Max," advancing autonomous research agents, while also facing challenges and competition from other AI companies like Anthropic and Ineffable Intelligence. Meanwhile, DeepMind workers in the UK have voted to unionize, and former DeepMind architect Demis Hassabis is at the center of legal drama involving Elon Musk.
- 5h Google DeepMind is worried about what happens when millions of agents start to interact
- 18h Show HN: Magenta Real-Time Music Generation on iPhone, Without the GPU
- 1d The Great Reframing...
- 2d Show HN: VQAScore – open eval metric/reward model, now for text-to-video
- 6d Inside Google DeepMind: Reasoning, Omni, and Shipping Frontier AI
Qwen-Image-Flash: Beyond Objective Design (arxiv.org) discussed ↗
llm 0.32a3 (simonwillison.net)
9th June 2026 Almost entirely written by the new Claude Fable 5, see my write-up for more details.
Outreach email using Claude end-to-end (www.reddit.com via reddit)
Have a lot of friends building stuff like crazy now with Claude but they are all struggling with GTM and finding customers. There's an entire cottage industry for outbound emails, and was thinking what it would be where the entire thing ca…
8 things about using Claude for writing that took me embarrassingly long to learn (www.reddit.com via reddit)
Access OpenAI models and Codex through your Oracle cloud commitment | OpenAI Use your existing Oracle cloud commitment to give teams access to OpenAI’s most advanced models and Codex, without creating a new purchasing path.