1. Hi Peeps, I'm an open-source maintainer (Goldziher on GitHub) and the CTO of kreuzberg.dev. I published basemind — a pure-Rust MCP server and Claude / Codex / Gemini etc.

  2. From the Hugging Face Hub to robot hardware with Strands Agents and LeRobot. You have a robot, a folder of demonstration data on the Hugging Face Hub, and a new task you want it to learn. Today that takes five separate tools: one to record…

  3. There's no easy way to see what your coding agents have actually installed — skills, subagents, commands, plugins, MCP servers, hooks — or which sessions are still alive vs. safe to delete.

  4. For some searches, Reddit is a very useful resource. ChatGPT has it, Gemini has it, even Grok has it.

  5. Large Language Models (LLMs) achieve strong performance on reasoning tasks, but whether this reflects faithful logical inference or heuristic approximation remains unclear. We study this question in legal entailment by comparing three para…

  6. Hello community, does anyone know whether there is a way to try out Claude Pro? Even if it's only for 24 hours?

  7. Security
    376 items

    OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.

    Mistral
    117 items

    Mistral, a French AI company, is set to release a medium-sized model with 128 billion parameters and is planning to launch Workflows in public preview. The company, founded by Arthur Mensch, continues to grow its AI empire despite not being based in the United States.

  8. 15th June 2026 - New tool, execute_write_sql, which requests user approval and then writes to a database - taking user permissions into account. #27 I added a mechanism for asking user approval in datasette agent 0.2a0.

  9. If you are building on top of multiple LLM APIs, or even a single one among OpenAI, Claude, Gemini, etc., what do you do when the API starts degrading (slow TTFT, elevated error rates, timeouts)?

  10. Anthropic CEO Dario Amodei and OpenAI CEO Sam Altman were among tech bosses at a G7 working lunch on AI, as the US decision to restrict access to Anthropic's most advanced models causes tension among allies. Fable soon guys?

  11. Import AI 461: "Alignment is not on track"; FrontierCode; and synthetic research interns Where are your agents right now? Welcome to Import AI, a newsletter about AI research.

  12. Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production mismatch, and limit fair comparison ac…

  13. I work at a startup that makes martial arts gym software (MAAT). We handle the memberships of students so gym owners don't have to, using a payment system and a database.

  14. GLM
    139 items

    Recent developments in the AI space highlight significant advancements from Chinese companies, particularly Zai's upgrade of GLM-5.1, which has shown substantial improvements. Meanwhile, there are concerns about a widespread intelligence drop across various models and discussions around the potential openness of leading AI projects like GLM-5.1.

    415 items

    Anthropic's new update, Claude Mythos, has garnered attention from top AI security researchers like Carlini, who found numerous bugs. The update is noted for its speed and effectiveness, with Anthropic identifying a significant security flaw in FFmpeg and quickly submitting patches.

  15. SpaceX will acquire AI coding tool Cursor for $60 billion in an all-stock transaction, the companies announced today. The deal is expected to close in the third quarter.

  16. 12th June 2026 - Link Blog OpenAI WebRTC Audio Session, now with document context. I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.

  17. For me it’s probably CORS. Claude Code can build half the app, refactor the backend, add auth, write tests, and then somehow I still end up staring at one browser error for 40 minutes.

  18. When large language models (LLMs) fail to generalize or make haphazard errors in reasoning, it is often taken as evidence that LLMs are not truly reasoning, but rather performing a kind of pattern matching. The implication is that people's…

  19. I was an old subscriber, who decided to unsubscribe when Anthropic unilaterally cut limits during peak working hours. I am aware that the subscription is not usable for coding purposes, and that's ok.

  20. [AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo a quiet day lets us reflect on a great essay Sarah Guo is a friend of the pod and Queen of AI, and after our Satya crossover pod (great recap here from Goku…

  21. Disclaimer: I’m not an engineer. I have just set up a boutique financial advisory firm and want to get the basics (logo, wordmark, etc.) in place quickly. I know exactly what I want as the logo.