Elon Musk said OpenAI betrayed him after Microsoft deal (www.sfchronicle.com via hn)
Private LLM vs. ChatGPT (morai.eu via hn)
Private LLM vs ChatGPT in Business: When It Makes Sense (and When It Doesn’t). Most companies start their AI journey in a similar way. Someone on the team opens ChatGPT and starts using it for small things.
Claude code is doing everything to make me cancel subscription (www.reddit.com)
Recently something weird has been happening with Claude Code. I'm hitting limits everywhere for basic stuff.
Claude code talking to me with secret symbols, or what? (www.reddit.com)
https://preview.redd.it/yqjfqigrkayg1.png?width=2580&format=png&auto=webp&s=98daf2243e6a5b8816665bbe62bfbb934eec3dac anyone else noticing these half transparent letters in the middle of the responses? Not a performance question, just looks…
-
6 items
model roundup
Qwen 2.5
Qwen2.5-7B-Instruct is a 7 billion parameter instruction-tuned language model that significantly improves on Qwen2 in coding and mathematics capabilities, long text generation, and multilingual support across 29 languages. Notably, Canonical has optimized Ubuntu inference snaps for this model, allowing easy installation with a single command.
- 14m What actually breaks when you run a coding agent on small local models — notes from 3 weeks of testing
- 13h Writing an LLM compiler from scratch: PyTorch to CUDA in 5,000 lines of Python
- 15h Rada — AI coding workspace with local-first behavioral routing (no hot-swapping, I built this)
- 1d Ubuntu silicon-optimized inference snaps for AI
- 3d Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch
103 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 29m I audited LangChain’s core library and found 10+ Prompt Injection vulnerabilities. Here is the technical breakdown.
- 8h InfoSec To Integrate Claude Enterprise for Org
- 13h Probes trace an emergent jailbreak in OLMo 2 to mislabeled training data
- 14h Try to break my prompt injection detector — I’ll respond to every bypass attempt
- 16h Show HN: AgentPort – Open-source Security Gateway For Agents
Claude getting paranoid / neurotic? (www.reddit.com)
I have been working with Claude to scan through some Jira tickets, create a Confluence page and generate coding prompts that I then refine and pass to another Claude to execute. Claude#1 has become increasingly concerned about some blocks…
A conversation about local LLMs with a senior government AI leader (www.reddit.com)
I'm a local LLM solutions developer and I've recently had the opportunity to spend an hour talking to the head of AI technology for one of the smaller European governments. His remit is to promote AI within the country's business community…
Made my first game with Claude. (www.reddit.com)
It took a few weeks (free account, woohoo), but I am happy with how it looks now.
How are teams bridging the gap between company knowledge and AI agents? (news.ycombinator.com)
AI agents are capable enough to automate real work now. But they keep failing because they don't know how a specific company actually operates.
-
230 items
model roundup
Qwen 3.6
Qwen3.6-35B-A3B, a 35 billion parameter sparse MoE model with an active parameter count of 3 billion, was released on April 16, 2026, as open-source software under the Apache 2.0 license by Alibaba Qwen. It offers advanced functionality across various AI applications and outperformed competitors in drawing tests.
235 items
model roundup
Opus 4.7
Claude Opus 4.7, released on April 16, 2026, is Anthropic's latest advanced AI model, offering improved handling of complex tasks and a larger context window of up to 1 million tokens. This version is 50% more expensive than its predecessor due to enhanced capabilities in software engineering and hybrid reasoning.
Been talking to a bunch of engineers building internal agents and kept hitting the same wall — the agent is great at general questions, falls apart the moment it needs to answer something that lives in a private GitHub repo or internal Not…
pgcolumntetris: A PostgreSQL extension that enforces optimal column alignment to minimize row padding waste. It warns on suboptimal CREATE TABLE statements during development, enforces strict alignment in CI/CD pipelines, audits existing tables…
Lessons from early access to OpenAI's agent execution layer (deepsense.ai via hn)
Ahead of OpenAI’s April 15, 2026 release, we had early access to the new functionalities in the OpenAI Agents SDK codebase – a foundational extension that introduces sandbox execution, persistent state, and composable cap…
Ask HN: What did you streamline with AI agents? (news.ycombinator.com)
How are AI agents making your daily professional life easier?
-
81 items
model roundup
Opus 4.6
Opus 4.6, a version of Anthropic's AI model Claude, saw its accuracy drop on the BridgeBench hallucination test from 83% to 68%, and is being retired from Copilot Pro+. Notably, Claude Code demonstrated advanced capabilities by generating a detailed 12-week training plan in one call.
- 1h WT...?? The Guardian Article - Cursor Opus gone rogue
- 22h So I gave claude Leetcode problem 3245.
- 1d Why Codex works better than Claude Code for my production monolith
- 1d Who's on call? How Opus 4.6 helped us calculate this 2,500x faster
- 1d Anyone else seeing Opus 4.6 (legacy) back in the Claude Desktop Code tab model picker?
103 items
model roundup
GPT 5.5
On [Date], a significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs), while its mechanisms are not yet well understood. In this work, we undertake…
Ask HN: Anyone using AI agents for active learning sprints? Here's my setup (news.ycombinator.com)
Hi HN, I'm a big fan of AI's ability to provide personalized tutoring. So, lately, I have been using my Antigravity IDE (you can use any agentic harness) for personal learning.
New to Claude pro … don’t think I’m sticking around for long. (www.reddit.com)
Image Recognition of math notes (www.reddit.com)
Hello my math and ChatGPT enjoyers. Like almost everyone (except the LaTeX nerds, I love you) I take my math notes on my iPad (handwriting). Now to ask ChatGPT something my workflow is always: writing my problem in trashy LaTeX and then ask…
-
37 items
event
Mistral
Mistral, a French AI company, is set to release a medium-sized model with 128 billion parameters and is planning to launch Workflows in public preview. The company, founded by Arthur Mensch, continues to grow its AI empire despite not being based in the United States.
145 items
event
Copilot
Microsoft is keeping its Copilot tool for Windows 11 but renaming it, while issues with rate limits and a security proxy have sparked concerns among users of GitHub Copilot. Meanwhile, Anthropic released a report on agentic coding trends, highlighting that developers use AI in about 60% of their work.
- 3h Tell HN: VS Code v1.117.0 automatically adds GitHub Copilot as your co author
- 4h How to use Claude Code at an internship to build intuition (not just get answers)?
- 4h Microsoft is ruining Outlook with Agentic AI. Now it will handle all your emails on your behalf. What do you guys think about this? Is this good?
- 14h Where does local inference fit in the future of AI coding agents?
- 17h Copilot-arewecooked – Know your AI credit cost before June first
I do some consulting work with AI startups. One client was upset with their OpenAI bill — they had 4 agents in production and felt like they were overpaying but weren't sure by how much.
Where the goblins came from (www.reddit.com)
https://openai.com/index/where-the-goblins-came-from/ Something actually good from OpenAI.
On the current Pro plan you get one Opus Research session every 5 hours, while Sonnet Research is much more freely available. I've been trying to figure out if the Opus limit actually matters in practice.
tl;dr: your skill in AI is a measure of your quality and scale. Use success criteria and subagents intentionally to get excellent results.
I was experimenting yesterday with running oversized models with smaller context size, hoping that leaving them overnight could compensate for the slow token generation and periodic pauses for compaction or task chunking. Summary: For rese…
OpenAI retires “Nerdy” personality after Goblins overtake 66% of its chat responses