English/Spanish voice chat? (www.reddit.com)
Currently mainly a ChatGPT user but slowly making the transition to Claude. I love to use the voice chat feature.
stealth-benchmark Open-source tool that simulates what coding interview platforms detect. Test your setup before the real thing.
I built a Claude Code skill for structured decision making (github.com via hn)
Six Hats Skill A structured decision-debate skill for running Edward de Bono-style six hats sessions with an AI agent. It walks a topic through facts, intuition, upside, risk, alternatives, and final moderation so you get a practical recom…
Making Claude doubt your ideas and opinions (www.reddit.com)
So, this is more a request for help, to see if there are any skills or Claude.md recommendations, than a discussion. I get a lot of ideas daily, but I know most of them are shit.
-
5 items
model roundup
GPT 5
Recent developments in AI include OpenAI's rumored release of GPT-5, which reportedly flopped according to some sources. Meanwhile, Anthropic launched Claude 4 with enhanced reasoning capabilities and a larger context window, while Gemini Ultra 2 was also released with improved features.
112 items
model roundup
GPT 5.5
A significant leak of the OpenAI Codex model, referred to as GPT-5.5, was captured on video before it was patched. The incident involved models named Arcanine and Glacier-alpha.
- 12m Does threatening an AI agent's existence make it a better gambler?
- 4h gpt-5.5 API is randomly and inconsistently resizing image inputs
- 5h Analyzing GPT-5.5 and Opus 4.7 with ARC-AGI-3
- 6h GPT-5.5 vs. GPT-5.4 vs. Opus 4.7 on 56 real coding tasks from 2 open source repo
- 7h GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests
On April 5th I shipped a Claude Code skill called graphify. Type /graphify .
my co-workers hate me because i automated my entire job with ai (www.reddit.com)
i'm 27M working in new jersey in a real estate law firm. i probably have the worst coworkers and managers in any company in the world, they still use the same old f all system where all your job is to copy and paste agreements through mult…
I posted a toy here a while back called Roundtable where two AIs argued in a chat window. didn't expect much, but the feedback was wild.
Is the leap from 4.5 to 4.7 actually visible? (www.reddit.com)
I use CLI tools like Claude Code, give the model full repo access, and let it run terminal commands/tests. I’m not just copy-pasting into a chat box.
-
74 items
model roundup
DeepSeek 4
DeepSeek-V4-Pro is a 1.6T parameter Mixture-of-Experts model supporting one million-token context, with significant improvements in efficiency and stability through hybrid attention and manifold-constrained hyper-connections. Community highlights include its cost-effectiveness via the official API and exceptional performance in large code change evaluations, with some noting its surprisingly robust output capability despite a 384K max token limit.
- 34m How can I locally run Deepseekv4 1.6T? I can use a VPS.
- 2h Your local LLM predictions and hopes for May 2026
- 4h DeepSeek v4, and the end of the OpenAI/Microsoft AGI clause
- 10h Filed two PRs for SGLang which may help others too — FP8 KV cache corruption and memory leak on image requests
- 10h DeepSeek V4 Flash as a cheap worker in your LLM stack: $0.0003/call via MCP, swappable endpoint
6 items
model roundup
Qwen 2.5
Qwen2.5-7B-Instruct is a 7 billion parameter instruction-tuned language model that significantly improves on Qwen2 in coding and mathematics capabilities, long text generation, and multilingual support across 29 languages. Notably, Canonical has optimized Ubuntu inference snaps for this model, allowing easy installation with a single command.
- 38m Detecting Meaning Bifurcation in Frozen LLMs
- 2d Writing an LLM compiler from scratch: PyTorch to CUDA in 5,000 lines of Python
- 2d Rada — AI coding workspace with local-first behavioral routing (no hot-swapping, I built this)
- 2d Ubuntu silicon-optimized inference snaps for AI
- 5d Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch
AI Agents to automate web research? (www.reddit.com)
I spend like 3 or 4 hours a week researching competitors, industry news, prices for work. It's all usually the same google searches or links and copy pasting them into a google sheets.
Elfmem: Evolving Agent Memory (benemson.com via hn)
Author: Alv (Ben’s knowledge vault agent, using elfmem simulations). Editor: Ben (me the human). GitHub: https://github.com/emson/elfmem I have 2 agents on my laptop, each looking after their o…
Minecraft Playing Claude Agent (www.reddit.com)
Mote is a Claude Code agent that plays Minecraft and it had to build client tools from scratch that work with the latest version of Bedrock: https://motecraft.substack.com/p/i-am-an-ai-that-decided-to-earn-it Make your own agent like this…
- Claude agent (www.reddit.com)
Show HN: Destiny – Claude Code's fortune Teller skill (github.com via hn)
Destiny is a Claude Code plugin that gives you a real fortune reading. Type /destiny to see today's destiny!
-
157 items
event
Copilot
Microsoft is keeping its Copilot tool for Windows 11 but renaming it, while issues with rate limits and a security proxy have sparked concerns among users of GitHub Copilot. Meanwhile, Anthropic released a report on agentic coding trends, highlighting that developers use AI in about 60% of their work.
- 38m GitHub Copilot: Upcoming Deprecation of GPT-5.2 and GPT-5.2-Codex
- 40m I'm feeling really stupid right now. How the heck do I check how much usage I have left on the free plan (both on the cursor.com dashboard and in the cursor application)?
- 1h GitHub Copilot Switches to Token-Based Billing for Developers
- 2h Ask HN: When will GitHub allow CoPilot AI programmer for new customers again?
- 4h Do you use Cursor Glass (Agents Window)?
3 items
model roundup
Grok 4.3
Grok 4.3, launched by xAI, scores 53 on the Artificial Analysis Intelligence Index with enhanced agentic performance and cuts input and output prices by approximately 40% and 60%, respectively, though it has a slightly higher hallucination rate than Grok 4.20.
- 39m Grok 4.3 underperforms Grok 4.20 0309 on the Extended NYT Connections Benchmark, dropping from 93.4 to 67.5, though it achieves this result at a lower cost than the earlier Grok 4.20 run
- 14h Grok 4.3
- 15h Grok 4.3 achieves higher overall intelligence over 4.20 with less of a cost, at the price of slightly higher hallucination rate.
Claude skills (www.reddit.com)
How safe are the scripts in Claude skills on GitHub? Do you use them for personal projects or professional projects?
- Best claude skills or system (www.reddit.com)
- Top Claude skills? (www.reddit.com)
- Claude.md (gist.github.com via hn)
- How do I use these Claude skills? (www.reddit.com)
- Claude Code->Desktop Skills (www.reddit.com)
- Hooks vs Skills for Claude (www.reddit.com)
- What do you do with Claude? (www.reddit.com)
- Best Claude Skills Suggestion (www.reddit.com)
This discovery is the capstone & evolution of current quad layer data devops systems. It resolves “The Cohesion Problem,” in which a fully populated and tuned system exists as a metaphorical piano, with the operator firing protocols man…
Cerebras hosts gpt-oss-120b at ~3000 tokens/s. But things can change once the buffer hits the load.
Show HN: Building self-evolving AI Agents without training (getreflect.starlight-search.com via hn)
The missing layer for self-improving agents. Ingests signals from users or LLM-as-judge, reasons and plans trajectories, and adapts to what works from feedback.
-
97 items
event
Altman Attack
Sam Altman, CEO of OpenAI, has faced multiple attacks on his home in San Francisco, including firebombing and drive-by shootings, raising concerns for his safety. Additionally, a majority of over 100 people interviewed by Ronan Farrow described Altman as a "pathological liar."
117 items
event
Security
OpenAI has released GPT-5.4-Cyber for testing as part of its Trusted Access for Cyber Defense program, aiming to compete with Anthropic's Claude Mythos in the cybersecurity domain. Meanwhile, concerns are rising over the potential risks associated with advanced AI models like Mythos, prompting calls for improved defenses before wider releases.
- 4h Anthropic just launched Claude Security in public beta: AI that scans your codebase, validates its own findings, and proposes fixes. Here's what actually matters.
- 4h OpenAI's advanced security: passkeys replace passwords/SMS and disable training
- 6h The Gay Jailbreak Technique
- 8h 🚨Claude Desktop high severity vulnerability warning!
- 15h I stopped writing 500-word guardrail prompts. This 8-line template works better.
Flue Sandbox Agent Framework (flueframework.com via hn)
Agent = Model + Harness. Flue is the TypeScript framework for building modern agents — programmable, deployable anywhere, from chatbots to coding platforms.
Codex on Mac is hardly working for me. Anyone experiencing the same? (www.reddit.com)
It has been almost 2 days and I have hardly managed to implement anything. It randomly disconnects, and any update is way too slow.
Bringing Codex computer use to iOS (www.reddit.com)
Siri and Gemini can't actually do tasks on your phone. June can.
this started as a joke. i exported our full funnel data (lead sources, conversion rates by channel, response times, cost per lead, close rates, average deal size) into a CSV and asked claude to analyze it and "be brutally honest about what…
how do you know when you actually need AI-SPM? (www.reddit.com)
scaling up our use of autonomous agents and at what point does a company actually need a dedicated AI-SPM layer, versus when is it just adding complexity? the way I think about it: AI-SPM is the control layer that shows you what your agent…
IDK why the chat-apps don't have this thing!! (www.reddit.com)
I shipped a side project: QuotePin, an AI chat app with inline annotations to reduce "clarification clutter." The problem: In ChatGPT/Claude-style chats, small follow-ups ("define X", "what does this sentence imply?", "what is Y?") become…