Single-agent AI coding is a nightmare for engineers (twitter.com via hn) …and I pay my upfront subscription ($200/month), write what I hope is the right prompt (prompt AND context engineer), and wait. 35 minutes later, the agent…
Used Claude Personal Account on Work Computer for a Few Weeks. Will I get fired? (www.reddit.com via reddit) We have been encouraged at work to use Co Pilot and to explore use cases for AI. As part of this exercise I started to (stupidly) use my personal Claude account on my work computer to compare the quality of output.
I built a Claude Code plugin that optimizes your codebase through experiments (autoresearch for code) (www.reddit.com via reddit) Inspired by Karpathy's autoresearch idea — an LLM runs training experiments autonomously to beat its own best score — but applied to code instead of ML training runs. I built this plugin as a way to set up an optimization loop on a codebas…
Gemini Vs Chatgpt (www.reddit.com via reddit) I use ChatGPT to make very detailed, evidence-heavy essays. However, ChatGPT isn't very good at doing it.
Claude Code keeps misreading its own malware instruction as a blanket ban on editing code (www.reddit.com via reddit)
Show HN: AI compatibility without compromises (supercompat.com via hn) Built a library to translate between OpenAI Responses/Assistants APIs and other provider APIs. Provides full compatibility so it’s a total drop-in regardless of which provider you use or which features (computer use, web search).
Claude app Cowork no longer shows thinking process (www.reddit.com via reddit) Has a setting changed? Previously, when I clicked on "thinking", it showed the whole thinking process. Now it shows nothing: it displays "thinking" and, some time later, "Working through a complex response".
Harnessed Performance Benchmarks? (www.reddit.com via reddit) I'm not quite sure what the aftermath of the Anthropic leak was. I know that there's an open source Python project that essentially cloned the code.
Mechanisms of introspective awareness in LLMs [pdf] (arxiv.org via hn) Recent work has shown that LLMs can sometimes detect when steering vectors are injected into their residual stream and identify the injected concept -- a phenomenon termed "introspective awareness." We investigate the mechanisms underlying…
What is the simplest architecture for running a multi-agent system at scale? (www.ashpreetbedi.com via hn) Scaling Agentic Software: Part 1 What is the simplest architecture for running a multi-agent system at scale? I want to deploy agents as a real service.
Sparser, Faster, Lighter Transformer Language Models (arxiv.org via hn) Scaling autoregressive large language models (LLMs) has driven unprecedented progress but comes with vast computational costs. In this work, we tackle these costs by leveraging unstructured sparsity within an LLM's feedforward layers, the…
Dispatch no Longer Replies When a Task Completes (www.reddit.com via reddit) Claude Dispatch no longer creates a reply once it's done with an assigned task; it only replies when the task is started, to confirm that it has begun. I think this is because Dispatch now assigns its tasks to sub-agents in Cowork, so the ta…
Building The payment layer for APIs and AI agents (chexhq.com via hn) Identity, payment, and execution — combined into a single request. Let machines transact autonomously in USDC, settled on-chain, verified in milliseconds.
Bloomberg: OpenAI Takes on Google with New AI Model Aimed at Drug Discovery (www.bloomberg.com via hn)
Most recent update of desktop app completely unusable when invoking skills in Cowork, anyone else getting this today? (www.reddit.com via reddit) https://preview.redd.it/6cep3du3flvg1.png?width=1442&format=png&auto=webp&s=3ca3b852c86bdb20fa1d2d750a1cde66a1230278
Quick question: Should I stick with my M4 Max or grab a Corsair AI Workstation 300 for local LLM stuff? (www.corsair.com via reddit) So I already have a Mac Studio M4 Max (return window still available) with 64GB RAM, but I’m eyeing the Corsair AI Workstation 300 (Ryzen AI Max+ 395, 96GB VRAM out of 128GB, $3,250). Both seem decent for running models locally with Ollama.
How are you actually using AI agents in real workflows right now? (www.reddit.com via reddit) I’m building some infrastructure around AI agents and I’m trying to understand how people are actually using them in real workflows, not demos. Specifically curious about: - What your agent actually does day-to-day (not hypotheticals) - Wh…
Will Google V8 Zebrafish TPU not get anything from Broadcom? (www.reddit.com via reddit) The next two TPUs from Google are the Zebrafish and Sunfish. Zebrafish will be with Google partnering with MediaTek for some components.
GPT 5.4: OpenAI has released GPT-5.4-Cyber for testing and claims it will compete with Claude Mythos. Meanwhile, GPT-5.4 Pro has solved the Erdős Problem #1196, showcasing its advanced capabilities in mathematics.
Remote Controlled agents? (www.reddit.com via reddit) It seems everyone is releasing their version of OpenClaw-like agents. BlackBox, Claude, Kilo Antigravity, and even providers like Kimi and Moonshot.
Google Prepares Rollout of Skills for Gemini and AI Studio (www.testingcatalog.com via hn) Google appears to be preparing a broader rollout of "Skills" functionality across its AI product lineup, with the latest signs pointing to AI Studio's Build section as the next destination. Skills, in this context, are reusable instruction…
I built an MCP server that lets Claude Desktop talk to your Claude Code sessions (www.reddit.com via reddit) I use Claude Desktop for brainstorming and Claude Code for implementation. Both build up deep context on the same project, but they can't see each other.
Thinking about building agents for humans (frontierai.substack.com via hn) Build agents for humans Why copying human workflows is an anti-pattern We’ve been running an experiment since the beginning of the year with a new AI SDR product that we bought. The tool uses signals from activity out in the world to quali…
Need a brutally honest answer: what can realistically be achieved on consumer hardware? (www.reddit.com via reddit) I have a PC with a 4090. I’m also in need of a new MacBook generally.
Cowork Orchestrator Patterns (www.reddit.com via reddit) While working in Cowork, I have been experimenting with designing plugins that try to apply some established agentic patterns to help manage the context window. The problem that I'm running into is with Cowork the main orchestrator is the…
the overlooked trend of building custom ai agents (www.reddit.com via reddit) i keep noticing that a lot of the discussions here don’t really touch on how important it is for companies to build their own AI agents rather than just relying on generic solutions. It seems like there’s this underlying trend where busine…
€54k spike in 13h from unrestricted Firebase browser key accessing Gemini APIs (discuss.ai.google.dev via hn) Hello, We are looking for guidance regarding an unexpected €54,000+ Gemini API charge that occurred within a few hours after enabling Firebase AI Logic on an existing Firebase project. Background: We created the project over a year ago a…
New secret Claude.ai feature gets its own rate limits (www.reddit.com via reddit)
Bench 8xMI50 MiniMax M2.7 AWQ @ 64 tok/s peak (vllm-gfx906-mobydick) (www.reddit.com via reddit) Inference engine used (vllm fork): https://github.com/ai-infos/vllm-gfx906-mobydick/tree/main Huggingface Quants used: cyankiwi/MiniMax-M2.7-AWQ-4bit Relevant commands to run: docker run -it --name vllm-gfx906-mobydick-mixa3607 -v ~/llm/mo…
Built a Claude Code token monitor for Windows — because Mac has several apps for this and we have zero (www.reddit.com via reddit)