#prompt-injection
9 items
Benchmarked Gemma 4 E2B: The 2B model beat every larger sibling on multi-turn (70%) (aiexplr.com via reddit)
Prompt Injection Is Unfixable (So We Stopped Trying) (grith.ai via hn) A security proxy for AI coding agents, enforced at the OS level. Register your interest to be notified when we go live.
Draining Wallets via Prompt Injection in Coinbase AgentKit (457e884c.x402warden-blog.pages.dev via hn) Coinbase AgentKit Prompt Injection: Wallet Drain, Infinite Approvals, and Agent-Level RCE. Reported 13 days after Coinbase launched Agentic Wallets. Validated by Coinbase.
Comment and Control: Prompt Injection in Claude Code, Gemini CLI, and Copilot (oddguan.com via hn) Anthropic Claude Code Security Review, Google Gemini CLI Action, and GitHub Copilot Agent are vulnerable to prompt injection via GitHub comments — turning PR titles, issue bodies, and issue comments into attack vectors for API key and toke…
How my agents know it's actually me sending commands (and not a prompt injection) (www.reddit.com via reddit)
I built an AI security layer that blocks prompt injection in under 1ms looking for devs to break it and give honest feedback. (www.reddit.com via reddit) I've been building something for the past few months and I think it's ready for real eyes. It's called Secra.
Free Red Team Security Audit for AI Agents & RAG Systems (limited) (www.reddit.com via reddit)
Defender – Local prompt injection detection for AI agents (no API calls) (www.npmjs.com via hn)
For those running an OpenClaw instance, how do you manage sandboxing and prevention of unwanted behavior? (www.reddit.com via reddit) Right now, I'm working on a small app to help eliminate my own doomscrolling by automatically crawling sites and summarizing news articles. However, I don't like the idea of giving OpenClaw free rein of my system, nor giving it any sort o…