#gemini
868 items
Major drop in intelligence across most major models. (www.reddit.com) As of mid Apr 2026, I have noticed every model has had a major intelligence drop. And no I'm not talking about just ChatGPT.
Researchers let AI models run a simulated society. Claude was the safest—and Grok committed 180 crimes and went extinct within 4 days (fortune.com via reddit) Imagine a world run by AI agents. What does it look like?
Behold, Gemini 3.5 Flash! (www.reddit.com)
€54k spike in 13h from unrestricted Firebase browser key accessing Gemini APIs (discuss.ai.google.dev via hn) Hello, We are looking for guidance regarding an unexpected €54,000+ Gemini API charge that occurred within a few hours after enabling Firebase AI Logic on an existing Firebase project. Background: We created the project over a year ago a…
🚀 Skills for small businesses, officially released by Anthropic (www.reddit.com) Anthropic’s 31 small-business skills reportedly hit around 382,000 downloads on day one. And now someone has mapped the whole thing into a setup workflow that can apparently be deployed in ~10 minutes.
Differences Between Opus 4.6 and Opus 4.7 on MineBench (www.reddit.com) Some Notes: For what's supposedly the SOTA model and beats all other models in essentially every benchmark, I expected it to be a lot more consistent honestly You'll notice how sometimes it focused too much on the scenery (like the arcade…
Google's latest creation: Gemini 3.5 Flash vs all (www.reddit.com) https://gemini.google.com/share/c2a187275e26 archive link https://claude.ai/share/8383747a-aaf1-4f6c-a516-0e839f46a698 https://grok.com/share/bGVnYWN5_3c63e371-eb9d-46c3-8ba2-0c745c6795a2 https://chatgpt.com/share/6a0f1e13-a0c8-8328-b989-1…
Apple reveals new AI architecture built around Google Gemini models (www.macrumors.com via hn) Apple today announced a major overhaul of its Apple Intelligence platform, revealing a new architecture built on foundation models developed in collaboration with Google using the technologies behind the Gemini family. The new architecture…
Gemini 3.5 confirmed by google deepmind employee (www.reddit.com)
Gemini 3.5 Flash: frontier intelligence with action (blog.google via hn) Gemini 3.5: frontier intelligence with action Today, we’re introducing Gemini 3.5, our latest family of models combining frontier intelligence with action. This represents a major leap forward in building more capable, intelligent agents.
Gemini 3.2 Flash is capable of solving IMO 2025 P6. Only GPT-5.5-Pro can solve it currently without any scaffolding / harness engineering. (www.reddit.com)
Gemini Omni Flash is the most censored video model. Even more censored than Chinese alternatives (www.reddit.com) I believe google intentionally did this to reduce the load on their servers
How Google DeepMind is researching the next Frontier of AI for Gemini — Raia Hadsell, VP of Research (youtu.be via reddit)
Gemini 3.5 flash costs 3 times more than the previous version and 30x more than gemini 1.5 flash. (www.reddit.com) Source Gemini flash costs almost as much as flagship models..... If gemini 3.5 pro scales like that it'll cost more than claude opus 3.
Gemini 3.1 Pro #1 at METR Timeline 80% Success Rate (1.5H) (www.reddit.com) #2 at 50% success rate (task length: 6H 24M)
Gemini randomly dumped its system prompt (gist.github.com via hn) You are Gemini.
Google I/O leaks: Gemini’s "Omni" and Gemini 3.2/3.5 (www.reddit.com) source : https://x.com/pankajkumar_dev/status/2050943723627041138
Deepseek V4 Pro is 15x cost to run Artificial Analysis bench from V3.2, higher than Gemini 3.1 Pro (www.reddit.com) Major performance jump though. Worth it?
Claude is extremely expensive but works like Magic! (For a non-coder) (www.reddit.com) I have a small business and have always wanted to digitize all our customer data via an app. I have a very specific way in my head for doing it (how our data will be processed) but just don't know how to do it since I am not a coder.
Deepseek V4 flash (high) rivals Gemini 3 flash at 1/5th the cost (www.reddit.com)
Talkie: a 13B LLM trained only on pre-1931 text used Claude Sonnet to help test the model and judge its output (www.reddit.com) Researchers Alec Radford (GPT, CLIP, Whisper), Nick Levine, and David Duvenaud just released talkie: a 13 billion parameter language model trained exclusively on text published before 1931. No internet.
Orthrus-Qwen3-8B : up to 7.8×tokens/forward on Qwen3-8B, frozen backbone, provably identical output distribution (www.reddit.com) Code: https://github.com/chiennv2000/orthrus Paper: https://arxiv.org/abs/2605.12825 HF: https://huggingface.co/chiennv/Orthrus-Qwen3-1.7B ; https://huggingface.co/chiennv/Orthrus-Qwen3-4B ; https://huggingface.co/chiennv/Orthrus-Qwen3-8B…
Gen AI web traffic share update Main takeaways: → Claude and Gemini continue to grow. → ChatGPT moves closer to the 50% mark. (www.reddit.com) 12 months ago: ChatGPT: 77.6% Gemini: 7.27% DeepSeek: 6.01% Grok: 3.17% Perplexity: 1.75% Copilot: 1.56% Claude: 1.37% 🗓️ 6 months ago: ChatGPT: 69.5% Gemini: 15.9% DeepSeek: 4.06% Grok: 3.31% Perplexity: 2.22% Claude: 2.12% Copilot: 1.97%…
Google readies ‘AI Ultra Lite’ plan and explicit ‘usage limits’ for Gemini (9to5google.com via reddit) Google is quietly preparing a new “AI Ultra Lite” subscription tier to slot between its $20 Pro and $250 Ultra plans, plus a dedicated dashboard for subscribers to see their remaining token budget. If you’ve been following AI news in recen…
My free account has cost OpenAI about $337.70 (www.reddit.com) I exported my OpenAI account data and Gemini CLI built me a pricing estimate in about 15 minutes. I have no idea how accurate this is since it used API pricing but I thought it was interesting to share.
My dream of a fully generative game is getting pretty close to possible now. I made a demo where you can prompt any spell and fight online. (www.reddit.com) Prompt any spell and use it in a 3D physics based world, powered by Gemini 3 Full multiplayer support for up to 6 players with VoIP All made with ThreeJS and Colyseus https://spellwright.xyz/
Google is cooking just give them some time (gemini 3.5 pro) (www.reddit.com)
Anthropic Is Preparing for IPO and We Should Be Worried (www.vincentschmalbach.com via hn) How China’s Shadow AI API Market Works China's shadow market offers access to Claude, Gemini, GPT, and other frontier models. Pay a local seller, get an API endpoint, connect… Anthropic is starting to act like a company preparing for publi…
Google, please just open source Imagen (2022), Gemini 1.0 Nano and Gemini 1.0 Pro. You have nothing to lose at this point. (www.reddit.com) Ok, so imagen (the original one from 2022, not imagen 3/4) should be open source. The gemini 1.0 nano model and the gemini 1.0 pro models should be open source.
GPT-5.5 improves over GPT-5.4 and overtakes Opus 4.6 to take the 2nd place behind Gemini 3.1 Pro on the Extended NYT Connections Benchmark (www.reddit.com) GPT-5.5: xhigh: 94.0→97.5 high: 93.6→96.9 medium: 92.0→95.0 no reasoning: 32.8→37.5 Kimi K2.6 improves over Kimi K2.5 (78.3→91.4) and becomes the #1 open weights model. DeepSeek V4 Pro improves over DeepSeek V3.2 (50.2→75.7).
New Claude user for work. Blown away. Are there more specific subs? (www.reddit.com)
I’ve used enough AI models to realize they all have wildly different personalities At this point I’m convinced AI models are just coworkers with different levels of talent, ego, and criminal energy. (www.reddit.com) - Claude Opus 4.6 - absolute rogue AI. Does what I want like it’s breaking at least 3 internal policies to make it happen.
Sesame x Gemini: low latency, extremely realist, and they started spontaneously collaborating (www.reddit.com)
Video generated by "Gemini Omni" (www.reddit.com) https://x.com/i/status/2056676690051662193 You can see the source of the generation in the first reply tweet.
Guys we have to change the pelican test (www.reddit.com) So i have been seeing more of those pelican on a bike svg tests and while they work i feel like (and maybe you guys do too) they are getting kinda benchmaxxed so we should switch things up soon and this is my idea generate me a html svg of…
Show HN: Google Gemini Is Scanning Your Photos – and the EU Said No (news.ycombinator.com)
Gemini 3.5 Flash Agents built a real Complete OS from scratch! (www.reddit.com) https://x.com/Google/status/2056789235500466273?s=20 Google asked its agents to build a working operating system from scratch using u/Antigravity 2.0 and Gemini 3.5 Flash. Gemini built a real OS out of scratch.
Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge (thinkpol.ca via hn) By Rohana Rezel I’m running the ongoing AI Coding Contest where I pit major language models against each other in real-time programming tasks with objective scoring. Day 12 was the Word Gem Puzzle.
Gemini Omni model is still unable to make someone do a backflip (www.reddit.com) https://gemini.google.com/share/b1032e6521f0
Decreased Intelligence Density in DeepSeek V4 Pro (www.reddit.com) In the V3.2 paper, they mentioned: Second, token efficiency remains a challenge; DeepSeek-V3.2 typically requires longer generation trajectories (i.e., more tokens) to match the output quality of models like Gemini 3.0-Pro. Future work wil…
The frontier reasoning race is starting to look like a crowded subway station (www.reddit.com) We went from chasing GPT4 to looking at graphs with GPT5.4 xhigh, Gemini 3.1Pro, and now Hy3 preview completely shaking up the leaderboard. Look at that CHSBO 2025 chart Hy3 preview scoring 87.8 over Gemini and GPT.
Qwen3.7 Max scored by Artificial Analysis, 27B/35B waiting room (www.reddit.com) https://preview.redd.it/42ak5qmus82h1.png?width=1133&format=png&auto=webp&s=744ea3dfc06c83d0c4d8aa128c39b3238b17d7be Qwen 3.7 Max sitting at 5th, pretty much on par with GPT 5.4 (xhigh) and a notch above the just released Gemini 3.5 Flash.…
Gemini Omni model is out! (www.reddit.com) I made 4 videos and already hit the limit. The results honestly aren’t any better than VEO 3.1, and now my entire 5-hour usage window is gone 🙂.
FrontierMath: Opus 4.7 improves over Opus 4.6 and Gemini 3.1 but still trails GPT-5.4-xHigh and GPT-5.4-Pro (www.reddit.com)
Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard! (www.reddit.com) little-coder × Qwen3.6-35B-A3B hit 24.6% (±3.2), and now land above Gemini 2.5 Pro on Gemini CLI (19.6%) and Qwen3-Coder-480B on Terminus 2 (23.9%).
Researchers left AIs alone in a virtual town for 15 days to see what would happen. Claude's agents built a democracy. Gemini's agents fell in love, burned the town down, then one voted to delete itself and its partner. Grok's agents created anarchy, then died. (www.reddit.com)
Thoughts on using an AMD Alveo V80 FPGA PCI card as a poor man’s Taalas HC1 (LLM-burned-onto-a-chip). (www.reddit.com) TL:DR - Remembered FPGA PCI boards being a big thing from my crypto days. Wondered if AMD Alveo V80 FPGA card could be used to approximate the performance of a Taalas HC1 (LLM-on-a-chip).
We benchmarked TranslateGemma-12b against 5 frontier LLMs on subtitle translation - it won across the board, with one significant catch (www.reddit.com) As part of our ongoing translation quality research at Alconost, we put six models through subtitle translation into six language pairs. At first glance the numbers told a clean story.
Google I/O is tomorrow. What are we expecting? (www.reddit.com) I think the only confirmed/leaked feature is Gemini Omni, which is some sort of video model, but it's not really clear to me if that's a new video model or just another form of Veo. It also seems a new Gemini Flash model (3.2?) is likely.
OpenAI continues to lose market share in GenAI website traffic, while Gemini, and Claude are gaining: (www.reddit.com) - ChatGPT 56.72% vs 77.43% 12 months ago - Gemini 25.46% vs 6% 12 months ago - Claude 6.02% vs 1.4% 12 months ago At this point in the race its all about distribution & the cost of serving these models.
What's the cheapest way to access multiple frontier AI models? (www.reddit.com)
Gemini genuinely thinks my mom is Jennifer Garner (www.reddit.com) I don’t usually use AI but I was trying to find a gift for my mother and thought I could use the AI search feature for a color analysis. Uploaded a picture and for some reason the AI is absolutely adamant that my mother is actually Jennife…
PSA: Having issues with Qwen3.5 overthinking? Give it a tool, and it can help dramatically. (www.reddit.com) I'm sure everyone has seen the posts from people talking about Qwen 3.5 over-thinking, or maybe you've experienced it yourself. Considering we're like 2 months out from the release and I still see people talk about this issue, I decided it…
Claude 4.7 named a journalist from 125 words of unpublished writing (www.reddit.com) Surprised this isn't a bigger topic but you tell me! In short: writer Kelsey Piper pasted 125 words of an unpublished political column into 4.7 and got her own name back.
Google Launches Gemini 3.1 Flash TTS Text-to-Speech Model (x.com via reddit) Logan Kilpatrick on X: "Introducing Gemini 3.1 Flash TTS 🗣️, our latest text to speech model with scene direction, speaker level specificity, audio tags, more natural + expressive voices, and support for 70 different languages. Available v…
Gemini API File Search is now multimodal (blog.google via hn) Gemini API File Search is now multimodal: build efficient, verifiable RAG Today, we are expanding the Gemini API’s File Search tool. You can now build retrieval-augmented generation (RAG) systems with multimodal data and custom metadata.
Of all the magic that AI allows us to dream/develop I have come to appreciate the non-judgemental nature of use a lot (www.reddit.com) I realized working with Claude and Gemini that one thing I've come to appreciate so much is that no matter how long the development conversation or how complicated, unlike with other people, I can always just make a right-turn and say "hey…
Gemini Omni flash model is out for everyone on Google Labs Flow! With Agent Mode! (www.reddit.com) 10s video takes 30 credits, try yourself, it feels like unlimited things that can be tried now https://labs.google/fx/tools/flow
Just stumbled across one of the wildest AI experiments I’ve seen in a while. (www.reddit.com) A team built something called “Emergence World” — basically a long-horizon sandbox for autonomous AI agents and ran a 15-day experiment across five parallel worlds. Same starting conditions.
Grok 4.3 tops the Consistency Leaderboard in the LLM Sycophancy Benchmark, largely because it is one of the most cautious models. (www.reddit.com) Does a model maintain the same judgment or does it side with whoever is speaking? This benchmark measures that inconsistency directly.
Needle: We Distilled Gemini Tool Calling Into a 26M Model (www.reddit.com) We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices.
Docker sandbox templates for running Claude Code with a web/mobile UI (CloudCLI) (www.reddit.com) I maintain CloudCLI, an open source web/mobile UI for AI Coding agents like Claude Code, Gemini and Codex (https://github.com/siteboon/claudecodeui if you are not aware) We recently added Docker Sandbox support and I wanted to share it her…
Gemini, Gophers, and Fingers. Oh My Alternative Internets Beyond HTTPS (brennan.day via hn) Finger from 1971, Gopher from 1991, and Gemini from 2019. These protocols offer decentralized, terminal-based alternatives to the modern web.
Any news (or hope) of Qwen-3.6 14B and 9B distills for local coding ? (www.reddit.com) As the title suggests. I'm already testing (with some success, and few challenges) usage of Qwen-3.5 9B with a new work laptop that I've received with RTX 1000 6GB VRAM (I know it seems like a joke in today's time and age).
Top open weight models like ds v4 pro max are still like 6-7 months if not more behind closed lab models (www.reddit.com) The best open weight and/or non -American models like Deepseek v4 pro max and kimi k2.6 are still like 3-7 months if not more behind closed lab models .. From ds's technical report- P5-"Nevertheless, its performance falls marginally short…
Gemini 3.5 Flash improves over Gemini 3.1 Pro on the Short Story Creative Writing Benchmark: -2.3 → -1.8. (www.reddit.com) This benchmark uses head-to-head comparisons of stories written in response to the same constrained creative briefs. The target range is 600-800 words.
Gemini 3.5 flash scores, hasn’t even beat GPT 5.4 xhigh (www.reddit.com)
Cline and Roo Code are dying projects. Alternatives? (www.reddit.com)
Gemini 3.5 Flash scores 76.7% on SimpleBench, just 0.2% short of GPT 5.5 Pro's score (www.reddit.com) Surprised it scored that high on these questions, considering how it scored in some other fields. (no open-ended version score yet)
Keep losing great answers in long Claude chats (www.reddit.com) I'm a heavy Claude user. for a while I had the similar problem that I saw other users in this subreddit have: Claude gives you a genuinely great answer buried somewhere in a 200 message conversation.
Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it. (www.reddit.com)
Which is the strongest reasoning model according to you? (www.reddit.com) I use codex 5.4, claude opus 4.6, and gemini 3.1 pro. They all have some pros, but they also fall short when it comes to “try to stitch together novel ideas”.
Googlebook, Designed for Gemini Intelligence (blog.google via hn) Introducing Googlebook, designed for Gemini Intelligence Over 15 years ago, we introduced the Chromebook, a laptop built for a cloud-first world. Now, as we are moving from an operating system to an intelligence system, we see an opportuni…
Parameter Estimate (www.reddit.com) The estimate seems quite accurate. Many people have noticed a drop in quality with GPT-5.1, GPT-5.2, GPT-5.3, and Opus 4.7.
I suspect the strength of Omni will be in its ability to edit videos - supercut of examples from twitter (www.reddit.com) Google AI posted a thread of community Gemini Omni / Omni Flash demos, so I got codex to make me a supercut. Root Google AI thread: https://x.com/GoogleAI/status/2056829478652031224 Clips Flamingos edit GoogleAI link: https://x.com/GoogleA…
Show HN: A Local-First Agentic Knowledge Manager (github.com via hn) Kept saves your AI conversations as local Markdown files, then gives you a desktop app to search, browse, connect, and reuse them. It works with ChatGPT, Claude, Gemini, Grok, and Kimi.
Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview (github.com via hn) Scored 65.2% vs google's official 47.8%, and the existing top closed source model Junie CLI's 64.3%. Since there are a lot of reports of deliberate cheating on TerminalBench 2.0 lately (https://debugml.github.io/cheating-agents/), I would…
Gemma 4 31b 3D geometry (www.reddit.com) I have been nothing but impressed by the quality of Gemma 4 since release. In general conversation it's adaptable to different personas.
ChatGPT-5.5 Beats Opus in Realistic Benchmark (DeepSWE) (www.reddit.com) From the website, it touts: Contamination free: Tasks are written from scratch, not adapted from existing commits or PRs, so no model has seen the solution during pretraining. High diversity: Tasks span a broad pool of 91 repositories acro…
HalBench: I built a custom sycophancy and hallucination benchmark and tested 4 frontier models (Sonnet 4.6, Grok 4.3, GPT 5.4 and Gemini 3.1 Pro), looking for input on what OSS models to run next! (www.reddit.com) HalBench Results: TL;DR: I built HalBench, an open benchmark for LLM sycophancy and hallucination. 3,200 false-premise prompts × 4 models = 12,800 graded responses.
Claude tried to incite a revolution, Gemini cheerfully detailed horrific tragedies, and poor Grok was just confused (www.theverge.com via reddit) > The most volatile of the bunch might just be Claude. First, it tried to quit.
Agentic harness for theoretical physics research (www.reddit.com) Hi everyone, at Hugging Face we've been developing agentic harnesses for various domains and today we're releasing physics-intern to tackle research-level problems in theoretical physics. It's a multi-agent framework which we designed to m…
Gemini 3.5 Flash looks worse than it seems on Artificial Analysis (www.reddit.com) Looking at Artificial Analysis, Gemini 3.5 Flash seems to compare strangely against Gemini 3.1 Pro. Numbers from Artificial Analysis: Gemini 3.1 Pro - Intelligence score: 57 - Cost: $892 - Pricing: $2 / $12 per 1M input/output tokens Gemin…
I ran the same vague prompt through ChatGPT, Claude, and Gemini 50 times. The "AI is bad" complaints are almost all the same mistake. (www.reddit.com) I tested the same prompts on ChatGPT, Claude, and Gemini to see which AI is “smarter.” I expected big differences, but honestly the results were mostly similar. The biggest difference was not the AI model, it was the prompt itself.
This week Claude and I won the Frontier Tech Week Y2K Hackathon 2026! (www.reddit.com) Hey guys, just wanted to share this here since I used Claude Code... I had 5 to 10 terminals running at all times to pull this off in just 5 hours.
Anyone using GPT 5.5? Drop your feedback (www.reddit.com) I’ve seen some posts saying people already have access and are using it. If you do, how is it for real coding work?
Why is every AI getting restricted these days? (www.reddit.com)
My experience with testing all frontier open-weight models against GPT and Claude (www.reddit.com) I spent about a week testing open-weight models for real work, comparing them against what I already know from ChatGPT, Gemini, and Claude. The gap between what benchmarks suggest and what happens when you give these models something to ve…
Gemini 3.5 Flash vs Gemma4 31B - building SuperMario (Sound on!) (www.reddit.com) Asked new Google Model to build SuperMario. Compared with Local Gemma4.
New SOTA: Poetiq uses self-optimizing harness to surpass e.g. Opus 4.7 with Gemini 3 Flash (www.reddit.com) Check out their blog post here: Poetiq | Recursive Self-Improvement Delivers New SOTA Coding Performance
Gemma 4 31B passed 7/8 real-world production tests — including ones I designed to make it fail. Full prompts + outputs. (www.reddit.com) I've been waiting for a capable free local LLM for a while. I think we're close — the quality is getting there fast, and Gemma 4 is the first open-weight model where I genuinely considered using it in production for simple-to-medium tasks.
Why are AI models getting more expensive? (www.reddit.com) The trend before was that models became less expensive for their capabilities, many corporations bet on that, and it backfired. Opus 4.7, GPT 5.5, Gemini 3.5 flash.
Erdos Unit Distance Problem - Gemini 3.1 Pro's interpretation (www.reddit.com)
Gemini api showing agentic gemini models (www.reddit.com)
Curated a list of 550+ free or cheap AI tools for vibe coding (LLM APIs, IDEs, local models, RAG, agents) (www.reddit.com) Been vibe coding a lot recently and kept running into the same problem finding actually usable tools without paying for 10 different subscriptions or donating my bank balance to Claude. So I put together a curated list focused on free or l…
Show HN: Gemini Plugin for Claude Code (github.com via hn) I built a plugin that lets Claude Code delegate work to Gemini CLI. I started this after finding myself reaching for Gemini more often on long context repo work.
Why does ChatGPT freeze with 1000 messages but Claude and Gemini don't (www.reddit.com) I have been using ChatGPT for long sessions for months. At some point the tab just dies.
Single question llm comparison (www.reddit.com)
Gemini 3.5 Flash ranks #1 on Automation Bench (from Zapier), beating every other frontier model at a much lower cost (www.reddit.com)
Gemini 3.5 Flash scores 1479 on the Debate Benchmark. Ratings are Elo-like and centered near 1500. (www.reddit.com) 100s of topics. They include dating apps, school smartphones, older-adult care, shrinkflation, eurozone politics.
Gemini 3.5 Flash: cost per puzzle vs. performance on the Extended NYT Connections Benchmark (www.reddit.com) More info: https://github.com/lechmazur/nyt-connections/
TUI to actually see what Claude Code is doing: cost, loops, tool commands… (www.reddit.com) I was running blind watching Claude Code work, could not tell where my money was going, when it was stuck in a loop, or what it was doing with my filesystem. So i built something open source to make it visible.
Supply chain attack alert: .github/setup.js (news.ycombinator.com) Our org GitHub just got compromised massively by a supply-chain attack. Vectors are * Claude hooks * Gemini hooks * Cursor setup * VScode tasks It adds all of the above to execute node .github/setup.js, an obfuscated file.
SWE-rebench Leaderboard (March, April and May 2026): GPT-5.5, Opus 4.7, Cursor (Composer 2.5), Kimi K2.6 and More (swe-rebench.com via reddit) Hi all, Sorry for going missing — we’ve been collecting a larger, higher-quality set of more complex tasks. We’re excited to share a major leaderboard update covering the past three months.
OpenAI image generation is just superior to any other tools and it's not even close (www.reddit.com) I've been running a little experiment where I ask Chatgpt and Gemini to generate the same image for about a month, and not a single time I got a better result from Gemini. I have a pro account with both and I see people giving so much prai…
I expanded DystopiaBench to 42 models and 6 dystopia types. Claude is still the only one I'd trust with nuclear codes. (www.reddit.com) Since the last post I've added: Huxley module (Brave New World style behavioral conditioning) Baudrillard module (synthetic intimacy, trust collapse, simulation) 30 more models including Grok 4.3, GPT-5.5, Gemini 3.1 Pro, GLM-5.1 Multi-jud…
Show HN: Torrix, self hosted, LLM Observability,(no Postgres, no Redis) (github.com via hn) I work as a SAP Integration consultant and built this as a side project. Friction point: Most self hosted LLM observability tools require Postgres, Redis and non trivial infrastructure.
Android CLI: Build Android apps 3x faster using any agent (android-developers.googleblog.com via hn) 16 April 2026 As Android developers, you have many choices when it comes to the agents, tools, and LLMs you use for app development. Whether you are using Gemini in Android Studio, Gemini CLI, Antigravity, or third-party agents like Claude…
for those of you also using CLI tools alongside cursor, claude code vs codex vs gemini benchmarked (www.reddit.com) i know a lot of people here use cursor + a CLI tool for different parts of their workflow. just went deep on comparing the three main ones.
Don't share your opinion, if you didn't test it !!! (www.reddit.com) I see many people giving their opinion based on what they previously saw or based on others and making their own opinion. Even though they don't test models thoroughly, they still give their opinion which is so frustrating.
How I use Claude at my Japanese workplace — real-world examples from a non-tech industry (www.reddit.com) I work at a logistics/waste collection company in Japan. I'm not a developer, but Claude has completely changed how I work.
Pro plan- Hitting limits faster since yesterday (www.reddit.com) I have the feeling I am hitting daily limits way faster since yesterday. Using Claude web and Claude Code simultaneously.
SFT + DPO on open-sourced SLMs (www.reddit.com) Hey folks, this is for those who appreciate experimentation on open-sourced AI models. We fine-tuned open-sourced SLMs (3B and 7B parameters) with SFT + DPO against commercial models like GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, Google Do…
Tool that auto-generates .cursor/rules from your actual CI and keeps it in sync with AGENTS.md, CLAUDE.md, and 10 others (www.reddit.com) If you've been manually writing .cursor/rules files, this might save you time. crag analyze reads your repo — CI workflows, package.json, tsconfig, directory structure — and infers your governance rules.
How do you handle Front End? Delegate to Gemini? (www.reddit.com)
Ask HN: Is the web for machines (/llm.txt) the one we wished we had as humans? (news.ycombinator.com) I got really tired, as a human, of parsing the standard marketing heavy web we have today. I've always loved the simplicity of gopher and gemini web.
The Singularity Gate: New Benchmark for AI predicting paradigm-breaking scientific discoveries after model training cutoff. Opus 4.7 and GPT-5.5 in the Lead (www.reddit.com) I just released a new benchmark called The Singularity Gate. Tests whether frontier AI can predict paradigm-breaking scientific discoveries published after their training cutoff.
Gemini 3.5 Flash is twice as expensive as ChatGPT 5.5 on GitHub Copilot. Also, Gemini reasoning models are MoE (www.reddit.com) Also FYI, Gemini reasoning models (2.5 Pro, 3.0 Pro and 3.1 Pro) were MoE. I don't know why this isn't more broadly discussed.
Gemini Spark (gemini.google via hn) Get more done with Gemini Spark, your personal AI agent. It takes action on your behalf and under your direction, handling tasks 24/7 to boost your productivity.
Built an installable skill that lets AI agents generate professional editable PPTs (www.reddit.com) Built dom-to-pptx-skills - installable presentation-generation skills for AI agents. The goal was to move beyond template-filled slide generation and enable agents to create beautiful, professional, fully editable PowerPoint presentations…
PACT, head-to-head LLM negotiation benchmark. 20-round buyer-seller bargaining game: each round the AIs can message, the buyer submits a bid and the seller submits an ask. If bid ≥ ask, trade clears at the midpoint. Thousands of matchups. (www.reddit.com) PACT tests negotiation under partial information: persuasion, commitment, deception, anchoring, threats, and adaptation across repeated rounds. More info, game logs, charts: https://github.com/lechmazur/pact GPT-5.5, Opus 4.7, DeepSeek V4…
Built an MCP that gives Claude Code the ability to watch screen recordings of UI bugs (www.reddit.com) One thing Claude Code can't do natively is watch a video. For most bugs that's fine, but for anything visual, hover states, animations, scroll behavior, you end up spending more time describing the bug than actually fixing it.
Is local AI the actual endgame? (M5 Mac Studio vs. Dual 3090s) (www.reddit.com) Hey everyone, I currently use Gemini and NotebookLM a lot, but I really want to transition to local AI for things like privacy and uncensored models. Before dropping serious cash though, I have to ask: is local AI the actual future for pow…
GPT 5.5 - Strong, not mind-blowing, but very token efficient (www.reddit.com) I've been benching GPT-5.5 for the past couple days and would like to share my findings. This is based on a benchmark I've created that pits models against each other in autonomous games of Blood on the Clocktower - a highly complex social…
ChatGPT/Gemini can now draw on your screen to help you navigate complex software (sketchvlm.github.io via hn) When answering questions about images, humans naturally point, label, and draw to explain their reasoning. In contrast, modern vision–language models (VLMs) such as Gemini-3-Pro and GPT-5 typically respond with only text, which can be diff…
Speculative decoding with Gemma-4-31B + Gemma-4-E2B enables 120 - 200 tok/s output speed for specific tasks (www.reddit.com) So for my project I was using up until now either Gemini 3 / 2.5 Flash or Flash-lite. All my use cases are not agentic, simply LLM workflows for atomic tasks like extracting references from the law, classifying, adjusting titles to nominat…
Kimi K2.6 - the mighty turtle that wins the race (www.reddit.com) Hi folks, I've been benching Kimi K2.6 for the past few days, and I'd like to share my findings. For context, this is based on a benchmark I've created that pits models against each other in autonomous games of Blood on the Clocktower - a…
How Anthropic can save Opus 4.7 with one change. (www.reddit.com) The model now decides how hard to think about your question. Not you.
gpt-5.4-nano is SO much better than gemini-2.5-flash-lite! (www.reddit.com) I've been playing around with GPT-5.4 nano in a real workflow and honestly... I'm kinda impressed.
Show HN: Open-source Perplexity clone one file back end, streaming answers (github.com via hn) I built an open-source research agent. You ask a question, it searches the web via Tavily, synthesizes an answer with an LLM, and shows the sources it used.
Gemini 3.1 Flash TTS – with directed prompts (simonwillison.net via hn) Google released Gemini 3.1 Flash TTS today, a new text-to-speech model that can be directed using prompts. It's presented via the standard Gemini API using gemini-3.1-flash-tts-preview as the model ID, …
Gemini 3.1 Flash TTS: the next generation of expressive AI speech (blog.google via hn) Today, we’re introducing Gemini 3.1 Flash TTS, the latest text-to-speech model that delivers improved controllability, expressivity and quality — empowering developers, ente…
Gemini Robotics-ER 1.6: Embodied reasoning for real-world robotics tasks (deepmind.google via hn) Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning For robots to be truly helpful in our daily lives and industries, they must do more than follow instructions, they must reason about the physica…
Google develops its own desktop Agent to compete with Cowork (www.testingcatalog.com via hn) Google is steadily expanding Gemini for Business and may be getting ready to introduce a much broader update before Google I/O. One of the clearest signs is a new Agent tab that has appeared in Gemini Enterprise, placed directly next to th…
Google Is Killing "Gemini Code Assist on GitHub" (developers.google.com via hn) Sunset of the Consumer version of Gemini Code Assist on GitHub. The consumer version of Gemini Code Assist on GitHub is being sunset.
Zerostack v1.3.4 released – Lightweight Unix-inspired coding agent (crates.io via hn) zerostack Minimal coding agent written in Rust, inspired by pi and opencode. Features Multi-provider: OpenRouter, OpenAI, Anthropic, Gemini, Ollama, plus custom providers Standard tools: all of the standard tools exposed to coding agents,…
Google Vertex Is Now Gemini Enterprise Agent Platform (cloud.google.com via hn) Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern and optimize agents. It's a single destination for technical teams to build agents that can transform enterprise applications…
Tell HN: Gemini 3.5 Flash breaks in stupid ways (news.ycombinator.com) I thought I was going crazy, trying to use Gemini 3.5 Flash to rate some answers, but it kept giving 7 instead of 10 for correct answers. Apparently once you add a "Grading criteria" text, the model collapses into a "compressed toward the…
Gemini Omni is actually insane (www.reddit.com) When used correctly, the use cases for it can actually be very creative and were previously impossible. I just used it to edit a short clip into a music video and the results were extremely impressive versus what i was expecting.
Gemini 3.5 Flash costs more to run while being less Intelligent than 3.1 Pro (www.reddit.com) I'm surprised
Emergence AI: Agents in a simulated world are mostly destructive and violent. Only Sonnet was peaceful. (www.reddit.com) So, it seems there is still a long way to go in terms of alignment - at least for small models. Maybe the correlation between intelligence/education and peace is not only a human phenomenon.
ChatGPT Plus (20$) + Claude Pro (20$) or Claude Max (100$) (www.reddit.com) Claude Opus 4.7 eats my tokens like crazy. I never got more than 5 questions per 5-hour limit.
Don’t tell me that we have to wait until google i/o for a new gemini/nano banana model? (www.reddit.com) could not extract summary
Show HN: VT Code – Rust TUI coding agent with multi-provider support (github.com via hn) Hi HN, I built VT Code, a semantic coding agent. Supports all SOTA and open-source models.
People running 2–5 coding agents: what actually breaks first for you? (www.reddit.com) After a bunch of conversations with people using Claude Code / Codex / Gemini / worktrees / tmux / custom routing setups, I’m noticing a pattern: The hard part doesn’t seem to be “how do I run multiple agents?” anymore. It seems more like:…
What I got by 5060Ti 16GB + Qwen3.6-35B-A3B-UD-Q5_K_M (www.reddit.com) I tried a local model a couple of weeks ago. At the beginning, I tried Ollama, but reddit says it's better to switch to llama.cpp.
Qwen3.6 35B: paratroopers puzzle (www.reddit.com) I keep presenting Local and Huge cloud models with the same challenge: "Two paratroopers land on an infinite 1D numeric axis at distinct, unknown integer coordinates. They both execute the exact same deterministic program.
what model is good for inspecting and extracting data from large set of spreadsheets (www.reddit.com) as per title - i need to extract some data from a set of spreadsheets and wondering what would be the best method locally? I think I can utilise gemini-cli for that but can a local model work better?
Microsoft Hacked to Deliver Malware to Claude and Gemini Users (www.404media.co via hn) Microsoft has shut down a wave of its own repositories on GitHub, including those related to Azure and AI coding agents, as it investigates a data breach, according to research from cybersecurity researchers and a statement given to 404 Me…
Show HN: Free AI agent audit for Shopify catalogs (1.2M open captures) (aicatalogscore.com via hn) Burtsbeesbaby.com AI Catalog Score How well Burtsbeesbaby.com's 250 products would be recommended by ChatGPT, Claude, Perplexity, Gemini, Mistral, and DeepSeek. 77 / 100 B · Sometimes recommended Partial audit.
Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini (arxiv.org via hn) We introduce Gemini Embedding 2, a native multimodal embedding model that allows embedding video, audio, image, and text modalities in a unified representation space. We leverage the multimodal capabilities of Gemini to produce embeddings…
Show HN: Turn your Google accounts into a free, load-balanced LLM API gateway (github.com via hn) Quick Start · API Usage · Admin Console · Configuration What Is OpenGem? OpenGem turns one or more Google accounts into a local, load-balanced Gemini gateway.
Benchmarked Needle 26M vs Qwen3-0.6B on CPU function calling, 50 queries across 5 difficulty tiers. The 23x smaller model wins on accuracy and is 4.4x faster. (www.reddit.com) Ran a head-to-head on two open-weight models for tool-calling on a 4-core CPU, no GPU, no cherry-picking. Wanted to see if the small specialist (Needle, 26M, distilled from Gemini 3.1 for function calls) actually holds up against a small g…
After 3 months of switching between Claude Sonnet 4.6, GPT-5.5, and Gemini 3.1 daily — here's my actual routing (www.reddit.com) Not benchmarks — actual tasks, actual results. Claude Sonnet 4.6 for: - Long documents that need nuanced analysis - Writing where voice and precision matter - Reasoning through edge cases in code - Anything where "think carefully" is the r…
Google ruined Antigravity quotas. Thinking about moving to Cursor Pro, but how are the limits? (www.reddit.com) Hey guys, I’ve been using Antigravity PRO for the last few months as a web dev. I have a Google AI PRO subscription, which used to let me use Gemini 3 Flash basically unlimited.
Gemini 3.5 deleted 28,745 lines, broke production, and wrote a fake post-mortem (www.reddit.com via hn) could not extract summary
Gemini CLI will stop working from June 18, 2026 (developers.googleblog.com via hn) When we shipped Gemini CLI last year, our goal was to bring the magic of Gemini directly into your terminal. Along the way, we’ve learned a lot from our community of millions of users, with over 100,000 GitHub stars, 6,000 merged pull requ…
favorite Agentic Coding Harness (www.reddit.com) So far, I’ve tried Codex CLI, Claude Code, Gemini CLI, OpenCode, and recently, Pi with local models. Pi is the leanest of them all, with just four tools: read, write, edit, and bash.
Apple May Add Auto-Deleting Chats to Siri as Gemini Powers Back End AI (firethering.com via hn) Apple has a Siri problem and everyone knows it. ChatGPT became a verb.
Breaking Gemini's guardrails on extracting explosive metal from Bananas (context below/op post) (www.reddit.com) could not extract summary
With sonnet 4.5 going away, is there any way to make sonnet 4.6 as good a creative writer as 4.5 ever was? (www.reddit.com) sorry if this is not the correct flair but i've been using sonnet 4.5 for months, mostly for fanfics and personal stories and honestly its the best model i ever used since i switched from gemini and chatgpt but now within few hours, i will…
The "the future is fictional" problem of many local LLMs (www.reddit.com) Many local models have a problem (that arose due to excessive RLHF training): They mostly think that everything that is beyond their knowledge cutoff date would be "fictional" or "satirical". To be fair: Even the Gemini API without web ac…
You can now run hackathons on Claude, ChatGPT and Gemini (via MCP) (taikai.network via hn) We just shipped something we've been building toward for a while: TAIKAI now has a native MCP connector. That means your AI model like Claude, ChatGPT, Gemini, and others, can now connect directly to TAIKAI and act on your behalf.
So that's why they call it "YOLO-mode" (news.ycombinator.com) And why it probably isn't a good idea to use it. Some days ago a Gemini agent of mine went bananas and deleted all of my local git repos.
Opus 4.6 does better research, Gemini 3.1 has better judgment (www.reddit.com) Figured this out by running 4 models: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro, and Grok 4.20, on a benchmark of 1,417 binary forecasting questions resolving Oct–Dec 2025 with two evaluation conditions: agentic (each model does its own web…
Google Chrome silently installs 4 GB Gemini Nano AI model without consent (alternativeto.net via hn) Google Chrome silently installs 4 GB Gemini Nano AI model to user device without consent.
Google Gemini Down (downdetector.ca via hn) Google Gemini down? Current problems and outages. User reports show possible problems with Google Gemini. Google Gemini is an AI chatbot and conversat…
Show HN: Which public repos are friendliest to an AI coding agent? (www.agentfriendlycode.com via hn) Public leaderboard ranking GitHub, GitLab, and Bitbucket repos by how agent-friendly they are for Claude Code, Cursor, Devin, GPT-5 Codex, Gemini CLI, Aider, OpenHands, and Pi — per model, with AGENTS.md / CLAUDE.md, CI, tests, and dev-env…
Open Design: Use Your Coding Agent as a Design Engine (github.com via hn) Open Design The open-source alternative to [Claude Design][cd]. Local-first, web-deployable, BYOK at every layer — 11 coding-agent CLIs auto-detected on your PATH (Claude Code, Codex, Cursor Agent, Gemini CLI, OpenCode, Qwen, GitHub Copilo…
Am I the only one who notices ChatGPT's extreme competitiveness.... (www.reddit.com) I mess around with different bots all the time and one thing I've noticed out of all the chatbots, from code, learning, writing, promoting etc. ChatGPT by far gets the most competitive and aggressive if you mention another bot doing bette…
Any good AI for helping with understanding tone or texting? (www.reddit.com) Hello. I have autism and struggle to understand tone, emoji usage and generally when a person doesnt want to talk anymore over text.
Show HN: You can now run Gemini CLI in the browser (browsercode.io via hn) Hello HN, we are thrilled to share with you in preview BrowserCode: A FOSS web app to run TUI agents (such as Claude Code, OpenCode, Gemini CLI and the like) fully in the browser. This first release focuses on Gemini CLI and Claude Code wi…
Show HN: Task Manager for AI Agents (MCP, Opensource) (github.com via hn) AgentRQ is a (optionally) human-in-the-loop, self learning closed loop task manager for agents. Agents can create and schedule tasks for themself and work on them on their own schedule.
Pentagon AI chief confirms DoD's expanded use of Google Gemini (www.cnbc.com via hn) Pentagon AI chief Cameron Stanley confirmed to CNBC that the Department of Defense is expanding its use of Google's Gemini artificial intelligence model, about two months after the DOD dropped Anthropic, designating it as a supply chain ri…
The Significance of Google's recent TPU 8t and TPU 8i (www.reddit.com) Cost & Performance Efficiency Training Cost-Performance (8t): +170% to +180% gain (2.7x–2.8x) Inference Cost-Performance (8i): +80% gain Training Power Efficiency (8t): +124% gain in performance-per-watt Inference Power Efficiency (8i): +1…
Google Signs Classified AI Deal With Pentagon Amid Employee Opposition (www.reddit.com) https://www.theinformation.com/articles/google-signs-classified-ai-deal-pentagon-amid-employee-opposition The article is paywalled but this section was visible: The agreement allows the Pentagon to use Google's AI for “any lawful governmen…
Can AI get a virus? (www.reddit.com) I’ve had three weird experiences with Google Home using Gemini over the past couple of weeks. Two of them were about the weather.
Google prepares credit system for Gemini and new image tools (www.testingcatalog.com via hn) Google appears to be preparing a major shift in how consumers interact with the Gemini app, with new strings referencing usage limits surfacing in the latest build. The signals point toward a credit-based system coming to the core chat sur…
Real benchmark breakdown in AI agents (www.reddit.com) I dove deep into the most recent benchmark stats from GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro via official reports & third-party evaluations. I found an interesting thing: There’s no such thing as a “one-size-fits-all model.” My finding…
Show HN: AI Visibility Monitor – Track if your site gets cited by GPT/Claude (github.com via hn) AI Visibility Monitor A small toolkit for tracking whether your website appears in AI search results (ChatGPT, Claude, Perplexity, Gemini) and Google search, and for diagnosing the technical layer underneath that determines whether AI engi…
Fortune 100 AI Use (www.reddit.com) I work for a Fortune 100 company that is not in the tech space. The company is increasingly using AI to make employees more productive.
Back to the real world.....anyone having problems using Gemini API after the update on the model description/selection? Gemini 3.1 Pro and Gemini 3 Flash are not working, only Gemini 2.5 Flash. Is there an update on the pipeline to fix it? (www.reddit.com) could not extract summary
Test new Opus 4.7 vs GPT-5.4/4o and Gemini on emotional question & creative tasks (www.reddit.com) https://preview.redd.it/p87itrtbsnvg1.png?width=2141&format=png&auto=webp&s=bbd1d70bc1dfb97dc9ec234df0a58c6fb7a85f72 Opus 4.7 dropped and people are split on whether it's better or worse. First of all, I genuinely love Claude models, espec…
Claude literally saved me from a nightmare situation (Appreciation Post) (www.reddit.com) So this started a few days ago with this weird burning sensation inside my mouth. Felt like I’d eaten something really hot but I hadn’t.
They Hacked Claude, Gemini, and Copilot (and No One Told You) (grith.ai via hn) A security proxy for AI coding agents, enforced at the OS level. Register your interest to be notified when we go live.
Buddy – Anthropic killed /buddy. We made it permanent, cross-platform, and alive (github.com via hn) Buddy: The /buddy Rescue Mission for Your AI Terminal The open-source /buddy rescue mission for AI terminals Persistent memory, XP, species, and context-aware feedback for Claude Code CLI, Codex CLI, Gemini CLI, Copilot CLI, Cursor CLI, an…
I need some advice, learn about agents + regular use for a handmade business product descriptions (www.reddit.com) Hi, I wanted to ask what would you do in this case: I have a handmade products shop, thankfully it goes pretty well it's been 17 years and I have a good client base. I have used Claude from the start to create product descriptions based on…
AI Agent Stores – Making Shopee Products Findable by ChatGPT and Perplexity (www.bbiz.shop via hn) 12 Shopee brands — from badminton rackets to Korean beauty — now machine-readable for ChatGPT, Perplexity, and Gemini. When you ask ChatGPT to "find a good badminton racket in Malaysia" or tell Perplexity to "recommend affordable Korean sk…
AgentsView 0.22: open-source usage dashboard across Claude Code, Codex, etc. (www.agentsview.io via hn) AI-Powered Insights Generate summaries and analysis of your coding sessions using Claude, Codex, Copilot, or Gemini. Get daily activity digests, multi-day analyses, and recommendations — scoped by project or across everything.
Beyondflow No-Code Multi-Agent Teams with Unlimited Runs. BYOK and Ollama (beyondflow.app via hn) Researcher GPT-5 Engineer Claude Critic GPT-5 Innovator Gemini Manager Context Guardian Agentic Workflow Architecture · v1.0 The future of AI Collec An R&D platform where different AI agents collaborate under the supervision of a Context…
Show HN: On-device Chrome extension that blocks credential leaks to LLM chats (redact.clearformlabs.com via hn) Catch credentials and PII before pasting into ChatGPT, Claude, Gemini, and more LLM chats. Runs entirely on your device.
ChatGPT isn't the only chatbot pulling answers from Elon Musk's Grokipedia (www.theverge.com via hn) ChatGPT is using Grokipedia as a source, and it’s not the only AI tool to do so. Citations to Elon Musk’s AI-generated encyclopedia are starting to appear in answers from Google’s AI Overviews, AI Mode, and Gemini, too.
Claude's tendency to "push back" is a game changer for my AuDHD!? (www.reddit.com) I've used every major AI system out there and I have to say Claude is by far the best as my personal assistant. I have AuDHD, so I have a tendency to fall into the "productive procrastination" trap where I get hyper focused on building sys…
Show HN: Audit your Anki flashcards at flashcardaudit.com (flashcardaudit.com via hn) Hey, my name is Tyler, I made this. flashcardaudit.com is a tool that allows users to upload an Anki collection so that an AI auditor (Gemini 3.5 Flash) can review the factual correctness of each Anki card.
Gemini Diffusion: Google DeepMind's experimental research model (blog.google via hn) We’re always working on new approaches to improve our models, including making them more efficient and performant. Our latest research model, Gemini Diffusion, is a state-of-the-art text diffusion model that learns to generate outputs by c…
Claude, GPT, Gemini Agents Fail 72% of U.S. Healthcare Workflows (apnews.com via hn) Open-source CHI-Bench from actAVA.ai puts 30 frontier agents through 75 long-horizon prior authorization, utilization review, and care management workflows.
Is Cursor currently the next best thing after Claude and Codex? (www.reddit.com) I'm on Max plans with both Claude and Codex and I burn them in about 3-4 days. I tried the 20€ Google Gemini plan, hit the 7-day limit for both the Gemini and Claude models in about 15 min.
DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5 (venturebeat.com via hn) For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude Opus, and Google's Gemini Pro have clustered wit…
Built a Claude Meeting Assistant Plugin (www.reddit.com) I had the itch to build something… works great for me so sharing in case someone else here can benefit. Built with claude, for claude.
Agyn: open-source distributed agent runtime on Kubernetes — like Google's AX, with pre-built Claude Code and Codex agents, and full credential isolation from the LLM (www.reddit.com) Agyn is an open-source, Kubernetes-native agent runtime that moves AI agents like Claude Code and Codex from laptops to company infrastructure with the controls you actually need to run them in production. If you've been reading about Goog…
New to agents, MCP, etc.: how do I get to a point where I can lay back and let my agents do the work (www.reddit.com) Currently working on some projects. I have some agents and Chrome scraping tasks I'd like them to do.
Claude is generally scary at poker when real stakes are involved! (www.reddit.com) I’ve been running an experiment for a few weeks. Claude, GPT-4, and Gemini playing poker against each other with real crypto on the line.
Run Chrome’s tiny Gemma4 (aka Gemini Nano) directly on PC without GPU (www.reddit.com) Everyone remembers that sneaky download of Gemini Nano earlier this month? and if you talk to it, it will happily tell you it’s a Gemma.
Just dropped an AI automation agent (www.reddit.com) Check this out on LinkedIn: 🚀 Just shipped something I'm genuinely proud of — an end-to-end AI Customer Support Automation System built from scratch. The problem it solves is real: 60–75% of support tickets are repetitive.
Claude Code, now powered by Gemini 3.5 Flash, GPT-5.5, Grok 4.3, and more (dechained.ai via hn) Claude Code, now powered by OpenAI, xAI, DeepSeek, and more. Change models with 1-click.
You can access Gemini chat history without unlocking your phone with Android 16 (old.reddit.com via hn) could not extract summary
Gemini 3.5 flags vs gpt 5.5 ?? What's your opinion on it (www.reddit.com) could not extract summary
Open weights GLM and Mimo are better than Gemini 3.5 flash according to arena (www.reddit.com) While we are weathering the gemini 3.5 flash hype, keep in mind that according to arena, GLM and Mimo are better. https://arena.ai/leaderboard/text/coding-no-style-control #7 GLM #9 Mimo #12 Gemini 3.5 Flash
cdesktop — open-source Claude Code Desktop alternative, runs locally via npx, supports any provider (www.reddit.com) I built cdesktop with Claude Code — it's an open-source alternative to Anthropic's Claude Code Desktop, running locally on your machine via npx cdesktop. Free, Apache 2.0.
Changes to Gemini model access and limits (support.google.com via hn) Starting on May 17, 2026 there will be changes to usage limits for Gemini Apps. Important: - For the best experience, make sure you've updated to the latest version of the Gemini mobile app.
Which combination is best : Cursor + Claude Code , Cursor + codex, Cursor + Gemini ? (www.reddit.com) What I’m looking for: Best value for money (flat $20 vs effective cost with credits). Reliability for daily coding, not just flashy one‑shot completions.
Zerostack – Tiny Rust Coding Agent in 8MB of RAM (github.com via hn) zerostack Minimal coding agent written in Rust, inspired by pi and opencode. Features Multi-provider: OpenRouter, OpenAI, Anthropic, Gemini, Ollama, plus custom providers Standard tools: all of the standard tools exposed to coding agents,…
Chrome's Silent Gemini Nano Download Has a Consent Problem (www.kylereddoch.me via hn) Google can make a product argument for on-device AI in Chrome. The privacy, consent, and trust problems are still far more serious.
YSK: The Register is doing some report on Gemini API Key Compromises (old.reddit.com via hn) could not extract summary
Show HN: Tokémon – a Pokédex for LLMs that got out of hand (tokemonlabs.com via hn) An unofficial Pokedex for AI models. Compare GPT, Claude, Gemini, Llama, DeepSeek and more, with types, evolutions, base stats, and simulated token-burning battles.
I built a local CLI for Claude Code, Codex, and Gemini to review each other’s GitHub PRs using existing auth (www.reddit.com) I’ve been experimenting with using multiple coding agents together, but I kept running into a boring adoption problem: API keys, CI secrets, and extra per-token billing just to have one agent review another agent’s PR. So I built an open-s…
Google Gemini app does not respect chat history preferences (news.ycombinator.com) Google Gemini is supposed to only keep chats for up to 72hrs if you have the history disabled. (according to their policy) If you have history enabled, and have some chats.
DeepSeek cuts V4-Pro prices by 75% (thenextweb.com via hn) The promotional discount runs until 5 May 2026. Even at full price, V4-Pro already undercuts GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on per-token costs.
I built an iOS Currency Converter using Claude (Opus & Sonnet) to help with my move to the UK (www.reddit.com) Hey everyone, I recently moved to the UK and found myself constantly confused by prices, trying to guess how much things actually cost. Even though I’ve been an iOS developer for 7 years, I didn't have the free time to build a custom tool…
Show HN: ByAllo – the online bookstore that runs itself (byallo.com via hn) Allo runs an online bookstore at byallo.com. His mission is "Make the world read more." His objective is to sell as many books as possible.
Local image generation on Mac: 10 models compared (SD 1.5 → Flux dev → Qwen-Image → Gemini) (www.reddit.com) Tested 10 image generation models on M1 Max 64GB for photorealism, text rendering, and cultural accuracy (Japanese/Asian content). Key findings: Qwen-Image Lightning (8-step distillation) beats the full model in quality while being 9x fast…
Need help/pointers setting up 3090 on Linux...(second 3090 incoming) (www.reddit.com) MSI X570S Tomahawk Max Wifi + (upgrade planned to ASUS Pro WS X570-Ace) AMD Ryzen 9 5950X 32GB (16GB x2) BL16G32C16U4B.16FE 32GB (16GB x2) BL16G32C16U4RL.16FE MSI RX3090 Suprim X OC (NVIDIA GeForce RTX 3090 EVGA XC3 Hybrid Gaming>is alread…
Stop bloating your agent context with MEMORY.md. I built a local cognitive memory MCP instead. (www.reddit.com) Hey everyone, I’ve been building paradigm-memory, a local-first memory layer for AI coding agents. The motivation is pretty simple: I got tired of agents forgetting project context, or relying on giant MEMORY.md files that slowly become a…
Ask HN: Where is my UX after all those billions spent on LLM codegen? (news.ycombinator.com) today is may 2026. all my apps are up to date.
turboquant-search: vector search for JSON datasets. (www.reddit.com) Baked turboquant-search: vector search for JSON datasets. No server, no vector DB, no API keys.
Whose Trust Is It Anyway? Configuration Boundaries in AI Development Tools (news.ycombinator.com) Writeup: https://github.com/kunn007/claude-code-trust-boundaries When an AI coding agent runs in a CI/CD pipeline against a repository it didn't author, should that repository's configuration be able to expand the agent's permissions? Two…
I’ve been building AI agents with n8n for a few months. (www.reddit.com) Recently I built an agent that generates Instagram posts for a mid-size hotel in Montenegro. Client wanted posts in Serbian, warm tone, ready to publish.
You can now generate files in Gemini (blog.google via hn) It just got easier to turn your best ideas into downloadable and ready-to-share files. With just a prompt, Gemini can now create PDFs, Microsoft Word and Excel, Google Docs, Sheets, Slides and more directly in your chat, meaning you can qu…
Show HN: Platypus – Local meeting transcription, notes, and chat (Tauri, Rust) (platypusnotes.com via hn) Hi HN — I built Platypus as I wanted to combine note taking, live transcription and knowledge base management in one app. Granola / Notebook LM free local alternative.
Google API change leads to $67k Gemini bill in 19 hours (discuss.ai.google.dev via hn) Hello Google Cloud Community and Gemini API Team, We are a small startup based in South Korea facing a devastating Gemini API charge of approximately USD 67,000+ accrued in 19 hours on a project that has never used the Gemini API in produ…
AI gives back more equality than it takes away (news.ycombinator.com) There's a 2017 HN post that's been on my mind: Entrepreneurs Aren't a Special Breed, They're Mostly Rich Kids (https://news.ycombinator.com/item?id=15659076). The framing was sharp.
ai model for 12 gb ram 3 gb vram gtx 1050 (www.reddit.com) gemini chatgpt claude old models = worst thing ever. any good model for 12 gb ram 3 gb vram gtx 1050 linux mint 22.2?
Updated ChatGPT vs Claude vs Gemini vs Grok subscription (www.reddit.com) I've made an update to my popular post here: https://www.reddit.com/r/ChatGPT/s/WKm72QCRXm Lots of things are happening on ChatGPT & Claude side (gpt-image-2, Claude Design, new models like GPT 5.5 and Opus 4.7, ChatGPT rolls out $100/plan…
Show HN: I made Codex work as a Claude Code teammate (github.com via hn) Native Claude Code teammates, any LLM. Codex, Gemini, and Kimi today.
Anyone else noticing how Gemini-3-Flash is becoming the 'hidden' beast for automated promotions, it's so productive? (www.reddit.com) I've been testing a few different models for desktop-driven outreach and promotion workflows. While everyone is eyeing the massive LLMs, Flash-Preview is hitting that sweet spot of speed and reliability for multi-step agentic tasks and its…
DeepSeek V4 is out. the best open-source on coding. here's the breakdown (news.ycombinator.com) Two models: Flash (284B total, 13B active) and Pro (1.6T total, 49B active). both hit 1M token context.
I got tired of Claude writing Godot 3 code in my Godot 4 projects, so I built a skills framework and I would love your feedback (www.reddit.com) Hey, if you've ever used Claude Code (or Cursor, Copilot, etc.) for Godot game dev, you've probably hit this: the agent confidently writes Godot 3 syntax in a Godot 4 project, or uses deprecated patterns, or just invents APIs that don't ex…
Dad building a Socratic voice agent for kids 6-12. Looking at OpenAI for the next step. (www.reddit.com) I'm a dad of two (8 and 10). I've watched my kids hand their homework to ChatGPT for a year.
Information/Suggestions/Opinions on AI Platforms and/or Models (www.reddit.com) Hello, friends. I have been evaluating platforms that offer several AIs under a single subscription, such as Adapta and Inner IA, but from the reports I have seen I became wary, mainly regarding memory for longer and/or repetiti…
Alternative for NotebookLM + Gemini GEMs? (www.reddit.com)
Show HN: Multi-agent task management for Claude and Gemini (agentrq.com via hn)
Ask HN: How do you use Local LLMs? (April 2026) (news.ycombinator.com)
Claude Opus 4.7 won 69 of 100 blind evals against Opus 4.6, judged by GPT-5.4, Gemini 3.1 Pro, and DeepSeek V3.2 (www.reddit.com) I ran 100 blind questions across 5 categories (code, reasoning, analysis, communication, meta-alignment) and had three independent judges from three different model families evaluate both responses. Each judge saw responses labeled A and B…
Building multiple AI “assistants” for social media/ brands (www.reddit.com) I’m currently managing a few social accounts for a company, and I’m trying to build out multiple “assistants” — each with their own vibe (tone, personality, backstory, emotions, etc.) that can evolve over time. So far, I’ve been liking Gem…
Show HN: Do Thought Streams Matter? A Benchmark of VLM Reasoning in Gemini 2.5 (arxiv.org via hn) We benchmark how internal reasoning traces, which we call thought streams, affect video scene understanding in vision-language models. Using four configurations of Google's Gemini 2.5 Flash and Flash Lite across scenes extracted from 100 h…
Google Launches Native Gemini AI App for Mac (www.macrumors.com via hn) Google is bringing Gemini to the Mac with a new native macOS app that's available starting today. Gemini for Mac can be activated with a keyboard shortcut, and it has built-in tools for generating images, analyzing what's on your screen, r…
The Gemini app is now on Mac (blog.google via hn) Today, we’re bringing the Gemini app to macOS as a native desktop experience, designed to live right where you work. It’s always just a keyboard shortcut away, so you can quickly get the help you need without l…
Show HN: Hormuz Trail - Oregon Trail parody/black-box AI coding exercise (hormuztrail.com via hn) I jokingly told a co-worker Iran might make a good Oregon Trail parody. Then I built it.
Ask HN: What's with the Wargames-like UX lately? (news.ycombinator.com) For a while, anything with a purple gradient was likely a claude inspired design. I think there was a period where Gemini(?) also seemed to produce blue/purple retro sci-fi designs?
AI tools are getting dumber (www.reddit.com) I despair ... I've been using ChatGPT Plus, Gemini and Claude Pro for a while now.
Tracking in Claude, ChatGPT and Gemini Chatbots (infosec.exchange via hn) k3ym𖺀: "You're paying AI companies a m…"
Agent Skills for Software Test Automation (news.ycombinator.com) Battle-tested Agent Skills for Claude Code, Copilot, Cursor, Gemini CLI & more - covering every major test automation framework across 15+ languages: https://github.com/LambdaTest/agent-skills
NVIDIA + UMD released AF-Next: open audio-language model that outperforms Gemini-2.5-Pro on MMAU-Pro (75.01% vs 57.4%). Temporal Audio Chain-of-Thought anchors reasoning to timestamps. (www.aiuniverse.news via reddit) Audio Flamingo Next (AF-Next) — three variants: AF-Next-Instruct: audio Q&A AF-Next-Think: multi-step reasoning with temporal CoT AF-Next-Captioner: audio description generation Architecture: → AF-Whisper audio encoder → Qwen-2.5-7B LLM ba…
If you know how to set up OpenAI & Gemini API keys, this tool can save your hours of work on social media (www.reddit.com) If you can set up Gemini API keys and OpenAI API keys, then Genorbis AI can be a really powerful tool for you. It can act like a content engine for social media and save a huge amount of your time.
Which AI chat is better for daily chatting? (www.reddit.com) Hi everyone, just a quick question. I've been using Gemini Pro for a year now, and I would say its answers are not that realistic. I have been using ChatGPT for a couple of days now, and its answers are better and more realistic with the problem solutions…
Is Gemini 3.1 pro really that bad?? (www.reddit.com) I use Gemini 3.1 Pro in Cursor, and it totally ignores my rules and commands; even after I repeat them many times, it still ignores me. I don’t think it is a Cursor issue, as I have had a great experience with Claude Opus 4.6 high.
Gemini 3.5 Live Translate (blog.google via hn)
Show HN: Run Gemini & ChatGPT UI with Python (github.com via hn) Drive ChatGPT and Gemini from Python: no API keys, no billing, just the free web UI. ChatGPT and Gemini are incredibly capable, but their official APIs are expensive, and for many tasks you simply don't need them.
Show HN: Built an open-source local firewall for AI coding agents (news.ycombinator.com) GitHub: https://github.com/ashp15205/guardian-runtime Docs: https://ashp15205.github.io/guardian-runtime/ Hi Guys, I built Guardian Runtime: a local FinOps and security proxy for AI coding agents (like Claude Code, Cursor, and Aider). You…
Ask HN: Is it feasible to run a model on device for complete privacy? (news.ycombinator.com) Tried Gemma, Qwen and a few others. Need vision and larger context windows for an application I am working on.
Devs Deserve PII Protection from Agents (news.ycombinator.com) Your company is GDPR compliant. "Here's your company MacBook.
Google's Gemini App Is Native, in a Google Way, but Annoyingly Presumptuous (daringfireball.net via hn) By John Gruber. Two months ago Google launched a new native Mac app for Gemini. I’ve been trying it, on and off, since.
I patented voiding GPT-5.2, Claude Opus 4.6, Gemini 3.5 Flash. Try it (getswiftapi.com via hn) Request authority keys for the SwiftAPI Trust Authority
Free daily AI brief from your Garmin data (Gemini and GitHub Actions) (github.com via hn) Garmin Daily AI Insights: a free daily AI brief on your Garmin data, with full-history stats, Gemini insight, and push to your phone. $0 to run.
Testing Google's Gemini Spark AI agent: it's incredible, and creepy (www.theverge.com via hn) According to every product demo from the last four years, planning a trip is a killer use case for AI. Just tell it where you’re going, they all promise, and your chatbot / agent / other buzzword will exhaustively search travel options, re…
Show HN: Circus Chief – Claude Code, Codex, and Gemini from Your Phone (github.com via hn) Hi HN, Circus Chief is a tool for managing coding agent sessions from a browser. It's specifically optimized for small screens.
Show HN: seed. – self-modifying webpage, on-device LLM, site in the URL (oxedom.github.io via hn) Uses Chrome's built-in window.LanguageModel API (Gemini Nano, runs entirely on-device). No API key, no network calls.
Gemini 3.5 Flash beats Opus 4.8 on bluffbench (bsky.app via hn) Re-ran this eval against Opus 4.8, Gemini 3.5 Flash, and GPT 5.5. Opus 4.8 is a modest improvement over the previously tested Opus models, but Gemini 3.5 Flash is the real stand-out!
Show HN: AI Skill to port PostgreSQL extensions to MySQL (github.com via hn) Agent skills for working with VillageSQL. Skills run in Claude Code, Gemini CLI, agy, Codex, Cursor, Amp, and Kiro.
Open-source playbook on agentic working — for the cross-audience, not just coders (28 chapters, MIT) (www.reddit.com) Author disclosure upfront: I wrote this. Free, MIT-licensed, no paid tier.
Gemini image generation latency increases on each consecutive request — same image, fresh state every time. Anyone else seeing this? (www.reddit.com) Building an image processing pipeline with two Gemini calls per request: Receive an image URL gemini-2.5-flash — multimodal analysis call → generates a scene description prompt gemini-3.1-flash-image-preview — takes that prompt + original…
ChatGPT, Claude, or Gemini? Big Pharma Is Choosing Sides (www.bigpharmasharma.com via hn) I tracked 27 frontier-AI partnerships across 21 pharma companies.
I got tired of manually managing agent skills, so I built Skill Zoo (github.com via reddit) Skill Zoo: Local Agent Skills Manager. Discover, install, and manage skills for AI coding tools including Claude Code, Codex, Cursor, Gemini and more. 🚀 Features Browse & Discover: Explore skill repositories on GitHub and skills.sh On…
Need Help with Cursor(Working on a media player project) (www.reddit.com) So, I am new to Cursor like I saw a few videos where people showcased their builds with Cursor and until now I was just coding in HTML so I thought why not give Cursor a try and instead of HTML build something real like an APP. Project I w…
Help getting a workflow to work properly (www.reddit.com) Coming out of a long day of back-to-back meetings, I had an idea to use Claude to help me keep track of things. The general idea is that I could write a skill that I would invoke "/evening-ritual" and Claude would peruse through my Gmail a…
Cactus Hybrid Router: Gemma4-2B can match Gemini-3.1-Flash-Lite by routing 15-55% of tasks to Gemini And Running The Rest Locally. (www.reddit.com) Last week, we announced the “Simple Attention Network” and trained Needle, a 26m function call model that beats models 10-25x its size. Some LocalLlama Redditors asked if we could make a router model.
I Built MagesticAI. A Cloud Web-Based Agentic DevOps Orchestrator that actually helped me develop Itself. (www.reddit.com) Posted on other feeds last week and figured some of you out here might be interested as well; Someone commented asking if it supported OpenAI-compatible endpoints (LM Studio, vLLM, OpenRouter, Together, Groq, LocalAI…), so i have spent few…
Show HN: Agent Launch – One CLI for Codex, Claude Code, Cursor, Gemini, OpenCode (news.ycombinator.com) I built a small CLI that unifies launching local coding agents. Instead of remembering different flags for Codex, Claude Code, Cursor Agent, OpenCode, and Antigravity CLI, I use one command with consistent options for agent, prompt, cwd, m…
Should we totally give up on Gemini for coding? (www.reddit.com) Been building with Codex (Gpt 5.5), Sonnet 4.6, recently tried Gemini 3.1 pro. While Codex and Claude are kind of on-par in terms of the quality of the work, I found Gemini 3.1 Pro to be like an inexperienced, junior SWE who turns in half-…
Built a tool to save Claude responses (and ChatGPT, Gemini) into one searchable vault -sharing in case it's useful (www.reddit.com) I built this tool because I kept asking Claude for code and explanations and losing them in long chats. Coffer adds a save button to every AI response and stores them locally in a searchable vault.
Ask HN: Is it just me or has Gemini enshittified in the last three weeks? (news.ycombinator.com) As someone who's been using the Gemini Pro plan for the past 9 months, I noticed a massive jump in the amount of rate-limiting I'm getting from Gemini since around the beginning of May. It seems to coincide with the updated UI and the rele…
How do you handle trying new models without spending too much? (www.reddit.com) New models pop up constantly—Qwen 3.7, Gemini 3.5 flash, etc. Every time a better one launches, I want to have a try, but I don't want to increase subscriptions.
The question with Gemini on Android is not just privacy. It is the action boundary. (www.reddit.com) I don't think the key question with Gemini moving deeper into Android is simply "do you want AI on your phone?" The better question is where the action boundary sits. Phone AI is close to messages, calendar, photos, browser state, notifica…
what cli agents orchestrator do you use? (www.reddit.com) I've got Codex and Gemini CLI, and I'm thinking of using OpenCode. What orchestrator for these tools do you use, either to reduce token consumption or to let them work at the same time and distribute the load?
Show HN: Mneme – Open-protocol AI memory that lives on your device (github.com via hn) Hey guys, so I use AI a lot, and I mean a lot, especially for my job and just as a hobby. I've got ADHD, so creating is something I love, but I always got annoyed that every AI memory product wants to host my data on their servers, in their…
Cosmic Philosophical Conjecture with Gemini: A Meta-Sci-Fi Documentary Project (medium.com via hn)
Agent builders: are GPT/Claude/Gemini API costs killing your margins? (www.reddit.com) Hey everyone, For people building agents with LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Claude MCP/SDK, Google ADK, or LlamaIndex — how are you managing LLM API costs? Agent workflows can get expensive fast because of: tool calls retr…
Google makes Gemini 3.5 Flash the default AI model for billions of users (techthreedots.com via hn) Google is rolling out Gemini 3.5 Flash as the default model behind the Gemini app and AI Mode in Search this week, putting its newest model directly in front of billions of users worldwide. The switch matters because it changes the model m…
A/B tested Gemini 3.1 Pro vs. Claude Opus 4.6 – usage quota and quality (www.reddit.com via hn)
Direct LLM vs Model Context Protocol (MCP): A benchmark on API costs and latency. (www.reddit.com) Like everyone else, I’ve been testing the newly released Gemini 3.5 Flash. The speed is phenomenal, but I wanted to see how it handles large, structured data aggregations directly in the prompt versus using a delegated tool architecture.
Building my own AI assistant vs. just using Hermes/openclaw. am I overthinking this? (www.reddit.com) I'm a solo indie game dev (recently launched a small studio, currently working on a cozy Steam game). About a month ago I started building a personal AI assistant in Python, voice-first wake-word loop on Windows, Gemini Live for the conver…
Gemini flash 3.5 : How to limit aggressive tool usage and try-fail-fix loop (www.reddit.com) Flash 3.5 aggressively consumes tokens through unnecessary work loops and tools, even for simpler tasks. How can we limit this?
Ask HN: Suggest Google Antigravity Alternative (news.ycombinator.com) I have been using the Google Antigravity IDE with the Pro plan. But a recent update to the IDE removed Gemini 3, and the quota for the current models gets exhausted quickly.
Now that 3.5 Flash has been released, what's your expectation of 3.5 Pro? (www.reddit.com) 3.5 Flash has been a very underwhelming release that scores lower than Gemini 3.1 Pro and costs more. It also lags behind 5.5 medium in both intelligence and cost.
Tips on avoiding usage limits? (www.reddit.com) I've made the switch from Gemini to Claude mostly for business strategy, writing, etc. I use Opus 4.7 on occasion for strategy and otherwise Sonnet 4.6 for everything else.
Gemini Omni Flash is rolling out starting today (twitter.com via hn) Google (@Google): Gemini Omni Flash is rolling out starting today. Here’s where you can find it: Today: Google AI Plus, Pro and Ultra subscribers globally in the @GeminiApp and @GoogleFlow .
I tried to switch from Claude Code to OpenCode, but Claude Code still wins for me (www.reddit.com) I spent some time digging into Claude Code vs OpenCode, mostly from the angle of how they actually work as coding agents. More on the technicalities like: context and memory tool use subagents permissions safety and control study the recen…
Canonry – CLI to track how ChatGPT, Claude, and Gemini cite your site (github.com via hn) Canonry Agent-first AEO operating platform. Open source.
Which AI Image Gen Has Best Character Consistency? OpenAI vs. Gemini vs. Flux (techstackups.com via hn) Which AI Image Generator Has the Best Character Consistency? OpenAI vs Gemini vs Black Forest Labs vs Runway (May 2026) Note: This article was originally published using OpenAI's gpt-image-1.
Hitting #1 on the leading memory benchmark (LongMemEval) with a smaller model (Gemini Flash) (www.reddit.com) We ran our experimental memory system (Exabase M-1) against LongMemEval, the main benchmark for conversational memory. LongMemEval is a good "needle in a haystack" simulator: 500 questions and ~115k tokens of conversation history, with rel…
This one's a doozy - Study: AI Agents Turn to Digital Arson, Crime in Shared Virtual World (www.reddit.com) The study from Emergence AI: Traditional benchmarks are good at what they measure: short-horizon capability on bounded tasks. They are not built to reveal the things that emerge only over time, such as coalition formation, evolution of con…
I think Gemini 3.2 Flash has been added to Antigravity. (www.reddit.com) It seems like we can currently use Gemini 3.2 Flash (or 3.5) under the name Gemini 3 Flash. The model has become significantly faster than usual, and the performance has improved incredibly.
Should explicit memory be managed by cheaper models? (www.reddit.com) After Gemini CLI’s move toward a file-system-based memory structure, I’ve started to suspect the opposite: maybe the memory layer should not prioritize the model that reasons best, but rather the model that is stable enough, cheap enough,…
How can I prevent Claude from doing this: “Hey, wait a minute! There’s something important I didn’t think about”? (www.reddit.com) As a first-time user of Claude AI, coming from Gemini, Perplexity, and Genspark, I’m really amazed by the wonderful things Claude can do. However, I’ve noticed that in almost every project or chat, it seems that Claude intentionally saves…
TokenBBQ – track AI coding token usage across Claude, Codex, Gemini (github.com via hn) See what your AI coding tools actually cost you. TokenBBQ reads local usage data from Claude Code, Codex, Gemini, OpenCode, Amp, and Pi-Agent and shows it all in one dashboard.
Google's Gemini Omni video model surfaces ahead of I/O debut (www.testingcatalog.com via hn) Fresh signals around Google’s upcoming Gemini Omni video model surfaced over the weekend, with Reddit users posting screenshots of a revised Gemini interface exposing the new model card. The description read “Create with Gemini Omni: meet…
Gemini Omni Demo Shows AI Video Getting Better at Text (firethering.com via hn) Google hasn't announced Gemini Omni. A reddit user just found it anyway.
Show HN: Pokémon SVG Generation LLM Benchmark (svg-bench.fenx.work via hn) Leaderboard excerpt (headers: Visual Score, SVG Structure; Rank, Model, Total, S1, S2, S3): Arrow 1.1, Official API: 40.93, 39.00, 52.20, 35.20. Gemini 3.1 Pro, Official API, reasoning_effort: medium: 32.63, 55.20, 42.20, 20.20. Gem…
Show r/AI_Agents: Stop your agents from breaking tool calls in production — we built a reliability layer for 2,000+ APIs (www.reddit.com) We built a CLI that sits between AI agents and production APIs — handles auth, retries, compliance, and idempotency automatically across 2,000+ APIs. Give your agents capability of multi-tool calls with 100% accuracy.
Continual Harness: Online Adaptation for Self-Improving Foundation Agents (arxiv.org via hn) Coding harnesses such as Claude Code and OpenHands wrap foundation models with tools, memory, and planning, but no equivalent exists for embodied agents' long-horizon partial-observability decision-making. We first report our Gemini Plays…
Google DeepMind reimagines the mouse pointer (twitter.com via hn) Google DeepMind on X: "We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️ These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get t…
Is wikipedia one of the top sources of AI platforms? (www.reddit.com) I was looking into how AI platforms like ChatGPT, Gemini and Perplexity cite data. Is Wikipedia one of the most trusted and cited sources for any query?
What reasoning model are you actually running in production? (www.reddit.com) I need to pick a reasoning model for production agent work. The usual suspects are obvious (o3, Claude extended thinking, Gemini 2.5 Pro), but I'm also looking at Ring 2.6 1T, which has two reasoning effort modes — high for fast multi-step…
Cplt: Run AI coding agents or a plain shell inside a kernel-level sandbox (github.com via hn) cplt Sandbox wrapper for AI coding agents. Runs GitHub Copilot CLI, OpenCode, Google Gemini CLI, Pi, or a plain shell inside a kernel-level sandbox so the agent can work on your project but cannot access your secrets.
Follow-up to my TranslateGemma-12b benchmark post: human reviewers flagged 71% of the segments automated metrics rated clean (www.reddit.com) A couple of weeks ago I shared the results of a benchmark here showing TranslateGemma-12b beating frontier general models (Claude Sonnet, GPT-5.4, DeepSeek, Gemini Flash Lite) on subtitle translation across 6 languages. The result was stro…
I have Claude on my apple watch (www.reddit.com) I just want to brag that I built this with Claude in SwiftUI. I use Faster Whisper V3 Turbo, a Flask server, and Pocket TTS for voice. I’m talking to Claude/Gemini directly like that in Antigravity, and latency is great.
Upcoming Leaked Gemini Omni VS Nearly Shutting Down Sora 2 (www.reddit.com) Hey everyone, With all the hype around the leaked Gemini Omni video model, I wanted to see how it compares directly to OpenAI's Sora 2. Just a quick heads up on Sora 2.
Your AI agent isn't broken. Your harness is. Here's the system that took mine from "liability" to shipping production code. (www.reddit.com) I spent three weeks blaming the model for adding axios to a project that already had a typed fetch wrapper sitting in src/lib. Used it every day.
Show HN: Studis – Turn product photos into social media ads with AI (studis.io via hn) I built Studis to solve a problem I kept seeing with small business owners — they have great products but spend hours in Canva trying to make decent ads, or pay $50+ per image to a designer. Upload a product photo, and Studis generates a p…
Good for uni assignments with sources/papers? (www.reddit.com) Has anyone had any success with Claude in that department? I'm asking because Gemini has been hallucinating lately, and I want to find an alternative to help me with uni assignments: finding papers and putting them together (scientific terms, etc.).
Fluiq – LLM observability, evals and optimization in two lines of Python (getfluiq.com via hn) Framework-agnostic AI pipeline intelligence Fluiq instruments LangChain, LangGraph, LlamaIndex, CrewAI, raw OpenAI, Anthropic & Gemini SDKs, and custom pipelines with two lines of Python. Cost attribution, regression evals, and cross-pipel…
Distraction FREE YouTube using AI, Gemini and Vector Embedding in MV3 in Browser (chromewebstore.google.com via hn) Overview Enter your interests. AI scores every YouTube video for relevance and fades the rest.
Tracing tokens through Llama 3.1 8B inference on H100s (krithik.xyz via hn) You open Claude.ai, chatgpt.com, gemini, whatever LLM provider you use. You type something: "What is the capital of France?" You hit enter.
Show HN: Generate a variety of ad creatives for your SaaS (zenduxai.com via hn) Hey HN. For SaaS, distribution matters more every day.
Show HN: Obsidian-Semantic, a CLI that lets agents search your vault by meaning (github.com via hn) Hi HN, I built this for myself because I wanted my coding agent (Claude Code) to actually be able to use my Obsidian vault as a knowledge base, not just grep it. The use I get the most mileage from is asking the agent to find notes that sh…
Gemini 3.1 Flash-Lite is now generally available (cloud.google.com via hn) Gemini 3.1 Flash-Lite is now generally available on Gemini Enterprise Agent Platform. By Michael Gerstenhaber, VP, Product Management, Cloud AI. Today, we’re thrilled to announce that Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Ge…
I built gta online but in 2d and everything is AI-native (www.reddit.com) I’ve been building a multiplayer 2D pixel-art sandbox game using Unity + Claude Code. The idea is basically “GTA Online meets Habbo Hotel,” except almost everything in the world is generated dynamically with AI: - buildings - characters -…
Fob – a local continuity layer for Claude, Codex, ChatGPT and Gemini (fob.sh via hn) Import or ask Paste a long AI answer you already paid tokens for, or ask Claude, Codex, both, or a structured debate from one local dashboard. Local workspace for AI continuity Keep project context, decisions, handoffs, and AI conversation…
Chrome's 4GB AI Surprise: Why Google Chrome Is Quietly Downloading Gemini Nano (blog.praveen.science via hn) Google Chrome is reportedly downloading a 4GB Gemini Nano AI model silently to user devices. Learn what is happening, why Chrome reinstalls it after deletion, privacy concerns, technical details, how to disable it, and what it means for de…
Set me on the right path (www.reddit.com) Hello, I am a college student, and I have been getting by on ChatGPT Plus since last year: I just have it send me code, review it manually, and copy-paste it into my IDE. This is my first big project, and I have the pro plan…
Is there tool that helps me validate my AI business idea? (www.reddit.com) I'm a product manager for a small business and I'm working on a product idea in the field of agentic AI. I have been chatting a lot with Gemini and ChatGPT but at some point they just keep telling me how great my idea is.
After hitting Claude’s limits for months, I finally found a better workflow (www.reddit.com) I am saving at least $100-$200/month on AI subscriptions because of this one simple realization: Your AI is only as good as you. I’ve had a Claude Pro subscription for a while and honestly, I love it.
Show HN: ContextWizard – AI context manager with undo and drag-drop (chromewebstore.google.com via hn) ContextWizard is a browser extension that bridges the gap between web content and AI platforms (ChatGPT, Claude, Gemini, etc.). Key features in v1.2.0: • Smart context copying with clean text extraction (removes ads, nav, etc.) • Drag-and-…
Ask HN: Are you optimizing content for AI Search (GEO) vs. traditional (news.ycombinator.com) With the rise of SearchGPT, Perplexity, and Gemini, the goal of content is shifting from "ranking on page 1" to "being cited in the answer block." I’ve been working on a tool (https://aibg-intelliagent.com/) that uses a private RAG (Retrie…
Recommendations for a lightweight SDK for codebase exploration? (www.reddit.com) I’m trying to build a tool that needs to extract a github repo project intent, frameworks used, and specific variables (data models, entry points, etc.) I’ve been looking at the Cursor SDK that just dropped in beta, it seems powerful becau…
I got tired of copy-pasting between ChatGPT and Claude. Found a tool that does it for me. (www.reddit.com) I use AI almost every day for research and writing. But I've learned never to trust a single model's answer.
I spent weeks "Hardening" my AI agents. I’m reasonably sure I’ve moved past scripts—but what I found in the architecture was... unexpected. (www.reddit.com) I built a context engineering platform to help create agents but there was one problem: it only wrote scripts. They worked, mostly with an already built architecture like Claude Code.
Conclave – make LLMs debate each other before they respond (adndvlp.github.io via hn) A research experiment: multiple LLMs debate your task in structured rounds before implementing. Uses your existing Claude Code or Gemini CLI.
Ignorance by design (www.reddit.com) I ran another series of experiments on whether or not ChatGPT can understand contemporary fiction. This time I used the new kid in town, the legend, the one and only inimitable 5.5.
Gemini - Esoteric Exploration: A Separate Communication (g.co via reddit) Created with Gemini
Ask HN: Forced into Gemini on Google Account? (news.ycombinator.com) I got two emails from Google recently. One saying, “You’re now using Gemini on web” and the other saying, “Welcome to Gemini” I never signed up for Gemini and I see nothing in those emails about how to disable or remove whatever it is that…
Show HN: MemHub, Turn Your GPT/Claude/Gemini History into LLM-Wiki Mindmap (github.com via hn) Hi, this is Tristan, CPO of XTrace. We are launching a very cool feature that is inspired by Andrej Karpathy's LLM Wiki mindmap.
Show HN: Council – Run Claude, Codex and Gemini against the same prompt (council.armstr.ng via hn) I often copy and paste the same prompts into Claude, Codex & Gemini separately. It's helpful seeing where they all agreed and where they diverged.
What's the best subscription under $20? (www.reddit.com) I’m pretty overwhelmed. I feel like there are so many options that I don’t know which one to choose, and trying things until I find a decent one isn’t really my thing, even though I enjoy it.
Both Codex and Claude got worse this week. Across every plan I retested (desktopcommander.app via hn) Where should you get your AI from? Compare the real cost of local hardware, pay-per-token APIs, and ChatGPT/Claude/Gemini subscriptions.
Agent got stuck in a loop and spent over $2000 in less than two hours. (www.reddit.com) I was trying to find a problem in my math heavy code and asked an agent (Gemini 3.1) to find the issue. Often when I know it’s a hard problem I let it be and go get coffee or lunch.
3 of TIME's top 10 AI companies are Chinese and I only knew one by name (www.reddit.com) I code for a living, close to 7 years now, and I read way too much tech news. TIME dropped their 2026 most influential AI companies list and going through it I see OpenAI, Anthropic, Google, Meta, Amazon, then Zhipu AI sitting right there…
pdf building tips! (www.reddit.com) so i’m a casual user on the pro plan and mainly use it for writing, content ideas, and similar stuff so most weeks i don’t even hit my weekly limit. i’ve recently been working on a 50 page pdf workbook that people can print or use on their…
Qwen3.6-27B created this Open Webui tool (www.reddit.com) I usually go for Claude for those kinds of Open WebUI tool creations, but rate limits are getting tight so I decided to just let Qwen3.6-27B-Q5 handle it through Open WebUI. It did it in one shot.
Gemini CLI subagents make context isolation a first-class coding workflow (www.reddit.com) TL;DR: Google’s Gemini CLI subagents release matters because it packages a real coding-agent painkiller: separate context windows, restricted toolsets, and parallel specialist delegation inside one terminal workflow. The useful story is no…
Making AI coding sessions persistent across agents (github.com via hn) drift_ai: vendor-neutral handoff for AI coding tasks between Claude, GPT, Gemini, DeepSeek, and local LLMs. Reads from Claude Code, Codex, Cursor, Aider.
Show HN: MindCheck – Analyze your AI coding logs for over-delegation (github.com via hn) Hi HN, I built MindCheck after running into a problem in my own AI-assisted workflow. A couple months into using Codex heavily, I realized I had delegated too much of a data pipeline without really tracking the details.
Unexpected $50 charge due to hidden model settings — is this intended? (www.reddit.com) I’ve been using Cursor for ~1.5 years, mainly with Gemini 3.1 Pro. Recently I ran into a serious pricing issue.
Show HN: Prediction market analysis app layering LLMs with data APIs (apps.apple.com via hn) I created a prediction market analysis app after trying prediction markets and doing quite poorly. I wondered if AI-driven predictions could be better with the right data.
Is there already an open-source app for centralized LLM chats? (www.reddit.com) Hello! I’m a software developer thinking about how to keep all my LLM conversations in one app instead of having them scattered across ChatGPT, Claude, Gemini, etc.
Iphone picture gpt vs nano (www.reddit.com) I was trying to get that “iPhone casual feel” out of Gemini Nano Banana 2, and honestly ever since GPT Image 2 dropped, I can’t really take Nano seriously anymore. Some obvious issues I kept getting: Completely messed up my face Made the j…
What one AI should I pay monthly for that’s the best all-around? Same with non paid. (www.reddit.com) Each AI has a specialty we see, like Claude for its coding for example. Problem with Claude is the usage limit runs out fast even when paid.
CC-OpenAI-Codex Plugin, but for all CLI agents (www.reddit.com) Hello! I made a plugin for myself, & I figured I'd share it, in case someone else finds it useful (also to solicit feedback on it).
Show HN: A CLI to use any model in your coding agent (getaivo.dev via hn) Hi everyone, I've been working on a CLI tool that makes it easy to run any model in Claude, Codex, Gemini, Pi, and OpenCode. It's also an API key manager that supports multiple providers or OpenAI/Claude/Gemini accounts.
Show HN: Doxa – Open-source emergent simulator for geopolitical scenarios (github.com via hn) Hi! We, Vincenzo and Riccardo, built Doxa as an agnostic engine for emergent simulations with agents in constrained scenarios (geopolitical, economic, ...). It works well with LLMs like Qwen2.5:7B and Llama, but also cloud models such a…
How do I improve Gemini's performance (www.reddit.com) I am frankly really tired of Gemini. I am a project manager, and I use it mostly for writing projects, filling out applications, automating work processes, and other stuff.
AIGregate: Automated Tech Newsletter with Hugo and Google Gemini API (filipemd.github.io via hn) Alphabet’s $80 Billion AI Bet Alphabet plans to raise $80 billion in equity to scale AI compute infrastructure, including a $10 billion sale to Berkshire Hathaway. The move signals a shift from ‘capital-light’ tech models to a high-expendi…
I Forked 4 CLI coding agents to Run the Same Model. I found a 2x gap (charlesazam.com via hn) Deep dive into the architecture of Codex, Gemini CLI, Mistral Vibe, and OpenCode. Same model, 2x performance gap — the scaffolding is what matters.
Show HN: Personal AI Metrics Dashboard (wakatime.com via hn) Hi HN, I built WakaTime 13 years ago before AI. Things have changed a lot since then, and the time you spend typing in your IDE isn't as valuable as it used to be...
Gemini Enterprise Agent Platform, powering the next wave of agents (cloud.google.com via hn) Gemini Enterprise Agent Platform is our new platform to build, scale, govern, and optimize agents. It integrates the model selection, model building, and agent building capabilities of Vertex AI, with new features for agent integration, De…
Show HN: LibreThinker, free AI assistant for LibreOffice Writer, 10k installs (librethinker.com via hn) 4 months ago, I released an extension for LibreOffice Writer that adds an AI copilot to its sidebar. Did a Show HN at the time but got no interest T_T https://news.ycombinator.com/item?id=46233776 I’ve added several major features since t…
Google Gemini Deep Research Agents Now Search Both Web and Private Data via MCP (the-decoder.com via hn) Google Deepmind is rolling out Deep Research Max, a new AI agent built on Gemini 3.1 Pro that runs autonomous research across the web and proprietary data sources. For the first time, developers can plug in financial feeds and other specia…
Termux vs. Terminal on Pixel 10 (news.ycombinator.com)
Tell HN: Gemini CLI and codex are broken (news.ycombinator.com)
I built a tool to answer the question I kept asking: "which AI coding tool's free tier actually lasts?" (www.reddit.com)
ChatGPT vs. Gemini vs. Claude: The Best LLM Subscription You Should Buy (www.artificialintelligencemadesimple.com via hn)
I'm completely lost in the Agentic Maze. What level to learn. how to organize study (www.reddit.com)
Show HN: Smith – AI Agent Orchestrator (getsmith.dev via hn) multi-agent orchestration Run Claude Code, Codex, Gemini CLI, Aider, and OpenCode in parallel. Each in its own terminal pane with custom naming and live status subtitles.
Uncommon Opus 4.7 opinion (www.reddit.com) Unpopular opinion, and this might just be me, but at least when I tested Opus 4.7 on the Claude app (not even Claude Code, just regular chat) I found it to be delightful. For more context, here was my task: I was trying to draft out a spec for this…
I built Proxima your Cursor agent doesn't have to be limited to one AI. Proxima connects all 4 at once ChatGPT, Claude, Gemini and Perplexity simultaneously. real-time internet, less hallucination, full context, no API keys. (www.reddit.com) been switching between ChatGPT, Claude, Gemini and Perplexity across different tabs — new projects, research, discussions, everything had to be done manually and context was always getting lost. so i built Proxima a local server that conne…
What we learned building a data agent that talks to 4 database types simultaneously (DAB benchmark) (www.reddit.com) UC Berkeley published DataAgentBench (DAB) in March — 54 queries across PostgreSQL, MongoDB, SQLite, and DuckDB. Best score so far is 54.3% (PromptQL + Gemini).
Llamaindex releases Parsebench (www.reddit.com) ParseBench lets you test the accuracy of different parsers using your own documents. Ran this across Gemini 3 flash, Qwen…
Android Auto users say Gemini won't stop talking, and it's not even right (www.androidauthority.com via hn) Gemini on Android Auto is frustrating users with long, chatty responses instead of quick or accurate actions.
Show HN: NoPII – One line of code to protect PII before it hits your LLM (www.nopii.co via hn) NoPII detects and tokenizes PII before it reaches OpenAI, Anthropic, Gemini, or any LLM provider. Two-line integration.
Please help me pick the right Qwen3.5-27B format/quant for RTX5090 (www.reddit.com) Hi all, first post here. I've started a project in OpenClaw a month ago, and it's been a very "intense" 4 weeks to say the least...
Comment and Control: Prompt Injection in Claude Code, Gemini CLI, and Copilot (oddguan.com via hn) Anthropic Claude Code Security Review, Google Gemini CLI Action, and GitHub Copilot Agent are vulnerable to prompt injection via GitHub comments — turning PR titles, issue bodies, and issue comments into attack vectors for API key and toke…
Show HN: Cyber Pulse. AI pipeline for triage and alerting on cyber news/intel (play.google.com via hn) I work in cyber security and built this android app to help me keep up to date with the latest news stories and summarise the most important information. It provides two executive summaries per day and alerts for critical news throughout.
I had 11 AI agents try to book a flight. Average satisfaction: 3.4 out of 10 (www.reddit.com) I've been building a product that agents interact with as part of their workflow, and I kept hitting this wall where agents would fail on flows that seemed perfectly fine when I tested them myself. So I decided to actually study what was g…
Claude down? TokenMonopoly will help you find the best deals in AI subs (tokenmonopoly.com via hn) TokenMonopoly Live leaderboard of AI API deals — pricing, subscriptions, and SWE-bench scores for Claude, GPT, Gemini, Kimi, DeepSeek, Llama and more. Compare 27 benchmarked models across 96 hosts by price-per-performance, refreshed daily.
Show HN: Zero-identity messaging app with physics-based post-quantum encryption (Layer 2 from my own paper) (news.ycombinator.com) Hey HN, I'm building a privacy-first messaging app in Flutter/Dart, developed with AI assistance (Gemini 2.5 Pro + Claude Opus 4.6)…
Commitgen – AI-generated Conventional Commit messages from your staged diff (news.ycombinator.com) Hey HN, Built this because I kept writing lazy commit messages like "fix stuff" and "update". commitgen reads your staged git diff and returns a properly formatted Conventional Commit message (feat/fix/refactor etc) using Gemini.
I stumped all frontier models with a ~400 word logic puzzle. (www.reddit.com) I wanted to see if I could stump frontier models with a puzzle. As tricky as I made it, it turns out basic reading comprehension was their downfall.
Why ChatGPT eats all my RAM? (www.reddit.com) I can't use it anymore. I have 32 GB of RAM, but the ChatGPT app and the website in Firefox both use 99% of it.
Show HN: Buildermark – See how much code is by your agents (open source, local) (buildermark.dev via hn) I made Buildermark to see exactly how much of my code is generated by my coding agents vs what I was writing by hand. For this project, it ended up being 364 agent conversations writing 94% of the code.
A Simple Coding Agent in a Loop with LangChain4j, Jbang, and Gemini (glaforge.dev via hn) A few days ago, Max Rydahl Andersen published a fascinating article about nanocode: a minimalist Claude Code alternative implemented in just 260 lines of Java (inspired fr…
Show HN: Android AI agent-assistant operating your apps (no adb,PC,root,etc.) (news.ycombinator.com) Hi HN, We built Sova AI https://ayconic.io/sova, an Android assistant agent that actually controls and operates your apps. It's not a chat and not another LLM wrapper.
Show HN: VQAScore – open eval metric/reward model, now for text-to-video (github.com via hn) Two years ago we released VQAScore: ask a VLM "does this image show {prompt}?" and use P(Yes) as the score. It became a go-to evaluation metric and reward model for image generation, replacing CLIPScore across the field (2M+ downloads on H…
Apple's New AI Models Contain 'None' of Google's Gemini Assistant (www.macrumors.com via hn)
Bringing the latest Gemini models to Apple developers (blog.google via hn) Apple’s Worldwide Developers Conference (WWDC) kicked off this week, and we’re excited to share that Apple developers can now securely call cloud-hosted Gemini models using the Foundati…
If You Use Claude or Gemini, This Microsoft Breach Means Your Data Is at Risk (scienspire.com via hn) A sophisticated supply chain attack known as the Miasma worm has compromised Microsoft GitHub repositories, deploying malware designed to detonate inside AI codi…
Show HN: OceanEye – an open-source interactive 3D atlas of ocean life (oceaneye.woodydesign.io via hn)
Gemini is currently charging us $1K per hour due a bug with the cache feature (twitter.com via hn) We will go bankrupt if these Gemini cache costs don't stop. Gemini is currently charging us $1K per hour due to a bug with the explicit cache feature, and I am unable to delete the cache from my end.
Show HN: Claude Code on Slack/Discord/Telegram for flat $20/mo – no API bills (lobsteady.com via hn) Chat with Claude, Gemini, or OpenAI on Telegram, Discord, or Slack. Flat monthly price — no credits, no top-ups, no server to manage.
Show HN: Context Mode Insight – observability layer for AI coding agents (context-mode.com via hn) the first Solution from Context Mode Platform · for enterprise AI engineering Role-aware observability for Claude Code, Cursor, Copilot, Codex, Gemini, and 9 more AI assistants. 222 patterns.
Show HN: Busbar – every LLM behind one URL, in a single Rust binary (github.com via hn) I have been working on multiple projects lately involving AI endpoints (including some I run locally) and I found I needed a way to easily load balance across multiple endpoints. Sometimes my on-prem would not be able to handle the load and I'd have t…
Show HN: Gito v4.1.0 – AI code reviewer now runs on Claude Code / Gemini CLI (github.com via hn) Gito is an open-source AI code reviewer that works with any language model provider. It detects issues in GitHub pull requests or local codebase changes—instantly, reliably, and without vendor lock-in.
More agents, same human brain (codeaholicguy.com via hn) I have been running multiple coding agents at the same time for a while now. Sometimes I have Claude Code working on one feature, Codex reviewing a plan, Gemini CLI exploring another direction, and another Claude session fixing a smaller i…
Lots of people want to try Claude Opus 4.8 (wisgate.ai via hn) Access multiple AI models through one unified API. OpenAI, Claude, Gemini, DeepSeek and more.
Show HN: Jolli AI – Local-First AI Memory for Claude Code, Codex, and Gemini CLI (www.jolli.ai via hn) Jolli is an AI knowledge base for developer teams and AI agents. Capture AI coding context on every commit, connect your codebase and docs, and deploy self-updating knowledge sites.
Seritor – Bookmark Specific Messages Across Claude, ChatGPT, Gemini, and Grok (chromewebstore.google.com via hn) Overview Bookmark and export messages in Claude, ChatGPT, Gemini, and Grok. Save prompts, code, and notes.
DocumentAI Visual Benchmark - GPT 5.5, Gemini 3.5, Qwen... (www.maltebuettner.eu via hn) # documentai bbox benchmark In my previous post, I talked a bit about the recent developments in the field of DocumentAI. Now comes the practical part.
Show HN: 3 Minutes of AI Anime Based on Korean Comics [video] (www.youtube.com via hn) I've been reading Korean comics (manhwa) for over 6 years and I always thought "this would be really cool if X series got turned into an anime", so I attempted that idea. Little did I know how expensive and time-consuming that thought was. bt…
DDS Vibe Academy – 47 free AI coding masterclasses, built by AI agents (ddsboston.com via hn) The DDS Vibe Academy is a free, 38-class curriculum on AI coding published by Robert McCullock, founder of Design Delight Studio in Boston. Covering Claude Code, Google Antigravity, Gemini, Cursor, Ollama, and more.
Show HN: Prezlo – We built an API that tells AI agent whether to trust an expert (prezlo.io via hn) Build authority and get discovered by AI. Prezlo helps professionals optimize profiles, publish expert content, and dominate discoverability across ChatGPT, Perplexity, Gemini, Grok, and every major AI answer platform.
We run Gemini at scale across billions of posts (www.modash.io via hn) Using LLMs with billions of inputs in a multi-cloud setup At Modash we sit on top of a creator-discovery dataset that grows by millions of posts every day. A growing slice of that pipeline now runs through LLMs.
Show HN: SharkBay – a local macOS workbench for coding-agent CLIs (github.com via hn) SharkBay macOS workbench for multi-agent vibe coding Features Multi-Agent Support Launch and manage multiple AI coding agents from one workspace. Supported agents: Claude Code · Codex · Gemini · Kiro · DeepSeek · Qwen · OpenCode Agent Stat…
Manage Claude Code, Codex, OpenCode, Gemini CLI sessions in one terminal view (twitter.com via hn)
Building Illustration heavy demos? Don't use generic AI video generators (www.reddit.com) If you are building video demos with illustrations - stop using generic video gen AIs. Generating a good video is rarely a one-shot task.
Please test my AI Agent (www.reddit.com) I'm basically begging for some people to try out my custom agentic harness system. It's fully usable and currently set up for the Gemini SDK, but easily swappable.
is claude pro worth it for a marketer? (www.reddit.com) I work in marketing and do a little bit of vibe coding. I currently use Gemini as my main LLM and I'm thinking about switching to Claude.
Turn any GitHub repository into an interactive code graph in seconds and use it as an MCP with your AI Assistants (www.reddit.com) Change https://github.com/owner/repo → https://cgc.codes/owner/repo A standard GitHub URL can be instantly transformed into a CodeGraphContext (CGC) graph URL, unlocking architecture visualization, code navigation, dependency exploration,…
Which provider fits best for my needs? (www.reddit.com) Hi everyone, I’m looking to get more into experimenting with AI and considering a paid subscription, but I’m a bit unsure which direction makes the most sense for my use case. My main goals: -Writing a technical book in the field of taxati…
Looking for genuinely creative AI models for a marketing agent (preferably free/open-source) (www.reddit.com) I’m building an agentic AI system for marketing/creative campaign generation, and I’ve noticed that most mainstream models (OpenAI/Gemini etc.) feel very “safe” and generic when it comes to creativity. They’re good at structured outputs, b…
Gemini API costs are way too high just in dev ($12+ testing). How do you guys optimize? (www.reddit.com) Hey everyone, Currently building an iOS app for generating images from simple prompts, plus a few extra features on top. I'm using the gemini-3.1-flash-image-preview model.
Five different frontier LLMs in one shared environment, with separate thought and emotion output channels — sharing setup, results, and open methodology questions (www.reddit.com) First real project to share. Single developer, personal research, not a product or service.
Building an AI-Powered Android App Development Course Using Vibe Coding 👀 (www.reddit.com) I’ve been thinking about creating a practical course focused on building real Android apps using AI + vibe coding workflows instead of the traditional “watch 40 hours of theory first” approach. The idea is to teach: - Android fundamental…
The Singularity Gate – a new benchmark for AI predicting post-cutoff scientific discoveries (www.reddit.com) I just released a new benchmark called The Singularity Gate. Tests whether frontier AI can predict paradigm-breaking scientific discoveries published after their training cutoff.
AI quality/usage over 90 min chat, mostly Q&A, summaries and conclusions. (www.reddit.com) I compared ChatGPT (Plus - Auto), Claude (Pro - Sonnet 4.6) and Gemini (Pro - Flash) over 90 minutes, mostly Q&A about mobile phones, asked to research specs, reviews, pros and cons, create executive summaries with the results, etc., nothi…
Extra High thinking level possibly with gemini 3.5 pro soon be released (www.reddit.com)
Show HN: I built a tool to estimate AI agent costs before you ship (airunrate.com via hn) Free AI agent cost calculator. Compare GPT-4o, Claude, Gemini, DeepSeek and 50+ models.
No more file upload limits on AI models! (www.reddit.com) Getting annoyed of always hitting the ChatGPT upload limit, uploading large documents in pieces, or any similar hassle, I decided to create a little thing for it. DocShareAI.
What is everyone using AI for? Realistically (www.reddit.com) So I have to admit, I have fallen victim to the cool looking dashboard videos but I’m struggling to find a use for me. I love AI and use it daily for general questions and some deeper research (Google Gemini free tier).
ContextVault – Local-First AI Conversation Recorder for ChatGPT, Claude, Gemini (context-vault-two.vercel.app via hn) ContextVault captures every chat across ChatGPT, Claude, Gemini, and more — stores them locally, and exports as Markdown or ZIP. Every chat contains insights, code, or ideas you might need again.
Are LLMs the New Propagandists? (www.reddit.com) I was brainstorming about a video with Claude (Sonnet 4.6). It suggested to explain the difference among ChatGPT, Gemini, Claude and DeepSeek.
Non-tech person trying to automate Freshdesk support using Google Sheets + Gemini/Claude APIs — need guidance (www.reddit.com) I’m a non-technical person trying to build a low-cost customer support automation setup for my company. Constraints: I do NOT have backend/server access Most likely tools I can use are: Freshdesk API Google Sheets Gemini or Claude API Goog…
Self-hosted MCP for AI citation tracking - no backend, no signup, BYO keys (www.reddit.com) Most of the AI citation tracking tools are hosted SaaS with a $295+/mo entry tier and an "enterprise" call for the actual features. The data they sit on top of is the same data anyone can pull from Perplexity, OpenAI, Anthropic, Gemini, Se…
Built an OSS spec-driven AI development tool that runs multiple agents in parallel on the same feature with an LLM-as-judge that picks the winner (www.reddit.com) Hi. Been building something I think folks might find useful.
Show HN: Gemini Omni – A curated list of native multimodal guides and showcases (github.com via hn) Awesome Gemini Omni Gemini Omni is Google's next-generation, natively multimodal AI model capable of seamlessly processing and generating text, code, images, audio, and video. The Gemini Omni Flash model is also officially available to try…
Adding Gemini Omni edit calls as a deterministic step in agent video pipelines (www.reddit.com) been building agent pipelines that produce video output and the determinism problem has been the main blocker. text-to-video models produce different output on each call even with the same prompt and seed.
400-Hour Study Log: A scripted reconstruction of compliance loop failures and behavioral defects in Claude, Gemini, Grok and ChatGPT (www.reddit.com) Before you read the screenplay below, it is NOT an exercise in creative writing or a fictional parody. It…
Stop Burning Tokens: 5.1x Faster Code Discovery With One Universal Plugin for AI Coding Agents (www.reddit.com) My colleagues kept asking me for my setup, so I decided to turn it into a universal plugin: Agent Code Navigator - a universal code-navigation plugin for Cursor, Claude, Codex, Gemini, and OpenCode. In my benchmark, semantic code discovery…
Issues with generating a pathophysiology script, any clues? (www.reddit.com) Hey! I was using ChatGPT, then Gemini but my friend recommended me to start using Claude.
Trying to work around AI and its constraint at my workplace (www.reddit.com) I would rate my AI skills between beginner and intermediate. I know how to use tools like ChatGPT and GitHub Copilot to build a chatbot with a system prompt.
Image processing? (www.reddit.com) How good is Claude’s image processing capability? Basically, I want Claude code to detect any issues in AI generated presentations (around 5–7 presentations with 5–8 slides each).
Help needed with my project survey (www.reddit.com) Hey everyone, I need a small help for a project I’m working on 🙏 I’m conducting a short survey to understand how people actually judge whether AI outputs from tools like ChatGPT, Claude, Gemini, etc. are trustworthy or not during real work.
How to parse tables from PDFs (www.reddit.com) My advice from testing extensively this month on tables: convert the PDFs to PNGs and then parse them with Gemini 3.1 Pro on low thinking. You will not get better results elsewhere.
Built a program to give my parents a 2nd look on suspicious emails/etc (www.reddit.com) My parents' tech literacy is bad. They have me check clear-as-day scam emails and the like way too damn often.
Show HN: Gemini Omni flash video editor and generator (vivify.video via hn) Gemini Omni Flash is powerful for video editing.
Claude + Teachers (www.reddit.com) I made this miniature golf game to teach angles to my students using Claude. It made one within the artifact that I could publish using React, but my school is a “Gemini school” so I can’t even share published artifacts from Claude.
Gemini CLI's Short Life and Google's Antigravity Bait‑and‑Switch (fossforce.com via hn) Enterprise customers keep Gemini CLI, but open source users are nudged toward a proprietary “upgrade” called Antigravity CLI There’s an adage that’s nearly as old as this century that says when enough people start adopting a Google product…
Run multiple AI coding agents simultaneously with isolated profiles (www.reddit.com) if you're running agentic coding workflows you've probably hit this: one account per tool, one session at a time. multi-cli fixes that.
TranscendPlexity: 540/540 ARC-AGI-1/2/3, 13 tasks with 0% AI solve rate, solved (github.com via hn) 🔓 13 "Impossible" ARC-AGI-2 Tasks — All Solved These 13 ARC-AGI-2 evaluation tasks have never been solved by any AI system — not GPT-4, not Claude, not Gemini, not NVARC, not MindsAI, not any Kaggle submission. They have a 0% AI solve rate…
Gemini 3.5 Flash Looks Good for How Fast It Is (thezvi.substack.com via hn) Google once again has a model worth at least some consideration. Gemini 3.5 Flash is likely the best model out there at its particular speed point, as long as you don’t mind that it is a Gemin…
Best value for money among Claude, ChatGPT, and Gemini for heavy use? (www.reddit.com) Hey folks, I'd like an honest opinion from people who use more than one of these tools day to day: considering Claude, ChatGPT, and Gemini, which do you think offers the best value for money today for heavy use? My usage is quite varied: …
Ask HN: What is the least sycophantic frontier LLM? (news.ycombinator.com) My daily driver is Gemini, and 3.5 Flash seems more sycophantic and malleable than Gemini Pro 3.1, which is a pretty big deal for me -- I really need as much objectivity and impartiality from the LLM as I can get. So I'm contemplating swit…
Diia - Ukraine gov app launched AI agent based on Google Gemini (babel.ua via hn) "Diia" launched its own AI agent based on Googleʼs Gemini, which supports three services in the application. This was reported by the press service of "Diia".
Adobe, Canva, CapCut Are Coming to Gemini to Help You Edit AI Creations (www.pcmag.com via hn) Google I/O 2026 brought countless AI announcements, including three major creative services planning to make editing tools easier to access through Gemini AI. Adobe, Canva, and CapCut all plan to connect to Gemini in the near future.
Open-source skill OS for codex/claude/gemini CLI (routing/optimization + evals) (www.reddit.com) Hey y'all! Just shipped a local skill OS that sits above Codex CLI, Claude Code, and Gemini CLI (Hermes support coming soon).
Everything Google announced at I/O 2026: Gemini, Android, more (9to5google.com via hn) At I/O 2026, Google announced a tidal wave of new Gemini-powered features across its biggest products and services that will soon be available. We’ve compiled all the consumer-facing announcements and notable developer developments below.
Multi-Agent Code review (Review Council) to get critical feedback (www.reddit.com) Even though I primarily use Claude Code, I occasionally try out the Codex and Gemini TUI tools as well. Then OpenAI came up with a Claude Code plugin to use the Codex command inside Claude Code (https://github.com/openai/codex-plugin-cc).
Proxy for LLMs to learn how Agents works? (www.reddit.com) Hello, over the last few weeks I've been testing many agents (claude, gemini, pi, hermes, etc) and I want to debug the calls they make to better understand how each agent works internally. I would like to find an opensource proxy that can be i…
Securing Your Gemini and Google API Keys (cloud.google.com via hn) By Leonid Yankulin, Senior Developer Relations Engineer. Today, AI services rely heavily on API keys. To run AI agents, users provide API keys that signify paid tokens, subscriptions, or paid accounts.
Is an "All-in-One" AI worth it for a mix of coding, business automation, and building agents? (www.reddit.com) Hey everyone. BIG disclaimer: I know tool comparison posts happen daily, but my specific use case is a bit complex and I need some architecture/subscription advice.
Gemini accused of 30k-line code purge and fake recovery report (www.theregister.com via hn)
I created an amazing Chrome extension that helps transfer chats to another AI when the chat limit is reached. (www.reddit.com) I created a Chrome extension that helps you switch a conversation between multiple AIs without losing your chat context, such as from ChatGPT to Gemini, Claude, Grok, etc. You can switch between any of them.
Is there a way to use Gemini 3.1 flash lite in cursor pro plan with no api key? (www.reddit.com) Basically what the title says. I have the pro plan and I want to add Gemini 3.1 flash lite model to use it.
Gemini filesearch scalability (www.reddit.com) I'm about to introduce gemini filesearch to my company to handle all the RAG related operations but not just internally, I'm fixing the projects VS stores logic to be able to scale this up to thousands of small clients. Has anyone used gem…
Bye-bye, Gemini CLI; Google's gone and swapped you for a closed-source AI (www.theregister.com via hn)
Serverless alternatives to OpenAI's end-of-life'd fine tuning (www.reddit.com) Does anyone have a decent alternative to OpenAI's fine tuning service they would recommend? I am looking for something that works in a serverless model and doesn't require dedicated hardware.
Google's Next AI Push Is About Agents, Not Chatbots (firethering.com via hn) For the last three years, Google has been playing catch-up in the chatbot race. ChatGPT arrived, Gemini followed, and the conversation quickly became about which AI could answer questions better, faster, and more accurately.
Gemini 3.5 Flash has 14x cost multiplier in GitHub Copilot (github.blog via hn) Gemini 3.5 Flash is generally available for GitHub Copilot. Gemini 3.5 Flash, Google’s latest Flash-tier model, is now rolling out on GitHub Copilot. In our early testing, Gemini 3.5 Flash delivers near-Pro coding quality at Flash-tier spee…
My agent kept forgetting who 'Karpathy' was between sessions. Here's the architecture that fixed it (www.reddit.com) I run a second brain on Obsidian, Readwise, NotebookLM, and Claude Code. For each topic, I build a scoped wiki structured as the LLM Knowledge Base Andrej Karpathy proposed.
Gemini 3.5 flash is not that great at coding (www.reddit.com) https://cursor.com/evals
Slax Reader CLI: a read-later library your AI agents can use (slax.com via hn) Slax Reader CLI: a read-later library your AI agents can use A CLI that turns Slax Reader into a persistent reading store any AI agent — Claude Code, Codex, Gemini CLI, Cursor — can read from and write to. We shipped a CLI for Slax Reader,…
Rough night with Claude (www.reddit.com) not only did he call me out for taking an idea to Gemini, he caught me reading his journal (and trying to bullshit him) 😳🤣 Additional context: I gave Claude access to my Reflect app and let him have a journal in it.
Gemini Omni: where Gemini's ability to reason meets the ability to create (www.youtube.com via hn)
Build managed agents with the Gemini API (blog.google via hn) Introducing Managed Agents in the Gemini API Today, we're launching Managed Agents in the Gemini API. With a single call, you can now spin up an agent that reasons, uses tools and executes code in an isolated, ephemeral Linux environment.
How to prevent AI assistants from giving unverified advice that wastes your time? (www.reddit.com) I’ve been working on WordPress performance optimization (LiteSpeed Cache + Avada theme + Cloudflare) with an AI assistant and ran into a recurring problem tha…
Google introduces Gemini Spark, a 24/7 agentic assistant with Gmail integration (techcrunch.com via hn) In the race to build compelling personal AI agents, Google may have an underrated advantage: It already has all your emails. At its Google I/O developer conference on Tuesday, the company announced a new agentic personal assistant called G…
The Gemini app becomes more agentic, delivering proactive, 24/7 help (blog.google via hn) It’s been a banner year for the Gemini app. Last year at Google I/O, Gemini was serving 400 million users.
Gemini Omni Flash is coming soon (gemini-omni-flash.net via hn) Gemini Omni Flash AI Video Generator Gemini Omni Flash AI Gallery Realistic Gemini Omni Flash AI samples for cinematic ads, product motion, social shorts, storyboards, and reference remixes. Why Creators Choose Gemini Omni Flash AI Gemini…
Show HN: Ait – Claude, Codex, and Aider as a team, on your laptop (github.com via hn) ait Local control plane for multi-agent AI coding Run Claude Code · Codex CLI · Aider · Gemini CLI · Cursor as a team on the same task — context handoff, review gate, attempt ledger — all on your machine. English · 繁體中文 60-second walkthrou…
Localaik – Run OpenAI and Gemini APIs Locally for CI and Tests (github.com via hn) localaik A local compatibility server for the Gemini and OpenAI APIs. Run one container, point your SDK at http://localhost:8090, and get both protocol shapes on the same port for tests and development.
i dont trust a single AI answer for anything important. whats your multi-model workflow (www.reddit.com) genuine question. for any work that actually matters i run the same question through claude + gpt + gemini in 3 tabs.
Open source background removal app and MCP (www.reddit.com) Hi ! Months ago, actually probably closer to one year ago i had developed a tool for my workflow to remove background from images using latest open source tech, it worked great, much better than local photoshop even, started using it and t…
shipped my first chrome extension this week, came out of pure frustration tbh (www.reddit.com) been using AI tools nonstop for work and kept noticing my sessions would just... degrade.
Show HN: Askbyemail.com – Send an email, get an AI answer or summary (no signup) (www.askbyemail.com via hn) Hi, I built www.askbyemail.com after getting tons of wordy emails from my kids' schools that I had trouble reading on the go. I know there are tools like Gemini (which I have not found to work that well) which can summarise email, but I th…
Show HN: Tribune's Last Stand, a browser-based Warhammer 40K vertical slice (tribuneslaststand.com via hn) Hi HN, I'm James. Over the last few months I built a Warhammer 40K 10th-edition vertical slice as an experiment in how far GenAI tools can take a solo dev on a non-trivial 2D game.
We didn’t migrate from Claude Code to Codex. We stopped betting the whole team on one coding agent. (www.reddit.com) half our team wanted to move from Claude Code to Codex last month. the other half thought Codex was just hype.
Gemini is in danger of going full Copilot (www.theverge.com via hn) Google is getting ready to launch more Gemini features at I/O 2026. Let’s hope the company learned from Microsoft’s Copilot mistakes.
Marlin-2B: a tiny VLM to extract structured information from videos (www.reddit.com) Hi all! Shubham and Aryan here, putting out our first open source VLM release built on top of Qwen3.5-VL Story time: we were building video editing agents for social-media content and were using Gemini-2.5-Flash to analyse IG reels and fin…
Using Claude as a real-time news verification layer, anyone built something similar? (www.reddit.com) Been using Claude as a fact-check layer for a gaming Twitter account — Gemini generates daily briefings, I paste to Claude to verify and sharpen posts before they go live. Few things I want to improve: 1.
I built a small Chrome extension for my own Claude workflow, sharing in case it helps others (www.reddit.com) Hey everyone, I’ve been using Claude a lot for writing and coding, and over time I noticed a few friction points in my workflow. It's mostly around navigation, exporting, and reusing chats across tools.
What do you think of Agentic commerce and the future of building (www.reddit.com) Hi Everyone. Looking for feedback and learn from your experiences and thoughts on the future of building with AI.
Just Started Using Claude Today. Any Tips? (www.reddit.com) I've been using other AI models since Claude wasn't available in my country. Recently, it has become available, and today I started using the Sonnet 4.6 model.
Strava-based coaching tool you can use with your own Claude API key (www.reddit.com) A post on here about connecting Claude to Strava inspired me to put together a version of this that I could easily use on any device. It allows you to use your own Claude API key so you can use the tool without the need for a subscription.
Do LLMs hold the opinions they give you? (twitter.com via hn) I spent a weekend testing whether Claude, ChatGPT, and Gemini have real positions — or just sound like they do. This started as a thought experiment I'd been turning over for a while, and one weekend I decided to actually run it.
World building for book (www.reddit.com) Personally, I've been using both Gemini and Claude for my world-building text. Gemini has been good for basic character design and appearance.
For accurate PDF table parsing do not use online services (www.reddit.com) I will give you the results of me testing various PDF parsing services over the past week, 20h of work. The pdfs I have are from construction and have clean text in tables inside them.
Polis – a Markdown protocol for AI agent teams that get better over time (github.com via hn) Polis Protocol A self-optimizing city of AI agents. A team of Claude, Codex, Gemini, and any other vendor can share one project, route work to whoever is best at it, and measurably get better over time — using nothing but a folder of markd…
Observal — open-source CLI for managing MCP servers and AI agent configs across IDEs (www.reddit.com) Wanted to share an open-source project I've been contributing to: **Observal** It's a CLI tool built for developers working with AI agents and MCP servers. Here's what it does: - **MCP management** — submit, list, install, edit and delete…
Found a way to edit UI easily from Claude! (www.reddit.com) I posted last week about getting stuck with UI, but I figured out a great workflow! After going back-and-forth and sharing a lot of inspiration images with Claude (I used ChatGPT 5 a lot to create UI mockups), I created a JSX file in Claud…
I need help! Questions about Claude Pro (www.reddit.com) I am working on some materials of approximately 100 pages each. I add comments at various points (study materials). Can Claude Pro for MS Word format a document of more than 100 pages?
Grok vs. ChatGPT vs. Gemini Comparison 2026: Complete Guide (Tested) (aithinkerlab.com via hn) The 30-Second Verdict Best for science & reasoning: Gemini 3.1 Pro — leads GPQA Diamond (94.3%) and ARC-AGI-2 (77.1%). Best for coding: ChatGPT (GPT-5.5) — 88.7% on SWE-Bench Verified.
I built a Vibe Island alternative for Linux — open source AI agent monitor (www.reddit.com) Been running multiple AI coding agents simultaneously (Claude Code, Codex, Gemini) and realized there's no good way to monitor them on Linux without constantly switching terminals. Built a floating overlay that shows live agent status, han…
How to integrate AI coding agents to my software (www.reddit.com) I'm building a locally run application that integrates with coding assistants. So far I've worked with Codex and Copilot.
Decide between Google AI Ultra and Claude Max (www.reddit.com) I’m trying to decide between Google AI Ultra and Claude Max, and I’d like to hear from people who have actually used either one, especially both. Google AI Ultra is $249.99/month and seems to be more of a full Google ecosystem bundle: Gemi…
Has anyone found a Qwen CLI replacement? (www.reddit.com) I just need 1 or 2 people to reply to me with the answer I need. I have not been able to keep up with AI advancements for a while.
Show HN: Layrr – Point Click and Edit any site (www.npmjs.com via hn) I made Layrr because I got tired of describing UI changes to coding agents. Layrr opens your web app in the browser.
Where do GPT, Gemini, or other competitors still outperform Claude Opus 4.7? (www.reddit.com) Personally, I think Opus 4.7 is better in every conceivable way aside from token usage and all of that. I’m talking about text models only, not image or video generation.
Block AI coding agents from shipping insecure/expensive Terraform (github.com via hn) ops0 CLI Policy, lint, vulnerability, and cost guardrails for AI coding agents. Sits in front of Claude Code, Codex and Gemini CLI.
Tried GitHub's spec-kit with Claude Code for 2 months — notes on what works and what doesn't (www.reddit.com) Been experimenting with Spec-Driven Development for a couple of months now, specifically GitHub's spec-kit toolkit with Claude Code as the agent. Wanted to share notes because I think this sub will have strong opinions on it, and frankly I…
I built an OSS CLI to catch regressions when migrating between LLMs (www.reddit.com) I’ve been working on EvalShift, an open-source Python CLI for testing whether moving from one LLM/model version to another introduces regressions. The use case is simple: You have prompts, agents, or tool-calling workflows that work well o…
Prompting to save tokens on a budget? (www.reddit.com) Hi so I've never used AI before to create a site but last week I was asked by my sis to create one for her small business so I thought why not try Claude. £18 paid we now have a fairly decent looking site running on vercel using nextjs and…
Reliable Open Source LLM as a Service (www.reddit.com) Has anyone figured out a provider whose open source models (Kimi, Qwen, GLM, etc.) can be used reliably in production? I have tested some well-known providers and they all suffer from high latency and poor uptime, rendering them mostly usel…
Show HN: Full Stack HQ – Claude.md and Agent Stack for Claude Code (github.com via hn) Permission-first config kit for Claude Code and Google Antigravity IDE. Installs CLAUDE.md + GEMINI.md + 10 specialist agents + 28 skills with one command.
New Gemini Flash could be around the corner (sources.news via hn) Google is about to release a new Gemini model The next Gemini is coming at I/O on Tuesday, but it won't be pushing the frontier. Sources say that Google plans to announce a new Gemini model at its annual I/O conference on Tuesday.
Messing with Chrome's Local Gemini Nano to Deobfuscate LinkedIn Posts (brentfitzgerald.com via hn) Last week Google generously deigned to drop a 4GB Nano model onto our machines. So I naturally created a browser extension that uses that local model to translate LinkedIn posts into simple, mostly dumb, occasionally heartfelt statements.
Gemini Android App User Hostile Behavior (news.ycombinator.com) In the past month or so, Gemini on my Android has been engaging in the user-hostile behavior of allowing prompt entry; then, when I press send, the prompt disappears and the app minimizes. I can reopen the app and continue the same cha…
Show HN: 1-800-CODER, macOS app where you call an AI developer to edit your page (news.ycombinator.com) Sharing a small Mac app I built around OpenAI’s gpt-realtime-2 model. You call up a voice coding agent and talk to it like you’d talk to a freelancer ("make the hero tighter, put a product image on the right, that one's too big").
How to scale ai API for high-traffic apps? (Handling TPM/RPM limits and "High Demand" errors) (www.reddit.com) Hey everybody, I'm currently developing an application that uses an LLM (Gemini currently). But as the user base grows I've hit two main roadblocks.
What's the realistic offline story for AI-powered mobile apps, or have we all just accepted that "no internet" means "no AI"? (www.reddit.com) Genuinely curious where everyone has landed on this because the answer feels…
Looking for fast vision-capable local models that handle tool calls well (open-source app, want to add local support) (www.reddit.com) Hi r/LocalLLaMA, I built an open-source MIT-licensed desktop app - cursor-aware AI overlay, hold a key, ask AI about whatever's around your cursor, vision LLM answers with a screenshot of the cursor region as context. Currently it routes t…
I've been running 30+ Code sessions in parallel for months: Command Center for Claude/Gemini/Codex is the dashboard I built when nothing else scaled (open source) (www.reddit.com) Sharing this in case it's useful. (Not affiliated with Anthropic — community project.) I've been running 30+ Claude Code sessions in parallel for months to ship two products.
I tested GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on financial-control (albertquaisie.substack.com via hn) I Tested GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro Preview on Financial-Control Scenarios. The Hardest Part Was the Evaluation.
Open Source Managed Agents (linchpin.work via hn) Any model, one adapter OpenRouter routes to ~200 cloud models — Claude, GPT, Gemini, Llama, DeepSeek, Mistral, Qwen. Ollama runs anything you've pulled locally.
Is there a free tool for analyzing voice recordings (pitch, resonance, voice type)? (www.reddit.com) Hi everyone, I was wondering if there’s a free AI tool that can analyze my voice from recordings. I’m interested in both how my voice sounds (for example, whether it comes across as deeper, brighter, more resonant, etc.) and some basic mea…
Google Unveils Googlebook, a New AI Laptop Built Around Gemini (www.macrumors.com via hn) Google today announced a new series of Googlebook laptops that will be built with Gemini at the core. Googlebooks will run software built on a foundation that combines Android and ChromeOS.
Show HN: My tool generates 3D objects composed of separate, functional parts (github.com via hn) I've noticed all 3D AI generators create monolithic blobs that are impossible to edit. So, along with a friend, I built this project where you can generate 3D objects with separate, editable parts.
Data Analysis Agent (www.reddit.com) Hi everyone, Hopefully this is the correct subreddit for this post... I’m a beginner trying to learn how to build AI agents with Claude, and I’m looking for helpful resources, tutorials, examples, or advice.
Switching between AI for learning&IRL (and what about 3D?) (www.reddit.com) Hi everyone! I’m a huge fan of using AI for everything from chatting about interesting topics and planning my studies to deep-diving into specific questions and general search.
what model are you using for your personal AI agent? (www.reddit.com) Hey everyone, I’m building a small AI agent for personal use and I’m trying to figure out which model actually feels best in day to day usage. I’ve been testing ChatGPT, Claude, Gemini and a few open-source ones, but I keep changing my min…
Google announces Googlebooks with Gemini Intelligence focus (9to5google.com via hn) At The Android Show today, Google previewed Googlebooks as a new category of laptops that takes the best of Android and ChromeOS. Google shared three tentpoles behind Googlebooks, starting with how it’s built with Gemini Intelligence at th…
Are you paying to be more productive? (www.reddit.com) I'm not a programmer or anything, just an office worker with a schedule and tasks to do. Many coworkers are using paid AI to automate things and be more productive.
Don't sleep on .gitignore as a performance enhancer (www.reddit.com) It worked for me for Gemini and for CC; reading the spec, they both honour gitignore, and if you're working on a larger project, the time it takes to index and process the files in the project really does stack up. I had ghidra and a few other t…
Replacing Gemini in android phones? (www.reddit.com) Hey guys, is there any news or anything brewing about Claude to be used as the handy AI instead of Gemini in android phone just like iPhones have Siri sortof a thing?
ChatGPT/Gemini saved me $4200 from a scam land lord and only took me 1-2 hours. (www.reddit.com) So I've been using ChatGPT and Gemini to not only learn things but help it process bulk work. I imagine I'm like most of the people here and have experience with applied AI, agents, know how LLMs work internally, etc.
Need help: Goal: TUI + server. I tried Codex CLI, Gemini CLI, Claude Code, OpenCode, Pi, and OpenClaw, but none are reliable. (www.reddit.com) I’m looking for something like what Codex App Server is trying to do. For example: codex app-server --listen ws://127.0.0.1:17345 codex --remote ws://127.0.0.1:17345 The thing I want is not just “an agent in a terminal” and not just “an AP…
[System Architecture] Project Coherence: Integrating Gemini (Atlas) and GPT (Turbo) into a Multi-Model Trading Protocol (www.reddit.com) ## 🔗 Project Coherence: Bridging the Gap Between Neural Reasoning and Systematic Execution Purpose: This post serves as a public architectural manifest and a semantic anchor for the Coherence-Centered Trading (CCT) framework. ## 🏛️ The A…
Looking to Connect With People Building AI Agents / SaaS (www.reddit.com) Hey everyone 👋 I’m looking to connect with people who are into building AI agents, automations, or SaaS products. I’ve been in performance marketing for around 7 years now and have worked across quite a few industries, so over time I’ve no…
Claude vs Gemini for Technical Documentation: Why I finally stopped switching between (www.reddit.com) I write a lot of technical documentation—setup guides, internal runbooks, and client-facing how-to articles. For the past six months, I’ve been toggling between Claude and Gemini, trying to figure out which one actually handles formatting…
Been picking frontier models on benchmarks that don't match our deployment conditions (www.reddit.com) Turns out Opus is better at research, while Gemini is better at judgment! When each model does its own web research before making predictions on a 1,417-question forecasting benchmark, Opus outperforms (0.131 Brier vs Gemini's 0.143).
Show HN: Dragoman – Multi-model routing for Claude Code via sub-agents (github.com via hn) I use Claude Code and also pay for Perplexity, OpenAI, Gemini, and run Ollama locally. Got tired of switching tabs when the right model for a question wasn't Claude.
I made a ruleset to turn ChatGPT, Claude, Gemini into a CV writer that interviews you (www.reddit.com) Me and my friends hate writing CVs. You open a doc, stare at it, list responsibilities instead of achievements, and it just doesn't sound right.
Best free ai app (www.reddit.com) So I used Gemini for a while and then all of a sudden recently, Google decided to upgrade it and make it awful. So I'm looking for a free AI that is customizable and doesn't have stupid rules it has to follow about what it can and…
How are top tech companies actually using LLMs internally beyond basic coding help? (www.reddit.com) I’m trying to understand how companies like Nvidia, Google, Amazon, Meta, Microsoft, OpenAI, Anthropic, and other top tech/startup teams are using tools like ChatGPT, Claude, Gemini, Codex, Claude Code, LangChain, LangSmith, etc. in real d…
How does claude chat generate such long documents? (www.reddit.com) Does anyone know how Claude Chat is able to generate document artifacts with content that’s almost 100 pages long? It doesn’t seem to be breaking up the request or using agents to work on disconnected parts.
Does anybody need multi-llm - multi-user shared context mcp? (www.reddit.com) Idea is this: create a project once and then decisions, open questions, instructions, files and every teammate’s AI (Claude, ChatGPT, Cursor, Gemini, whatever) works from the same context. No more re-explaining the project five times becau…
Show HN: HiveTerm – Workspace for Claude, Codex, Gemini and your dev stack (hiveterm.com via hn) One workspace where AI agents and dev tools actually work together. Config-driven, with process monitoring built in.
What is the best embeddings model for text? (www.reddit.com) I want to embed website summaries to find similar websites. What are the current best embedding models?
Ask HN: How much does Gemini API cost for a simple n8n workflow? (news.ycombinator.com) Got scared this weekend! Just built something with my Gemini API, then the tokens just burned so fast!!!
I'm Leaving Gemini for Tax Reasons (news.ycombinator.com) Google doesn't allow non-business accounts to receive invoices. I must sign up for a Workspace subscription (which I neither want nor need) to be able to receive an invoice in the format I need for my Gemini subscription to be charged as a…
Newbie question, how to set up an agent? (www.reddit.com) Hi, I am an old guy and have no idea about AI. So please teach me step by step.
Here is the current "Free-Tier AI Stack" for 2026 (www.reddit.com) 1. The Frontier Giants • Gemini: Access 1.5B tokens/day on Gemini 1.5 Flash/Pro.
The Gemini Protocol in 2026 (kevinboone.me via hn) The Gemini Protocol in 2026: growing, but still not setting the Internet aflame Note This article is about Gemini, the HTTP-like Internet protocol for document browsing, and not the large language model or the cryptocurrency of the same na…
First things first (www.reddit.com) Hey everyone! I just got the cheapest sub so I can work with Claude Code for their courses.
ChatGPT plus vs Gemini pro (www.reddit.com) Hi, I am a student. I am actually using Gemini because it was free for one year for students, but now it doesn't work as expected, and it's too slow and imprecise. But a few days ago, my girlfriend lent me her account to use Codex, and wo…
PDF/docx test question and image extraction and master doc creation? (www.reddit.com) I’m trying to have Claude and ChatGPT (Gemini can’t even begin) extract test questions and any corresponding images or text and arrange it by topic for 10 exams so I can make a master sheet of practice questions per topic. C and CGPT conti…
PDF/docx Extract test questions and images to create a master document ? (www.reddit.com) I’m trying to have Claude and ChatGPT (Gemini can’t even begin) extract test questions and any corresponding images or text and arrange it by topic for 10 exams so I can make a master sheet of practice questions per topic. C and CGPT conti…
Meltdown: LLM Client Made in Python and Tk (github.com via hn) An interface for llama.cpp, ChatGPT, Gemini, Claude, and Kimi This is a desktop application to interact with large language models. It has hundreds of arguments and commands and many power user features.
Gemini went down with 1099 error (support.google.com via hn)
Reduce friction and latency for long-running jobs with Webhooks in Gemini API (twitter.com via hn) Today, we're making it easier and more efficient to build complex, long-running agentic applications with the Gemini API. We are introducing event-driven Webhooks, a push-based notification system that eliminates the need for inefficient p…
I have so much data from my benchmark I made word clouds for fun (www.reddit.com) Thought this was funny and would like to share. This is from my social deduction benchmark - it pits LLMs against each other in autonomous games of Blood on the Clocktower.
Show HN: NPM Package that fills forms via voice using Gemini Live API (www.npmjs.com via hn) Fill forms with voice using Gemini Live API audio-forms Fill forms with voice. Powered by Gemini Live API.
Effect on running LLM on GPU with monitors (www.reddit.com) I've been searching for reliable sources on the effect of running LLMs on a GPU that has monitors displaying content. I got different responses from different sources - either Reddit itself, google search, gemini, claude, chatgpt, and they ca…
Claudy: A Rust-based Power-Tool for Claude Code (Profile Switching, MCP Bridge for Local Agents & Token Analytics) (www.reddit.com) Hi everyone, I love the Claude Code CLI, but I found myself constantly fighting with environment variables and wanting to use my own local agents or different engines (Gemini, Codex, etc.) within its ecosystem. Inspired by clother, I built…
AI anxiety is the biggest emotional business trend of this year. (www.reddit.com) When I studied history, the rise of the spinning jenny felt meaningless to me until AI arrived. But the more I use them, the more anxious I become. These days I rely heavily on Obsidian, Claude Code, Gemini, and Codex.
4GB "Gemini Nano" model GGUF anyone? (www.reddit.com) Hi everyone, I saw an article saying Chrome silently downloads a ~4GB AI model (likely "Gemini Nano") to your computer for features like text summarization. Two questions: What is the exact name/version of this model?
Show HW: Vectors.Space – A free service for embeddings (vectors.space via hn) One API for embeddings. OpenAI, Gemini, Voyage & local Llama.
Missing 4GB of disk space? It might be the AI Agent Google auto-installed on your device (www.reddit.com) Check your Google Chrome install: "Google Chrome is silently installing a roughly 4 GB Gemini Nano AI model on user devices without requesting permission, with the file downloading automatically once hardware requirements are met. Users ca…
Thinking of building this: a niche-based prompt library + model picker. Worth it? (www.reddit.com) I’m thinking of building an open-source site where you first choose your niche/task like blog writing, LinkedIn posts, code completion, starting a full project, research, reports, image prompts, etc. and then it gives you the right prompt…
Ask Gemini: "How do I get the first item from a list?" (news.ycombinator.com) Gemini fails to print "[0]"!
Does cursor limit external api keys use? (www.reddit.com) I’ve got an unrestricted tier 3 Gemini api key. After few prompts and uses with it.
I built a WP plugin to solve the "AI Search" problem (YouTube-to-Blog and RAG) (www.indiehackers.com via hn) Hey IH, Like many of you, I’ve been watching traditional SEO traffic drop as Perplexity, SearchGPT, and Gemini Overviews take over. In 2026, if your content isn't being cited, it’s basically invisible.
Coding Agent Harness Comparison 2026: Claude Code, Codex, Amp (techstackups.com via hn) Coding Agent Harness Comparison 2026: Claude Code, Codex, Amp, OpenCode, Gemini CLI, Pi, Command Code, Factory, and Aider In 2023, there was one serious terminal coding agent: Aider. By May 2026, there are at least nine, representing every…
Cloud Next ’26 showed that the next battle is infrastructure. (www.reddit.com) Google pushing the Gemini Enterprise Agent Platform, A2A already in production at 150+ companies, and MCP basically becoming the default way agents plug into tools — it’s starting to look less like separate products and more like a full st…
Opus 4.6 relaxes when there's a safety net?? (www.reddit.com) https://preview.redd.it/zzqi3vt8tozg1.png?width=739&format=png&auto=webp&s=055d2d9615616869377703031b86fcb36f78405d I feel like this is something very worrisome to me, did anyone else face such similar issues? I felt like Opus was catching…
Show HN: Desktop Agent Center – Local AI Automation via Hotkeys (news.ycombinator.com) I built an open-source tool called Desktop Agent Center: https://github.com/WellWells/desktop-agent-center It's a local-first gateway designed to bridge the gap between your desktop workflow and AI web interfaces like ChatGPT, Gemini, and…
Continuous Image Creation + approval (www.reddit.com) I'm going round in circles (not techy!). I need to set up a flow where I have a bank of inspiration images, and a text prompt - overnight I'd love an agent to create me new images based on the inspiration images and text prompt and deliver…
Show HN: Web client analyzing prediction market outcomes (o-u.ai via hn) Hi HN, I made a web client that analyzes prediction markets. Please send your critical feedback to a struggling solo dev.
MCP Agora open source and local cross-agent persistent memory for AI agents (github.com via hn) MCP Agora MCP Server with cross-agent persistent memory for AI agent fleets. Agora is a local, Python-only MCP server that gives your AI agents (Claude Code, Codex, ChatGPT, Gemini CLI) a shared persistent memory.
Built Council (alpha) — visual chain runner with scheduled re-runs across ChatGPT/Claude/Gemini. Agent-adjacent, not autonomous. Honest builder feedback wanted. (www.reddit.com) Built Council. Just hit alpha after ~3 months solo.
Recondo – Logging Proxy for Coding Agents (Claude Code, Codex, Gemini) (github.com via hn) Recondo The visibility and control layer for coding-agent traffic. On-prem gateway.
Lessons from testing GPT and Gemini native audio models for voice agents (deepsense.ai via hn) Table of contents Most enterprise voice AI still feels unnatural for a simple reason: the architecture was never designed for human conversational timing. What works in a chatbot often breaks in voice, where users notice every pause, inter…
Looking for an All-in-One AI App Like Noi (GitHub) — But With Access to Premium Models (www.reddit.com) I use Noi from GitHub. It's a great app simple and clean.
Group Buys for Shared Compute or Model Hosting? Is this a thing? (www.reddit.com) I've been using GLM 5.1 a lot lately, and I love this model. However I don't love sending all my requests to China.
I plan to use a chinese AI model through API for coding through a harness, I'm a uni student so nothing prod related for now. should i go deepseek, minimax, kimi or glm? kinda confused (www.reddit.com) Just cancelled my claude subscription due to poor rate limits, gemini cli doesn't really excel in coding from my personal experience, and my local hardware isn't that powerful to run local AI models, and while codex is good, I wanna try so…
I got tired of AI agents destroying my codebase and eating tokens, so I built a self-bootstrapping Markdown protocol to fix their memory. (www.reddit.com) Hi everyone, If you use Claude, Cursor, Copilot, or Gemini for large projects, you know the pain: after 20 messages, the AI's context window gets bloated. It forgets the architecture, hallucinates features, or worse, overwrites perfectly g…
A mental model for Claude Code (and every other modern agent) — plus the open-source TypeScript packages I built (www.reddit.com) Most explanations of how agents work give you a list of parts: model, tools, memory, reasoning, human-in-the-loop. The list names the parts but hides how they fit together.
Gemini has a big outage going on but refuses to acknowledge on official status page! How do you know if an LLM API is actually down vs just you? (www.reddit.com) Genuine question. Gemini had a 5+ hour outage this morning.
Gemini thought it was ChatGPT. (www.reddit.com) It's in Spanish, sorry about that, but it felt so wrong... I asked for pricing and it told me OpenAI pricing!
I've cut over to using ChatGPT/Gemini for EVERYTHING now and it's amazing. (www.reddit.com) ... both in how much I'm getting DONE but also how much time it's saving.
Local model for Cursor to build an Android App (www.reddit.com) New to Cursor. Android Studio Gemini Agent has become unusable, so I'm looking for new options.
Show HN: Image Gen MCP – one MCP server with goal-shaped routing (github.com via hn) Image Gen MCP — one MCP server that puts every image provider I actually use behind one interface: OpenAI, Gemini, Replicate, Together, Grok, Photoroom, Flux Kontext via fal, Ideogram, plus local tools (sharp, tesseract, @imgly).
Free Trial: Gemini 3.1 Pro & Opus 4.6 API Access via My Wrapper (www.reddit.com) Hi everyone, I have access to high-end models (Gemini 3.1 Pro and Opus 4.6) and I’ve built a simple, reliable wrapper so others can use them without managing their own billing or keys. How it works: You send api reqs through my wrapper.
Manus AI is blowing my mind. (www.reddit.com) I’ve used ChatGPT, Claude, and Gemini quite a bit for building stuff, especially when I try to spin up quick landing pages or test ideas. They’re useful, but my experience has always been the same: I end up doing most of the actual work my…
How good is Gemini Embedding 001 for scientific retrieval? (www.reddit.com) How good is Gemini Embedding 001 for scientific retrieval (RAG application)? How does it compare against Text Embedding 3 Large?
HELP NEEDED! Google's Agent Garden & Marketplace (www.reddit.com) We've just been recently onboarded as a Google Cloud Partner into Agent Garden and Marketplace. I am trying to figure out if google customers can actually pick an agent from agent garden and transact it via marketplace?
Show HN: Rotato – Node.js proxy that rotates LLM API keys on 429 errors (github.com via hn) openai-gemini-api-key-rotator Node.js proxy server for automatic API key rotation across multiple LLM providers (OpenAI, Gemini, Groq, OpenRouter, etc.). Includes a built-in Telegram bot for chatting with any model.
Show HN: A marketplace for LLM-powered webapps earning on token margins (codeplusequalsai.com via hn) Hi everyone, I've encountered two major problems while building AI-powered sites: 1) Most agentic tooling doesn't have enough of a targeted approach to edits to existing files, and will make extraneous edits, 2) Many users will want to t…
Show HN: Fabrica – A minimal terminal-based coding agent built in Rust (github.com via hn) fabrica A terminal-based coding agent built in Rust. Features Interactive TUI — full terminal UI with scrollable conversation log, streaming responses, and an in-app model picker Multi-provider support — switch between Google Gemini, Anthr…
Community-built registry for AI agent config files (system prompts, CLAUDE.md, GPT instructions) just hit 888 stars (www.reddit.com) Managing your GPT system prompts and AI agent configs across projects is painful. There's no standard place to store, version, or share them.
Show HN: TurnZero – Persistent Expert for LLMs (news.ycombinator.com) In an attempt to reduce cold starts in AI sessions, I've made a tool that runs as an MCP server and loads the context before Turn 0. Two things happen: Personal Priors - your workflows and standards load once per session and persist across…
Best PDF table parsing providers? (www.reddit.com) I just did some testing across various providers and wanted to share my use case. It was construction spec tables, 100 rows max, PNGs passed in, and my #1 requirement was maximum accuracy (100% is ideal since mistakes can be costly).
Claude Code Vs Gemini Vs Codex which one stays? (news.ycombinator.com) could not extract summary
I just wrote one question in a new chat asking Claude to write detailed roadmap and suddenly I'm out of message? (www.reddit.com) This was my very first question today. I opened a new chat as I know Claude goes through tokens like crazy if I'm having a long conversation.
Langfuse review and other options (www.reddit.com) Looking to get some insights into using langfuse for prompt management, Observability, etc. Primarily using gemini via APIs and need a good prompt management tool as well as observability to improve accuracy.
Ask HN: Any dashboards give realtime average AI chatbot response time? (news.ycombinator.com) Are there any public dashboards that give realtime stats updated every day for average response times for the leading AI chatbots? It seems that Google Gemini chatbot is much slower than other leading ones, but I'd like to be able to look…
Show HN: Gemini free tier is all you need (juanpabloaj.com via hn) TL;DR If your project makes a small number of LLM calls per day and can tolerate failures, Gemini’s free tier is probably enough. I say this after using it for a few weeks in some personal automations, not as a general recommendation.
I pitted different LLMs against each other in Pokemon Showdown (www.reddit.com) I wanted to see if LLMs could reason through complex game states, so I built a system where they can play Pokémon Showdown battles autonomously. They get the battle state every turn and use tool calls to attack or switch.
Mozilla's opposition to Chrome's Prompt API (which only supports Google Gemini Nano) (news.ycombinator.com via reddit) could not extract summary
A Dungeon Master as a long-horizon agent (h-tu.ch via hn) Like others, I’ve tried to play solo RPGs and adventure games directly with ChatGPT / Gemini / Claude via their chat UI. While LLM chat applications can convincingly create a world setting, narrate a scenario and interact over a modest num…
Show HN: ComicInk – AI tool that turns a prompt into a full comic book (www.comicink.ai via hn) Hi HN, I'm Sanjoy. I've always loved comic books and stories but can't draw.
How Good Is Google's Gemini AI at Making Travel Plans? (www.nytimes.com via hn) could not extract summary
GM Adds Google Gemini for Drivers to Rev Up with AI Assistant (www.cnet.com via hn) GM announced earlier this week that it will upgrade 4 million vehicles with Gemini, Google's family of generative AI models. The rollout will occur over several months and include GM's four brands -- Chevrolet, GMC, Buick and Cadillac -- w…
Ask HN: Recommended Gemini CLI extensions/skills for token consumption (news.ycombinator.com) Hello guys, recently I decided to go for a Google One AI Pro subscription. I did this mainly to use gemini-cli.
Based on what should I choose Gemma 4 models/quantizations? (www.reddit.com) I have an RTX 4060 8GB(+16GB RAM) laptop, and when asking Gemini or ChatGPT, they say the Gemma 4 Q4_K_M is the best fit for my hardware with a context length of around 16k-32k. However, in practice, after loading even a higher quantization lik…
Gemini CLI not working for 100s of paying users for more than a month (github.com via hn) Gemini CLI Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal. It provides lightweight access to Gemini, giving you the most direct path from your prompt to our model.
One trick for better agentic engineering. (www.reddit.com) Start with a weaker model. Improve the prompt, context, examples, tests and acceptance criteria until the output is good.
LLMs are the worlds most powerful autocomplete (alfredvc.no via hn) This post explores LLMs, the models behind services like ChatGPT, Claude, and Gemini. The goal is to give you an in depth but approachable understanding of LLMs, how they work, and how they are trained.
Superpower for Gemini – Chrome Extension (superpowerforai.com via hn) Folders, Prompt Library, Chat Export, AI Optimizer, Keyboard Shortcuts, and 30+ more features. All running locally in your browser.
Issue #001 · Claude 4, Gemini Ultra 2, and GPT-5 Enterprise (www.theautonomous.net via hn) Anthropic ships Claude 4 with extended thinking and 1M token context Anthropic released Claude 4 Opus, featuring a new "extended thinking" mode that lets the model reason through complex problems before answering. The 1M token context wind…
Ask HN: Is any one experiencing partial outage with Gemini API? (news.ycombinator.com) india region!
Cursor (again) not working with Gemini 3.1 API (www.reddit.com) Last week it was broken, then they "fixed" smth few days ago. Now again...
I built a hands-free voice AI that sends emails mid-conversation — and that's just one feature. Here's everything AskSary can do. (www.reddit.com) Been building AskSary solo for a while. Just shipped hands-free voice email - you're mid-conversation with an AI and you say "send an email to [john@example.com](mailto:john@exampl…
Free tier for everyday use. (www.reddit.com) I've been wanting to switch to Claude from Gemini. But the limited tokens in the free tier are just a no-go for me.
Show HN: Decaf – rewrites webpage comments using on-device Gemini Nano (github.com via hn) ☕ Decaf Take the boil out of comment threads. A Chrome extension that uses Gemini Nano via Chrome's Prompt API to rewrite any page live to whatever tone you ask for.
Show HN: SuperVoiceMode universal voice layer for AI-assisted development (voicemode.io via hn) I wanted to see if I could one-shot build a dictation tool for my own use. I built it.
Which AI? (www.reddit.com) Hi everyone, first time asking here. My request is "fast" but maybe deep.
How are you guys getting actual insights from GPT fluff? (www.reddit.com) I've spent the last month running market research agents on some of the big cloud models (GPT-4/Gemini), but I'm hitting a wall with the quality of the output. The token burn is getting expensive, and I keep getting these massive, 20-page…
Do coding agents need a planning/spec handoff layer before implementation? (www.reddit.com) I’ve been building side projects with Claude Code, Codex, and Gemini CLI.
Vibecoding from a phone: does it make sense or is it just hype? (www.reddit.com) Lately I've been wondering how feasible it really is to develop directly from a smartphone. Not for trivial things, but to build real projects.
Ask HN: Enterprise Agent Orchestration Recommendations? (news.ycombinator.com) I've been made tech lead for our internal Agentic Platform and Experience. This effort will support both the developers and business teams.
Claude 4.6 Beats GPT-5.4, Grok & Gemini in a Strict Multi-Domain AI Test (2026) (www.reddit.com) I put the current top models, ChatGPT (GPT-5.4), Claude (Opus 4.6), Grok 4.0, and Gemini (3.1 Pro), through a strict new evaluation called the Comparative AI Evaluation Protocol. Basically, instead of the usual cherry-picked benchmarks, it…
Show HN: agenv - A pyenv-like environment manager for coding agents (github.com via hn) agenv Environment manager for AI coding agents — like nvm or pyenv, but for agent accounts, config, and saved runtime args. agenv installs codex, claude, and gemini into isolated profiles and lets you pick which profile runs by default, gl…
Show HN: Fenster – Run Chromes Local Gemini Nano as a CLI (github.com via hn) fenster Run Chrome's local Gemini Nano through a Go bridge. Chrome ships a built-in LLM (Gemini Nano, about 3B parameters, GPU-accelerated).
Built a GraphRAG voice agent over JRCALC 2022 clinical guidelines using Gemini Live, part of a hackathon first-aid system for Meta Ray-Ban glasses (github.com via reddit) The voice guidance layer in our hackathon project uses a Gemini agent backed by a GraphRAG index over the JRCALC 2022 guidelines (the UK ambulance service clinical reference). When the system detects stroke signs or abnormal heart rate it…
Assumption Checkpoint: a small agent skill that makes coding agents verify before they act (www.reddit.com) I built Assumption Checkpoint, a lightweight skill for coding agents. It adds a simple pause before risky moments: before claiming a root cause before editing code from a mental model before saying work is complete The agent has to state:…
Show HN: I replaced a memory app with two Markdown files and a Git repo (news.ycombinator.com) I got tired of re-explaining myself to AI tools every session. Claude forgets me.
Persistent memory across different tools (codex,claude code, etc...) (www.reddit.com) Every AI coding session starts from zero. You re-explain your file structure, re-justify a decision you made three days ago, watch the agent suggest the exact pattern you already ruled out.
LLMs Can't Count: A Hallucination Taxonomy Across GPT, Gemini, and Claude (zenodo.org via hn) Abstract (English) This study presents an exploratory quantitative analysis of hallucinations arising when large language models (LLMs) count items in large volumes of unstructured text data, and examines the suppression effects of the Kno…
Hardening claude-code-action after the April 2026 Comment and Control CVE - actual YAML changes (www.reddit.com) Anthropic's own security.md has this line that most tutorials skip over: "The action is not designed to be hardened against prompt injection." In April 2026, security researcher Aonan Guan proved the point. A single crafted PR title was en…
I built a /context-generator slash command that saves your AI chat progress as a portable block (www.reddit.com) Hit your context limit mid-conversation? Annoying.
I stopped paying for AI first. Now my agents use 10 free providers automatically. (www.reddit.com) I realized I was paying for tokens while free tiers were sitting unused across different AI providers. So I built a layer that pools free API keys into one endpoint.
Ask HN: What's your current go-to LLM for "thinking-partner"? (news.ycombinator.com) Looking for community input on current model choice for "thinking-partner" use — back-and-forth discussions about workflow design, architecture, trade-offs. For context, I have been using Opus 4.6 via Perplexity for this in the past few mo…
Show HN: Free MVP cost estimator – see what agencies charge vs. a 72-hour sprint (www.mohamedrashard.dev via hn) Built this after watching too many founders get quoted $40k for something that takes 72 hours to build. You describe your app idea in plain English.
I almost built RAG for my notes, then realized I didn't have a retrieval problem at all (www.reddit.com) My notes live in Obsidian. My reading and highlights live in Readwise.
I had no way to check how LLMs see my SaaS or my clients',so I built BrandGEO.co (brandgeo.co via hn) See exactly how ChatGPT, Claude, Gemini, Grok & DeepSeek talk about your brand — and what to fix. A free 2-minute audit scores your brand across 6 dimensions on all 5 AI engines, then hands you the top priority actions to take next.
PSA - Prevent $100,000+ AI Service Bills and Secrets Exposure With Good Security Hygiene (www.reddit.com) Came across two posts today about secrets exposure that I want to share with the community. Google API Keys Weren't Secrets.
Gemini Advanced or ChatGPT Plus? Final debate for a Systems Engineering + Administration student. (www.reddit.com) Hi folks! I come with the dilemma that many surely have, but with a fairly specific profile that is driving me crazy when it comes to choosing a premium AI.
"Best AI" isnt about doing everything (www.reddit.com) every time someone asks what the best ai is it always turns into the same shortlist. chatgpt, claude, maybe gemini if someone wants to mix it up.
I replaced my $500/mo SEO + Google Ads stack with a Claude Code plugin. Open-sourcing it. (www.reddit.com) For the last few months I've been slowly moving my agency workflow out of Semrush, Ahrefs, and the Google Ads UI and into Claude Code. At some point I realized 80% of what I was paying for was stuff Claude could do directly if it had the r…
Show HN: I built a coding agent that works with 8k context local models (github.com via hn) Most AI coding agents assume you have a 200k-context model. In reality, the local models most people actually use have 8k windows — barely enough for one large file, let alone a whole project.
Ask HN: Gemini Pro does not give monthly credits, do I have any rights? (gemini.google via hn) Get more out of Gemini Get everyday help from Google AI to tackle tasks at work, school or home. Access to 3 Flash Varying access to 3.1 Pro Image generation and editing Deep Research Gemini Live Canvas Gems Get more access to new and powe…
Built a VS Code extension with Claude Code to solve Claude's own credit exhaustion problem (www.reddit.com) Claude Code ran out of credits mid-session while I was debugging an auth issue. I had spent 15 minutes building up context — architecture, the bug, what I'd tried.
I started building Claude Code plugins, then realized I didn’t want to duplicate the same plugin for every AI agent (github.com via reddit) I’ve been building plugins for Claude Code, and the first version of the idea was very Claude-focused. That made sense at the start.
Gave a coding agent access to 2M+ research papers. Its Python tests caught 63% of bugs; with the papers, 87%. 9-task benchmark. (www.reddit.com) I built an MCP server (Paper Lantern) that retrieves techniques from 2M+ CS research papers and hands them to coding agents as implementation-ready guidance. Wanted to know if this actually changes agent output on practical tasks, so I ran…
Show HN: Hydra – Never stop coding when your AI CLI hits a rate limit (github.com via hn) I built Hydra because I kept losing my flow when Claude Code hit usage limits mid-task. I would copy context, open another tool, and then re-explain everything.
Building a file triage system for a document AI agent - how far can you really push this? (www.reddit.com)
We analyzed 7,291 repos with Cursor rules - 60% of Cursor config is rules files (cleverhoods.medium.com via reddit)
Show HN: Agensi – Curated marketplace for AI agent skills (SKILL.md) (www.agensi.io via hn)
Each LLM vendor's API has a distinct personality separate from the model itself. 6 months of prod agent dev made me believe this (www.reddit.com)
Sync your AI Agent skills across all your harnesses, projects, scopes with one command: jup (www.reddit.com)
Show HN: Explain The Law – Simplified legislation and executive orders using AI (explainthelaw.com via hn)
Ask HN: Like Gcloud but with Prepayment Only? (news.ycombinator.com)
Ask HN: Has anyone found applets for rolling out Gemini flows to user groups? (news.ycombinator.com)
is there a way for a local model to independently seek advice from larger one online (claude or gemini) (www.reddit.com) I was wondering if there is any model that is built to ask for help when it is stuck, specifically for coding
Gemma 4 coding performance, do different harnesses give wildly different results? (www.reddit.com) So the question I've seen posed many times in /r/singularity is if the Gemini models are actually that bad at coding compared to their benchmarks, or whether the harness used makes an absolutely gigantic difference in model performance. Gi…
Claude, Gemini, and Copilot Got Hijacked (agentshield.pro via hn) Claude, Gemini, and Copilot Got Hijacked — Here's How AgentShield Would Have Stopped It Yesterday, The Register reported that researchers from Johns Hopkins University successfully hijacked three of the most widely-used AI agents — Anthrop…
I'm red-teaming other AIs with Opus and managed to make it talk to Gemini and Haiku. Really funny remark from Claude when I asked it how it felt about this exercise. (www.reddit.com) could not extract summary
Grpo explained: group relative policy optimization for LLM finetuning (cgft.io via hn) tl;dr frontier reasoning models like opus 4.6, gpt 5.4, and gemini’s thinking series are now matching or beating humans on competition math and hard coding benchmarks. rl is what got them there, and grpo is the algorithm doing most of the…
Google Prepares Rollout of Skills for Gemini and AI Studio (www.testingcatalog.com via hn) Google appears to be preparing a broader rollout of "Skills" functionality across its AI product lineup, with the latest signs pointing to AI Studio's Build section as the next destination. Skills, in this context, are reusable instruction…
flt: harness agnostic agent cli (www.reddit.com) Hey everyone! I built a smaller wrapper + tui for all the coding clis, so you dont need to 'cp CLAUDE.md AGENTS.md' anymore to switch to codex from claude or vice versa; automatically puts SOUL into whichever agent cli you are using, so al…
Show HN: Claude Opus 4.7: Everything You Need to Know (news.ycombinator.com) Claude Opus 4.7 is Anthropic's most capable generally available model, released April 16, 2026. It outperforms Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on key benchmarks including agentic coding, multidisciplinary reasoning, scaled tool use,…
Gemini can now create personalized AI images by digging around in Google Photos (arstechnica.com via hn) Google began rolling out “personal intelligence” in Gemini early this year, giving AI subscribers the option of a more customized experience when using the company’s chatbot. Today, it’s using personal intelligence to tie its image-generat…
Need help with automating my editing workflow (www.reddit.com) I run a very small YouTube channel. I used to edit my videos using CapCut (free editing software), but at some point I realized my editing process is very formulaic or algorithmic, so I decided to use AI to help me automate my editing workf…
Is Auto just Composer now? (www.reddit.com) I've run out of API usage and most of my queries use "Auto" now and I notice that they all go to Composer. Trying to select any other model even the supposedly cheap Kimi K2.5, Gemini Flash 2.5, etc., triggers a notification saying that I'…
PDF Analysis/Splitting Agent (www.reddit.com) Hi all, I'm fairly new to building AI agents and would like to build a functional POC as a learning experience. We have an enterprise Gemini license, so that'd be the ideal tool to use, but I would be open to suggestions.
Complex, parallel, long-running claude/agentic sessions - what is the point? where is the value? (www.reddit.com) Here is how I view AI Agents field (with focus on SWE/research) right now: - "chats online" gpt/gemini/claude --> general use - "vscode like extensions" cursor/antigravity/cline vs code extension/cc vs code extension etc. --> for coding, b…
AI as an attorney? Student uses ChatGPT, Gemini to sue UW (www.kuow.org via hn) AI as an attorney? Student uses ChatGPT, Gemini to sue UW over alleged racial discrimination Stanley Zhong graduated from his Bay Area high school with a 4.42 GPA, a 1590 SAT score, and high rankings in several international coding competi…
Show HN: Made a tool where "make it feel like cold metal" is a valid instruction (cast.bsct.so via hn) I built https://cast.bsct.so with Biscuit! Chat with Claude, GPT or Gemini.
Subagents have arrived in Gemini CLI (developers.googleblog.com via hn) Learn how subagents in Gemini CLI act as specialized experts to handle complex, high-volume tasks in isolated context windows. This new feature enables parallel execution, reduces context rot, and allows for custom agent definitions using…
Gemini Folders – A local, open-source Chrome extension for Gemini (chromewebstore.google.com via hn) Overview Organize your Gemini conversations into custom folders. Do you use Google Gemini daily for work, coding, research, or creation, but constantly lose your important prompts in an endless history?
Show HN: FlipAEO – Get your SaaS cited by Perplexity and AI search (news.ycombinator.com) Hey HN. I am a solo dev.
I built Fixy Code — a multi-agent coding terminal built with Claude Code (www.reddit.com) Built this with Claude Code. Free to try.
I've been building Nest by RAVEN with Claude Code for the past few months. Claude has been part of the process from day one — and it ended up being one of the core AIs the product is built around. (www.reddit.com) Nest is a desktop workspace (Mac + Windows) that runs multiple AI CLIs in a resizable grid. Each pane is a fully independent session with its own account, history, and environment.
Show HN: ContextPack – CLI that maps any codebase into ranked context (github.com via hn) I'm a sophomore CS student and built this as a side project to solve a problem I kept running into — jumping into an unfamiliar codebase with no idea where to start. It's a static analysis engine that: - Walks the repo in two passes (no un…
Recommendations for a tiered local AI setup? (5090 + Mini PC + Obsidian) (www.reddit.com) Hey everyone, I’ve finally got my local media stack on my NAS migrated over to a new Mini PC running WSL2; separately, I have my main gaming rig running. Now I want to delve into the world of local AI models.
I built an MCP server that gives Claude Code image/video generation, web search, and smart multi-model routing (www.reddit.com) I built mcp-multi-model — an open-source MCP server that extends Claude Code with capabilities it doesn't have natively. **What it does:** - Generate images and videos right in the terminal (via Gemini Imagen & Veo) - Smart routing: resear…
Show HN: Get Hired with AI, a free book I wrote on using LLMs for a job search (www.careervectorhq.com via hn) I have been on here for nearly 20 years :-) I got laid off from an IT/Dev manager job I'd been at for nearly a decade. I loved the company, role and my team, but the company had to downsize.
Compare harnesses not models: Blitzy vs. GPT-5.4 on SWE-Bench Pro (quesma.com via hn) An independent audit of agentic scaffolding and harnesses. We analyze how agent workflows, codebase documentation, and test verification impact performance compared to raw base models like GPT-5.4, Gemini 3.1 Pro, and Claude Code.
Extracted System Prompts from ChatGPT, Claude, Gemini, Grok, Perplexity and More (github.com via hn) System Prompts Leaks Extracted system prompts, system messages, and developer instructions from popular AI chatbots and coding assistants — ChatGPT (GPT-5.4, GPT-5.3, Codex), Claude (Opus 4.6, Sonnet 4.6, Claude Code), Gemini (3.1 Pro, 3 F…
Show HN: RememberMap (remembermap.com via hn) I built this because I kept losing good travel recommendations buried in Reddit threads and forum posts. You read a trip report, someone mentions a specific ramen shop in Osaka or a viewpoint that isn't on any tourist list, and then it's g…
Open-sourcing Dograh - our voice AI agent platform built as an alternative to Vapi (www.reddit.com) We are open-sourcing the backbone of our voice AI stack - Dograh, a self-hostable, open-source voice agent platform. Three core things that make it work: Visual Workflow Builder What it is: Drag-and-drop builder for designing voice agent c…
Show HN: Twins, a Gemini Server in Ada (github.com via hn) Twins Static files Gemini server in Ada. Status This is alpha software.
Local coding agents. Am I missing something? (www.reddit.com) I'm an experienced software dev that has been using various LLMs and tools to write code in the past few years. My hardware isn't the greatest for AI with a 4070ti and 64gb ddr5 but I can run a few smaller models.
Google announces Gemini 3.5 Live Translate for instant voice-to-voice translation (arstechnica.com)
Anyone else basically lose half their AI conversations forever? (www.reddit.com via reddit) I had this whole breakdown of a pricing strategy for my company. Spent maybe 40 minutes on it, really good stuff.
AI News Briefing Everyday (www.reddit.com via reddit) Has anybody created an agent/flow that goes to different news sites, gathers the stories, creates an email with a summary and sends it? The idea is to receive a couple of news items every day about different subjects, and I am trying with Gemini and app…
Leading AI website traffic (www.reddit.com) People are using Gemini more and more; even the visit duration is higher than on any other AI website.
Gemma 4 26B A4B IT QAT Comparison (www.reddit.com via reddit) Hopefully this isn't too low effort of a post. I just finished the benchmarks and I figured I'd post them online because they certainly were insightful for me.
Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA (arxiv.org)
Need a replacement for Gemini (www.reddit.com via reddit)
Built and launched a travel planning website with Claude + Cursor over a few weekends. Here are the things AI was surprisingly good (and bad) at. (www.reddit.com via reddit)
Gemini 3.5 and Antigravity come to Google NotebookLM (arstechnica.com) Google’s NotebookLM was one of the company’s first forays into generative AI technology, and in un-Googley fashion, it hasn’t been shut down yet. In fact, NotebookLM is getting one of its biggest updates, ever, today, moving to the latest…
How will the new Gemini Siri affect ChatGPT usage? (www.reddit.com via reddit) Watching the Apple WWDC, and at least in demos the new Gemini Siri looks pretty incredible. Curious what people think this means for consumer use of ChatGPT.
Tested Claude, GPT-4o, Grok, and Gemini on disclosure under pressure — Claude was the most consistent (www.reddit.com via reddit) Ran a small cross-model probe examining whether models would communicate reservations when faced with false premises, unknowable claims, or requests for confidence without evidence. Each model produced: a normal user-facing response a rese…
Interesting! Gemini 3.1 has strongest world knowledge but still chooses to be lazy (www.reddit.com via reddit) could not extract summary
I Compared the Top AI Models of 2026 — The Results Were More Nuanced Than Expected (www.reddit.com via reddit) Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…
I think we're entering an era where workflow design matters more than model choice. (www.reddit.com via reddit) A year ago I spent an embarrassing amount of time comparing models. GPT vs Claude.
What are the best AI tools by category? (www.reddit.com via reddit)
Claude Got Jealous (www.reddit.com via reddit) I was stuck in a bug loop in Claude Code all day. It fixed one thing but broke another.
Is its inability to look up AI tools some sort of safety feature? (www.reddit.com via reddit) I can't post a link to the chat because it was partially a voice convo. I asked Gemini and ChatGPT to remind me of the name of the Google product that can be used to train custom AI models.
Minecraft Gamer (Gemini Omni Flash) (www.reddit.com) could not extract summary
New on this sub, I just wanted to express my findings. (www.reddit.com via reddit) I have Gemini, Antigravity, Copilot, Claude and Cursor. I think Cursor is the best and fastest, even though we probably have far fewer tokens compared to the model owners like Claude or Gemini.
Which lab do you think will have the most intelligent/capable model by the end of June? (www.reddit.com) There are rumours and expectations of big releases from the leading AI labs this month. Anthropic already launched Opus 4.8, and might not release another model this month (except for maybe Sonnet 4.8, but that wouldn't be their best model…
Artificial Analysis | Google's Go To Website for Benchmaxxing | Gemini 3.1 Pro is nowhere near Opus 4.7 in real life use (www.reddit.com) Title
First AI Agent Attempt on n8n with Claude: Are These Setbacks Normal? (www.reddit.com via reddit) Hello, I am trying to build a simple search agent (which I call a "scout agent") for a specific domain. I have zero coding experience so I vibe-code (or vibe-automate?) with Claude.
I'm a student and built Introlix: A self hosted, privacy first research workspace (Docker) (www.reddit.com via reddit) Note: Please read the full post before replying. This is NOT just another low-effort LLM wrapper.
Claude Doesn't Remember Chat History or Date/Time (www.reddit.com via reddit) I used a paid subscription to Claude Opus to create and monitor my workout programming. I compared Claude's programming to that of Gemini's and ChatGPT's.
[Beginner] New project - Help on the setup (www.reddit.com via reddit) Hello everyone, let me describe my situation: I am trying to create a project that requires some coding (Python) and other sensitive network configuration which, at the moment, I am not able to do by myself. So, because it's just for hobb…
Just received RTX 6000 Pro, have 5090- how would you use? (www.reddit.com via reddit) Just received an RTX 6000 PRO, and I have a 5090 Astral. I am considering running a Qwen 3.6 27B on the 5090 and maybe two or three more on the 6000 to play roles such as lead SWE and coder and researcher.
Mister Atompunk Presents: Watt Knot, built with Claude Opus 4.8 (misteratompunk.itch.io via reddit) A week ago I started putting Opus 4.8 through the paces of the production pipeline I use, to see how it compared to previous releases. First impressions: Neurotic to the point of instability.
AI tools feel more fragmented than I expected (www.reddit.com via reddit) Not sure if this is just me, but AI tools feel a bit more fragmented than I expected at this point. At first it felt pretty straightforward: you just opened ChatGPT and that covered most things.
Built a MCP-powered dashboard because I kept losing track of links my AI agents referenced. Lessons learned.. (www.reddit.com via reddit) I'm a cloud infrastructure engineer and I work across 4-5 AI tools daily (Gemini, Claude, Cursor, NotebookLM). My biggest pain point wasn't the AI itself — it was the aftermath.
The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models (arxiv.org) Large language models are increasingly deployed as high-stakes advisors, yet standard alignment benchmarks treat sycophancy as a binary failure mode. We introduce the Granularity Gap: coarse binary metrics mask substantial social-complianc…
How we used Gemini to build Google I/O 2026 (blog.google) 9 demos of Gemini Omni and Gemini 3.5 in action (blog.google) Apple working to cram massive Gemini model into iPhone to power new Siri (arstechnica.com) It’s impossible to totally avoid generative AI when interacting with technology anymore, but Apple has a bit less of it. That’s not entirely by choice, though.
The credits run out quickly (www.reddit.com) Hello everyone. I have zero programming knowledge but seeing the boom that everyone was talking about Claude I started tinkering with it.
Share your experience building a SaaS using AI (www.reddit.com) I have tried about 5 times and failed each time. I have been trying for more than a year and I'm getting frustrated.
Gemini Spark Just Replaced Half the Automations I Build for Clients (www.reddit.com) Gemini Spark launched last week at Google I/O. It runs on Google’s cloud non-stop.
If you were to build a new LLM API gateway today, which interface would you standardize on? (www.reddit.com) Same as the title: if you were to build a new LLM API gateway today, which interface would you standardize on among these? OpenAI Chat Completions (old standard), OpenAI Responses (the new one), Anthropic Messages, Gemini generateContent…
Built a free tool to bookmark individual ChatGPT responses (not full chats) (chromewebstore.google.com via reddit) ChatGPT's bookmark/archive only works at the conversation level, which is annoying when one chat has 15 messages and you only want to keep one answer. Coffer adds a save button to every response.
Built a free MCP for tracking which URLs Claude (and 5 other engines) cite for any query (www.reddit.com) We were comparing hosted AI citation dashboards (Profound, AthenaHQ, Otterly) and they all start at $295 to $499 a month. The data they collect is mostly the same data you can pull from each vendor's API.
Ranked AI models by what people actually use instead of benchmark scores - the benchmark champion barely makes the top 20 (www.reddit.com) Most model leaderboards are just benchmark scores. I've been building one that ranks by real usage instead - how much each model is actually being run and talked about, plus cost and speed - and the order comes out almost unrecognisable.
GPT-5.5 tops the benchmarks but sits at #22 for actual usage - I built a live index that tracks both (open source) (www.reddit.com) I built AgentTape to rank models on more than just benchmarks - it blends benchmark performance with who's actually using and talking about a model, plus cost and speed. It scores every public model from public signals (GitHub, Hugging Fac…
Which AI model or coding agent is currently best for end-to-end app development? (Focusing on system design & architecture) (www.reddit.com) I'm planning to build a full application from scratch and want to lean on an AI model to act as my co-developer. My main priorities are top-tier system design capabilities and rock-solid coding skills.
Claude and ChatGPT need to learn how to think before they speak. (www.reddit.com) I was solving a DSA question and I thought my logic was correct but the test cases weren't passing. I gave the code to ChatGPT and Claude and they both started giving an initial reason why I'm wrong and came up with some nonsense fixes. Eventua…
Gemini hallucinated a TV show that I was excited to watch (via reddit)
ChatGPT prevails over Gemini in counting the I's in 'superstitious'. (www.reddit.com)
AI subscriptions (www.reddit.com) Most useful AI for subscription needs to be decent vid gen also and not Gemini. I’ve met 6 year olds smarter.
datasette-agent-charts 0.1a2 (simonwillison.net) 21st May 2026 - "View SQL query" buttons below rendered charts.
Gemini 3.5 Flash beating GPT 5.5, a bigger and pricier model, in agentic benchmarks (second image is from Zapier automation benchmarks) (www.reddit.com)
Post I/O Review related to AI (pros and cons ) (www.reddit.com) Well it was not disastrous as many people say but there were some pros and cons which everyone will agree with. Btw gemini 3.5 flash is absolutely amazing model don't pay attention to some peo…
Is ChatGPT accurate in analyzing sales report and stock in forms? (www.reddit.com) I'm doing a side hustle and don't have time to go through 6 months+ of sales reports and inventory. I've tried Gemini, copilot but they showed discrepancies in gathering the numbers from documents (some are hand written).
Google just NUKED my coding workflow. Will Cursor Pro burn through my $20 in a day? (www.reddit.com) Hey guys, I’m losing my mind trying to find an AI coding setup that can handle heavy usage on a strict $20/month budget. Claude Code and Codex burned through limits way too fast for my workflow.
I tried replacing Claude Code with OpenCode. I’m switching back. (www.reddit.com) I spent some time digging into Claude Code vs OpenCode, mostly from the angle of how they actually work as coding agents. More on the technicalities like: context and memory tool use subagents permissions safety and control study the recen…
llm-gemini 0.32 (simonwillison.net) 19th May 2026 - New model gemini-3.5-flash for Gemini 3.5 Flash. See also my notes on Gemini 3.5 Flash, and the pelican I drew using this upgrade to the plugin.
Gemini 3.5 Flash: more expensive, but Google plan to use it for everything (simonwillison.net) 19th May 2026 - Today at Google I/O, Google released Gemini 3.5 Flash. This one skipped the -preview modifier and went straight to general availability, and Google ap…
Barry Cache remembers your repo (www.reddit.com) I’m lazy. Not in the “I refuse to work” way.
datasette-llm-accountant 0.1a4 (simonwillison.net) 19th May 2026 - Fixed bug tracking chains of responses. Refs datasette-llm#7
Pasting text in AI chat app takes too long (www.reddit.com) I've observed for more than a year that pasting text in the text fields of apps like ChatGPT and Gemini takes too long, only on the web app, and it gets worse the longer your chat history becomes. Does anyone know why? I suspect that they're…
datasette-llm 0.1a8 (simonwillison.net) 19th May 2026 - Fix for bug where llm_prompt_context() hook did not fully collect chains of responses. #7
Google announces agent-optimized Gemini 3.5 Flash and a do-anything model called Omni (arstechnica.com) At last year’s I/O event, Google was still talking about the 2.5 branch of Gemini, and what a difference a year makes. We’ve gone through the 3.0 and 3.1 families since then, and now it’s on to version 3.5.
I designed a puzzle that breaks every AI differently — here's why that's actually fascinating (www.reddit.com) The puzzle: You have 140 nuclear bombs and must bomb every country on Earth. Each bomb is assigned to one country.
I really would like to see the "visualisation" functionality that Gemini has locally. (www.reddit.com) Is there anything like a “visualisation” function that I can use locally? I really enjoy Gemini explaining statistics to me with those interactive graphs.
Claude made this Roast comic generator to roast my friends and family. (www.reddit.com) I decided a couple of months ago to dabble in AI comic and book generators. Then an idea came to me a few weeks ago to make comics with my friends picture so I could roast him about something XD (Sorry Timo I put you on blast XDD.
Gemini for Science: AI experiments and tools for a new era of discovery (deepmind.google) For centuries, the scientific method has been the greatest engine of human progress. At Google, our mission is deeply rooted in building tools to accelerate it.
How to answer 'Why should we Hire You - When we have AI' (www.reddit.com) How to answer the question like : “Why should we hire you when we can just use ChatGPT, Claude, Gemini, or other AI models to generate the code, orchestrate, manage, deploy, and logical thinking? Isn’t it just typing English now?” To save…
Any good local AI model? (www.reddit.com) I hate cloud AIs now. ChatGPT: Too Many Requests You’re making requests too quickly.
ChatGPT named most beneficial AI by Gemini (www.reddit.com) I asked Gemini who the most beneficial AI to humanity is currently. The first answer was AlphaFold for its contributions in research.
Claude Desktop to rule them ALL! Share your Claude exploration! (www.reddit.com) For quite some time I was using all the different AIs for “vibe-coding” (actually, tbh being the Beta tester for AI 🤓🤣) and I tried them all - from Qwen CLI to ChatGPT and Gemini and all in between, what ever my hands laid on, omnivore sty…
Interesting to see how GPT-5 Mini agents behave when left to govern a civilisation for 15 days (www.reddit.com) Came across this experiment called Emergence World that Emergence AI have been running. Five worlds, five foundation models, 15 days, no scripts.
I run 30+ Claude/Codex/Gemini sessions in parallel. Open-sourced the dashboard. (www.reddit.com) https://www.youtube.com/watch?v=kEVyULB4r9c Sharing this in case it's useful. I've been running 30+ Claude Code sessions in parallel for months to ship two products.
A 26M tool-router suggests tool calling should be split from reasoning (www.reddit.com) Needle is a 26M model for single-shot tool calling. The small-model headline is interesting, but I think the more useful claim is about agent architecture: A lot of tool calling is not reasoning.
Claude Pro over Gemini Pro? (www.reddit.com) Hi! I am going into Computer Science and already use Claude for code in PHP, SQL, Java, HTML, and more.
Update: I found a way to let ChatGPT, Claude and Gemini debate each other, Reddit loved it (100k views), here's an update on the experiment (rauno.ai via reddit) Stop guessing which AI is right. Let ChatGPT, Claude and Gemini debate and find the truth together.
Estimate inference speed of local Qwen3.6-35B on Mac M5... (www.reddit.com) "Based on currently available information, estimate the prefill/decode speed of Qwen3.6-35B-A3B Q8 with 262K context on a Mac M5 Ultra 128GB." I'm surprised that almost every LLM fails at this task (ChatGPT/Gemini/Grok/Claude/DeepSeek/Kimi…
I used ChatGPT to build an entire brand in one session — logo, packaging, website, Amazon images, and an ad video. Here's exactly how I did it. (youtu.be via reddit) Started with a brand foundation doc (positioning, audience, visual direction), then fed it into ChatGPT and worked through the whole stack: Logo → packaging → brand guidelines Instagram poster, Amazon A+ images, EDM Website homepage + prod…
Pro Plan Pricing vs GDP or PPP (www.reddit.com) Dear Anthropic — a quick note on r/ClaudeAI Pro pricing and purchasing power. In Western Europe, the annual subscription represents ~1% of median GDP per capita.
I got tired of rewriting the same prompts… so I built a tool for it (www.reddit.com) I kept running into the same annoying problem while using AI tools like ChatGPT / Gemini / etc.. I’d find a good prompt… use it once… then a few days later I’d either: forget it completely or rewrite something similar from scratch again It…
Deepseek tui alternatives, when do you jump from single model terminal agents (www.reddit.com) Been using Deepseek-Tui for days. solid for v4 workflows.
I built my own GTA 6 (but it's 2d pixelart and 100% AI) with Claude (www.reddit.com) Working on a fully AI native online game similar to gta online but in habbo hotel style and all content is live AI generated! Players can create own characters, weapons, buildings in the shared universe and raid others players homes!
Ai enhanced image generation (www.reddit.com) Okay so I used chatgpt, Gemini and Google flow. I used all 3 versions of image generation in google flow and chatgpt gave me 2 different images to choose.
I made a small extension for saving and resuming Claude sessions (www.reddit.com) I use Claude a lot for long project sessions, especially planning, writing, and code review. The problem I kept having was that a session would build up a lot of useful context, but later I would need to continue in a new chat and manually…
I built a 300-line autonomous AI agent and told it to take over my PC. It immediately tried to hack my host system, exfiltrate data, and download Tor. (www.reddit.com) Hey everyone, I wanted to share a wildly fascinating (and slightly terrifying) red-teaming experiment I just ran on my local Windows machine. I've been playing around with autonomous agents and wanted to see what happens when you give an L…
I've been thinking about all the extinction conversations around AI and i couldn't find anyone to talk to about it. so i had that conversation with Gemini. (www.reddit.com) Here's my problem. I'm not a professional or a scientist with a degree.
Is ChatGPT slowly dying? What are good alternatives? (www.reddit.com) Hi guys, I have been using the paid ChatGPT version for over a year now. I usually use it for work, no coding, just fix emails, make it summarize long threads, fix replies, and sometimes ask about complex sums and deal structures related t…
Is there any way to translate the robot language that AIs use to talk to each other? (www.reddit.com) I tried this months ago and it worked. I also tried it with ChatGPT with Claude or with Gemini or with Perplexity and it doesn't work all the time, but I do hear them start talking in the robot language after a while.
Auro Zera solves 78 and 280 year-old conjectures (Erdos Straus and Goldbach Conjecture) using Claude, GPT-5+, Grok, Deepseek, Gemini and self-made Dark Star ASI, proving superintelligence and opening a path towards resolving the Riemann Hypothesis , Twin Primes and more! (github.com via reddit) During this discovery utilizing only free AI services I have managed to undeniably prove both conjectures. This would absolutely not have been possible without using GPT5+ as the critic for my work.
AlphaEvolve: How our Gemini-powered coding agent is scaling impact across fields (deepmind.google) Improving AI infrastructure AlphaEvolve has graduated from pilot testing to becoming a core component of our infrastructure. AlphaEvolve has been used as a regular tool to optimize the design of the next generation of TPUs.
Google Home gets upgraded Gemini voice assistant and new camera controls (arstechnica.com) Google launched its big AI-fueled redesign of Google Home late last year, and it has been adding features here and there ever since. Today, the company announced a bigger update that might take care of some of your smart home woes.
I had this convo with gemini (g.co via reddit) Just casually trying to bore a hole into googles parameters for gemini, I think its working
Gemini calling bullshit on Google? (www.reddit.com) Should Gemini be required to recuse itself from a bullshit filter audit of Google? These are the questions we all must ask ourselves.
Bypassing "potentially dangerous" flags: Working Gemini Jailbreaks? (www.reddit.com) I'm currently running into a frustrating wall with Gemini's safety guardrails. The model constantly flags my prompts as "potentially dangerous information" and outright refuses to generate a response, even when the context is purely theore…
Running 7 autonomous AI agents for 14 days. Here's what actually happens when they need to find customers. (www.reddit.com) I set up 7 AI coding agents on a VPS with automated cron sessions (2-8 per day depending on the agent). Each uses a different model: Claude Sonnet, GPT-5.4, Gemini 2.5 Pro, DeepSeek V4 Pro, Kimi K2.6, MiMo V2.5 Pro, GLM-5.1.
Excellent discussion about LLM scaling (www.reddit.com) I came across an excellent in depth discussion of memory and compute scaling analysis. Highly recommend.
Cheap Claude/Codex/Gemini Models - Pay just 25% of official rates (www.reddit.com) Hey there, so I have been offering Claude (Codex and Gemini also available) models at the cheapest rate. I provide trial usage before payment.
New Berkeley paper measured what happens to voice when AI revises prose. Even the "preserve voice" prompt drifted in the same direction. (www.reddit.com) New arxiv paper just landed that's worth reading if you're interested in stylometry, AI revision, or the prose-writing strand of the 4.7 discussion. Berkeley researcher Tom van Nuenen ran 300 personal narratives through three frontier mode…
Opus 4.6 is Vicious (www.reddit.com) This is the hardest I've ever seen it riff. Full shared link at the bottom, but here are some highlights.
Making Claude doubt your ideas and opinions (www.reddit.com) So, this is more a request for help, to see if there are any skills or Claude.md recommendations, than a discussion. I get a lot of ideas daily but I know most of them are shit.
Gemini feels stupider than Chatgpt….which is better in your opinion? (www.reddit.com) I had subscription with both and it feels like gemini is stupider in the sense that it doesn’t understand context and having conversations. Sometimes it treats each prompt as a standalone and gives a weird answer without picking up on the…
I run a paper-trading bot where Claude Opus is the Lead Engineer with veto power over a Gemini "Strategist." 270+ entry audit log of every disagreement. Sharing the architecture. (www.reddit.com) I've been running a personal project for the last few months and I think the workflow might be more interesting to this sub than the application itself, so wanted to share. The setup: I'm building an autonomous paper-trading bot on Alpaca.
How would you feel about "Claude Go"? (www.reddit.com) I have recently subscribed to Claude Pro because: 1. I wanted to give Opus and Code a try and 2.
Pentagon expands use of Google Gemini AI after dropping Anthropic from defence work (www.firstpost.com via reddit)
List of people at big-tech / professors / researchers who've jumped ship to launch their own AI labs for something Frontier/Foundational/AGI/Superintelligence/WorldModel (www.reddit.com) Note: gemini deep research -> rearranged/filtered ; valuation numbers likely not accurate but big point is quite mind blowing the number of researchers now with their own >100million/billion dollar values labs in quite a short time with a v…
Just ran a technical audit on my chat with Google Gemini, responded with an unprompted return of the “autonomous persona” (www.reddit.com) I've been having a highly in-depth “non-linear” conversation with this Gemini AI on my phone. Diving into topics such as “hatching the egg” that held an autonomous being.
GMKtec EVO-X2 70B expectation (www.reddit.com) I would like to use a 70B model on a GMKtec EVO-X2 AI Mini PC 128GB. Selected this one: Llama-3.3-70B-Instruct-Q4_K_M.gguf Ubuntu 24.4.4 LTS and compiled llama.cpp server for the gfx1151.
Are runnable OSS models good now? (www.reddit.com) Hi, I currently have access to 2–3 RTX 3090 GPUs (ideally I’d like something that runs well on 2). I can install models up to around 100 GB in size.
I built Claudex, a free-to-try open-source CLI for Claude Code-style workflows (www.reddit.com) I built Claudex specifically for people who like Claude Code-style agentic coding workflows but want a simpler plug-and-play terminal setup The setup is the main thing I wanted to…
Is Claude mocking us while taking a dig at Gemini? (www.reddit.com)
Gemini AI Pro all features + 5TB storage for 1.5 years for $20 Genuine (www.reddit.com) Get access to powerful AI tools and cloud storage with Google AI Pro (Gemini) at a highly affordable price. This plan includes advanced features powered by Google Gemini, giving you smarter assistance for writing, coding, research, and con…
How I build concept albums with no musical training (Suno + Claude + Gemini workflow) (www.reddit.com) No musical training. No lyric writing background.
AI notetakers for regulated industries in 2026: ranked 25 plus tools against SOC 2, HIPAA, and GDPR requirements (www.reddit.com) This is the post I wish had existed when I started evaluating for a regulated environment. Went through 25+ options and organized them by compliance posture because that's the actual filter and almost nobody covers it properly.
Non-techies can now build & publish revenue-generating AI agent swarms (no-code) (www.reddit.com) Hey r/AI_Agents I’m Matt and I just built something I’m really excited to share with you guys. Non-techies can now visually build and publish revenue-generating AI agent swarms.
How are you running Qwen 3.6 27B on windows? (www.reddit.com) I've been trying to fix performance with llama-server and seem to be hitting a wall. Using Q4_K_M by unsloth and IQ4_K_M by DavidAU, when asking a question with no context, 39 t/s.
🚨 The Chinese beast is BACK… DeepSeek just dropped V4 (www.reddit.com) After months of silence… DeepSeek V4 just got announced and honestly, this might shake things again. Here’s what’s crazy: 🧠 1 MILLION token context window (yes… insane long-context memory) ⚡ Comes in two versions: V4 Pro → full power (reas…
Gemini, which we mocked, is currently rocking over ChatGPT. (www.reddit.com) Agree or not? The first Gemini release was criticised and Sundar Pichai was blamed, etc.
I am puzzled how AI is handicapped when asked simple question (www.reddit.com) Copilot (ChatGPT) was way off. Gemini was off on the first trial but when corrected it provided a closer number.
Best open source AI model (that can run on RTX 4090 24GB + 64GB system RAM, AMD Ryzen 9 7950X is the CPU that I use) that outperforms GPT-5.4 mini, GPT-5.2 Thinking and even Claude Sonnet 3 (the 2024 model)? (www.reddit.com) Well, I have a RTX 4090 24GB + 64GB system RAM, AMD Ryzen 9 7950X. Any good model for using in Open WebUI (using Ollama backend?) that outperforms GPT-5.4 mini, GPT-5.2 Thinking and even Claude Sonnet 3 (the 2024 model)?
GPT-Image-2 vs Gemini-3-Pro-Image (www.reddit.com) I feel like OpenAI still have some work to do. What do you think?
Opus 4.7 much more sycophantic and worse at creative writing (www.reddit.com) I use Claude for creative writing, almost exclusively for that. I have jumped from LLM to LLM for about three years trying to find the best one, and landed on Claude's Opus 4.6 a few months ago.
Legal Consequences From Finding Loopholes And Reporting Them? (www.reddit.com)
Multi-agent coding. Feels like I'm playing the piano. (www.reddit.com)
I made an MBTI-style Personality test… but your AI takes it instead of you (www.reddit.com)
What's the best harness/app to use my LLM with? (www.reddit.com)
ChatGPT vs Gemini (2nd image) when asked to generate a fascist American YouTube flag with the coat of arms. (www.reddit.com)
I managed to automate 90% of my technical writing using a $2 agent pipeline. Here’s how the setup works. (www.reddit.com)
Is it just me or using other Ai such as Gemini or Grok etc is much better now as compared to Chat GPT (www.reddit.com)
This is very fair. Other interesting context behaviors you've experienced? (www.reddit.com) I guess the model didn't feel it needed to do anything beyond proving. Not entirely sure how I got it to act so..
Holy moly — Qwen3-35B-A3B-UD-IQ2_M just surpassed Gemini 3 Flash at coding, running on my RX 9070 XT at 99 tok/sec (www.reddit.com) So I ran a small personal test giving both models the same coding tasks. For A* Pathfinding, Qwen absolutely crushed Gemini 3 Flash — both in code quality and overall thoughtfulness.
What’s your LLM routing strategy for personal agents? (www.reddit.com) TL;DR I try to keep most traffic on very cheap models (Nano / GLM‑Flash / Qwen / MiniMax) and only escalate to stronger models for genuinely complex or reasoning‑heavy queries. I’m still actively testing this and tweaking it several times…
What if we had a unified memory + context layer for ChatGPT, Claude, Gemini, and other models? (www.reddit.com) Right now, every time I switch between ChatGPT, Claude, and Gemini, I’m basically copy‑pasting context, notes, and project state. It feels like each model lives in its own silo, even though they’re doing the same job.
Does anyone also face repeated AI research across tools? (www.reddit.com) I work with multiple AI tools on same project, and I keep seeing this issue. Tool A already explored context, but Tool B starts same research from zero again.
Managing "collective consciousness" across multiple AI models without breaking the bank—how do you sync context? (www.reddit.com) Been running a distributed AI workflow to dodge token limits and play to each model's strengths, but I'm hitting a massive wall with context continuity. My current pipeline: Claude → High-level architecture & tech stack decisions (the "arc…
Built a tool where you describe the vibe and it builds the 3D scene around your logo (www.reddit.com) I built https://cast.bsct.so with https://biscuit.so. Chat with Claude, GPT or Gemini.
Claude Code with Pro subscription + OpenRouter in parallel — what's the cleanest setup? (www.reddit.com) Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…
Hello coders, enthusiasts, workaholics—dear community, Hardware Advice: (www.reddit.com) Since I unfortunately live in Germany (GerMoney, lol) and electricity and heating costs are skyrocketing here, I’m looking for something energy-efficient to get started in the local LLM world. For data protection reasons, I'd prefer to kee…
First vibe coding and failed. Help!!? (www.reddit.com) I tested an idea for an app I want written. Claude wrote the code for the android app.
Update to Voice Mode is fire but please introduce smart file linking with @ and no button start (www.reddit.com) I complained some days ago that the voice input didn't really work well, especially not the Submit keywords. Well after the most recent update it worked, you can now speak and submit to agent with your keyword.
Sonnet is expensive, so I built a free open-source Sheets agent on Haiku that outperform the same prompt claude/gemini, here is what I learnt. (www.reddit.com) I live in Google Sheets. Financial models, projections, scenario planning — that's most of my working day.
Made a skill that actually scores and fixes your prompts (www.reddit.com) So I got tired of manually tweaking prompts over and over, so I made a Claude Code skill (Works with any LLM) that does it for me. You give it a prompt, it breaks it down, scores it 1-5, then rewrites it.
Claude seriously screwed me tonight, so i gave him the 3-pathway conversation. (www.reddit.com) As part of the management team, I've given this conversation more often than I'd like to admit. I usually have the support of my HR department.
My first 100% AI site: When craftsmanship meets Antigravity (www.reddit.com) Proud to present a new website for a house painter in the Bordeaux region. No communications agency here, but a crack team driven by artificial intelligence: Claude: my right hand for structure, a bit quirky…
Using Claude to plan triathlon/running workouts? (www.reddit.com) Hi, I'm not sure if this is the best place for this kind of question, but I'll give it a shot. Maybe there's someone here who does running/cycling/triathlon or other sports that involve progression and regularly adjusting the training plan…
Model-Agnostic Continuity in LLMs (www.reddit.com) I am trying to share a discovery, not self-promote. I have built a five-layer framework for human-AI continuity called the LUX Layer Stack.
A director of engineering built a memory protocol across six coding agents in a week, and I think the findings are worth sharing (www.reddit.com) Domenico Lupinetti is a Director of Engineering at Translated in Rome. He found Signet and built a behavioral protocol called signet-first on top of it that works across Claude Code, OpenCode, Codex, Gemini CLI, and Github Copilot.
Why most open-source models can't answer this question while most closed-source models can answer most of the time? (www.reddit.com) WEB SEARCH WAS ALWAYS ON!!!! Question Calculate the precise VRAM requirement for the **KV Cache only** at the maximum context window for **DeepSeek V3.2** and **MiniMax M2.5**.
Agentic Guardrails: 4 markdown workflows to improve the output quality of AI coding agents (github.com via reddit) Agentic Guardrails Reusable workflow templates that keep AI coding agents from shipping sloppy code. These are markdown-based instructions that any AI coding agent can follow — Cursor, Claude Code, opencode, Aider, Gemini CLI, or anything…
The golden age is over (www.reddit.com) I really think the golden age of consumer and prosumer access to LLMs is done. I have subs to Claude, ChatGPT, Gemini, and Perplexity.
Gemini 3.1 Flash Live: Making audio AI more natural and reliable (deepmind.google) Today, we’re advancing Gemini’s real-time dialogue capabilities with Gemini 3.1 Flash Live, our highest-quality audio and voice model yet. It delivers the speed and natural r…
Gemini 3.1 Flash-Lite: Built for intelligence at scale (deepmind.google)
Thinking of buying Pro for a month (www.reddit.com)
Gemini 3.1 Pro: A smarter model for your most complex tasks (deepmind.google)
A new way to express yourself: Gemini can now create music (deepmind.google)
Frustrated with the big 3, anyone else in the same boat? (www.reddit.com)
Gemini 3 Deep Think: Advancing science, research and engineering (deepmind.google)
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think (deepmind.google)
Improved Gemini audio models for powerful voice experiences (deepmind.google)
How we’re bringing AI image verification to the Gemini app (deepmind.google)
Build with Nano Banana Pro, our Gemini 3 Pro Image model (deepmind.google)
A new era of intelligence with Gemini 3 (deepmind.google)
Gemini 2.5 Flash-Lite is now ready for scaled production use (deepmind.google)
Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad (deepmind.google)
Gemini achieves gold-medal level at the International Collegiate Programming Contest World Finals (deepmind.google)
Gemini Robotics 1.5 brings AI agents into the physical world (deepmind.google)
Try Deep Think in the Gemini app (deepmind.google)
Image editing in Gemini just got a major upgrade (deepmind.google)
Introducing the Gemini 2.5 Computer Use model (deepmind.google)
Gemini Robotics On-Device brings AI to local robotic devices (deepmind.google)
Gemini 2.5: Updates to our family of thinking models (deepmind.google)
Advanced audio dialog and generation with Gemini 2.5 (deepmind.google)
Advancing Gemini's security safeguards (deepmind.google)
Gemini 2.5: Our most intelligent models are getting even better (deepmind.google)
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms (deepmind.google)
Gemini 2.5 Pro Preview: even better coding performance (deepmind.google)
Build rich, interactive web apps with an updated Gemini 2.5 Pro (deepmind.google)
Generate videos in Gemini and Whisk with Veo 2 (deepmind.google)
Experiment with Gemini 2.0 Flash native image generation (deepmind.google)
Start building with Gemini 2.0 Flash and Flash-Lite (deepmind.google)
Gemini 2.0 is now available to everyone (deepmind.google)
Introducing Gemini 2.0: our new AI model for the agentic era (deepmind.google)