gpt-5.4-nano ist SO much better than gemini-2.5-flash-lite! (www.reddit.com)
model roundup
Gemini 2.5
-
I've been playing around with GPT-5.4 nano in a real workflow and honestly... I'm kinda impressed.
-
What’s your LLM routing strategy for personal agents? (www.reddit.com)
TL;DR I try to keep most traffic on very cheap models (Nano / GLM‑Flash / Qwen / MiniMax) and only escalate to stronger models for genuinely complex or reasoning‑heavy queries. I’m still actively testing this and tweaking it several times…
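A minimal sketch of the cheap-first routing idea described in the snippet above, assuming a crude keyword-and-length heuristic decides when to escalate. The tier names, the hint list, and the threshold are hypothetical placeholders, not the poster's actual setup.

```python
# Hypothetical tier labels; the post names cheap models such as Nano,
# GLM-Flash, Qwen and MiniMax but does not say how escalation is decided.
CHEAP_MODEL = "cheap-model"
STRONG_MODEL = "strong-model"

# Assumed keywords that suggest a reasoning-heavy query (illustrative only).
REASONING_HINTS = ("prove", "step by step", "debug", "analyze", "trade-off")


def complexity_score(query: str) -> int:
    """Crude heuristic: long queries and reasoning keywords raise the score."""
    text = query.lower()
    score = 0
    if len(text.split()) > 40:
        score += 1
    score += sum(1 for hint in REASONING_HINTS if hint in text)
    return score


def route(query: str, threshold: int = 2) -> str:
    """Return the strong tier only when the score reaches the threshold."""
    if complexity_score(query) >= threshold:
        return STRONG_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    print(route("What time is it in Berlin?"))
    print(route("Analyze this stack trace and debug it step by step"))
```

In practice the scoring function is the part worth tuning; a small classifier or the cheap model's own confidence could replace the keyword list without changing `route`.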
-
I built an open-source research agent. You ask a question, it searches the web via Tavily, synthesizes an answer with an LLM, and shows the sources it used.
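The search, synthesize, cite flow described above can be sketched roughly as follows. This is an assumption-laden outline, not the project's code: `search` and `synthesize` are hypothetical callables standing in for the Tavily search call and the LLM call, and the result shape (`url`, `snippet`) is invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical result shape: each search hit is a dict with "url" and "snippet".
SearchFn = Callable[[str], List[Dict[str, str]]]
SynthesizeFn = Callable[[str, str], str]


def answer_with_sources(
    question: str,
    search: SearchFn,
    synthesize: SynthesizeFn,
    max_sources: int = 3,
) -> Dict[str, object]:
    """Search, build a context from the top hits, and return answer plus sources."""
    hits = search(question)[:max_sources]
    context = "\n".join(hit["snippet"] for hit in hits)
    answer = synthesize(question, context)
    return {"answer": answer, "sources": [hit["url"] for hit in hits]}


# Stub backends so the sketch runs without network access or API keys.
def fake_search(query: str) -> List[Dict[str, str]]:
    return [
        {"url": "https://example.com/a", "snippet": "Snippet A"},
        {"url": "https://example.com/b", "snippet": "Snippet B"},
    ]


def fake_synthesize(question: str, context: str) -> str:
    return f"Answer to {question!r} based on {len(context.splitlines())} snippets"


if __name__ == "__main__":
    print(answer_with_sources("What is RAG?", fake_search, fake_synthesize))
```

Passing the backends in as callables keeps the pipeline testable; a real search client and model client would be wrapped to match these two signatures.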
-
We benchmark how internal reasoning traces, which we call thought streams, affect video scene understanding in vision-language models. Using four configurations of Google's Gemini 2.5 Flash and Flash Lite across scenes extracted from 100 h…
-
Audio Flamingo Next (AF-Next), three variants: AF-Next-Instruct (audio Q&A); AF-Next-Think (multi-step reasoning with temporal CoT); AF-Next-Captioner (audio description generation). Architecture: → AF-Whisper audio encoder → Qwen-2.5-7B LLM ba…
-
Show HN: Zero-identity messaging app with physics-based post-quantum encryption (news.ycombinator.com)
Show HN: Zero-identity messaging app with physics-based post-quantum encryption (Layer 2 from my own paper) Hey HN, I'm building a privacy-first messaging app in Flutter/Dart, developed with AI assistance (Gemini 2.5 Pro + Claude Opus 4.6)…