DOA model by Cohere Labs (www.reddit.com via reddit)
model roundup
DeepSeek 4
-
Apparently the model is beaten by Qwen 3.6 on every benchmark reported by Cohere Labs. You get lower RAM usage (considering model offload) and slightly better performance in exchange for, IMO, significantly lower output quality.
-
Running DeepSeek-V4-Flash on a Raspberry Pi (twitter.com via hn)
I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…
-
Here are some tips on hitting nearly 200 tok/s for DeepSeek v4 Flash on Hopper (dnhkng.github.io via reddit)
I needed a smarter model for my local Hermes Agent setup, so I moved to DeepSeek v4 Flash. First things first: running 4 concurrent threads on vLLM, I can hit ~400 tok/s. 400 x 60 x 60 x 24 x 30 is ~1B tokens per month!
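The monthly figure quoted in this excerpt can be checked with a short calculation. This is a minimal sketch, assuming the ~400 tok/s aggregate rate is sustained around the clock for a 30-day month; the function and constant names are illustrative and do not come from the linked post.

```python
# Assumption: throughput is sustained 24 hours a day for a 30-day month.
SECONDS_PER_DAY = 60 * 60 * 24


def tokens_per_month(tokens_per_second: int, days: int = 30) -> int:
    """Total tokens generated at a constant aggregate rate."""
    return tokens_per_second * SECONDS_PER_DAY * days


monthly = tokens_per_month(400)
print(f"{monthly:,} tokens per month")  # 1,036,800,000 tokens per month
```

The result is a little over one billion, which matches the "~1B" estimate. Real utilization is unlikely to stay at the peak rate all month, so this is best read as an upper bound.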
-
Share your agentic LLMs and average cost ($/MTokens) (www.reddit.com via reddit)
-
DStudio A native, local-first desktop app for DeepSeek V4 — chat, a coding agent and a design studio, all running on your Mac. Nothing leaves the device.
-
I Compared the Top AI Models of 2026 — The Results Were More Nuanced Than Expected (www.reddit.com via reddit)
Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…
-
Mimo v2.5 is a better deal than DeepSeek v4 flash (news.ycombinator.com)
So hear me out. Not only is Mimo v2.5 better than DeepSeek V4 Flash on almost all benchmarks, it also has better pricing.
-
Show HN: One API Key for 45 AI Models – Pay per Token, OpenAI Compatible (modelhub-api.com via hn)
DeepSeek V4 math score equals GPT-5.5 (91) and trails by just 4-6 points in other categories — at 97% lower cost. Is the AI quality as good as GPT?
-
DeepSeek V4 Pro beats GPT-5.5 Pro on precision (runtimewire.com via hn)
DeepSeek V4 Pro takes this matchup 38.0 to 33.0, and the margin feels earned. Across the scored tasks, the pattern is simple: Model A was tighter, more literal, and more reliable under constraints, while Model B was good but a little too w…
-
Command Code - confusing messages (www.reddit.com via reddit)
Hi, I'm a little confused. I was doing a code review of one of my repositories, mainly just testing out different models to see what came back.
-
Planning with composer 2.5, executing with deepseek v4 flash (www.reddit.com via reddit)
I am thinking of buying the $20 Pro plan. Does this approach make sense?
-
Alternate to ChatGPT Pro (www.reddit.com via reddit)
I briefly used the ChatGPT Pro feature in the chat app. It was quite amazing.
-
DeepSeek V4 Flash is amazing! (WIP llama.cpp PR #24162) (www.reddit.com via reddit)
In case you're not aware already, the DeepSeek V4 series is finally getting supported on llama.cpp with this PR! The PR is at a very early stage right now, so only try it if you're consciously willing to experiment out of curiosity and acc…
-
DeepSeek V4 managed to reverse engineer Teamspeak's Licensing System with $3.88 (old.reddit.com via hn)
-
DeepSWE Audit: DeepSeek-v4-pro results are unreliable (github.com via hn)
DeepSWE is a benchmark for measuring frontier coding agents on original, long-horizon software engineering tasks drawn from active open-source repositories. The benchmark includes 113 tasks across TypeScript, Go, Python, JavaScript…
-
DeepSeek-V4-Flash (official FP8) running across 2x DGX Spark (forums.developer.nvidia.com via hn)
I didn’t create this recipe (you guys did), but I was finally able to find it and get DeepSeek V4 Flash working with 200k context on 2 nodes. Sharing this since I couldn’t find a confirmed end-to-end recipe for the official DeepSeek-V4-Flash…
-
Bringing Up DeepSeek-V4-Flash on AMD MI300X (fergusfinn.com via hn)
At Doubleword we are building an inference cloud designed for volume. To do that we have to reckon with the enveloping compute shortage.
-
Did DeepSeek v4 suddenly become more expensive? (imgur.com via hn)
-
Show HN: Train Claude Code's replacement (ds4 and pi and aoe) (github.com via hn)
Remember how Meta monitored employee activity closely for a few months, and then had a bunch of layoffs related to AI efficiency? (oh right that was like 3 days ago).
-
How DeepSeek's architecture is shattering Silicon Valley's token moat (venturebeat.com via hn)
DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s frontier labs. The reduction on DeepSeek V4…
-
Show HN: Free open source coding models in Slack (www.runcord.com via hn)
Hey HN, we believe we have the easiest onboarding from signup to being able to spin up coding agents in Slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…
-
The first framework that can post-train DeepSeek V4-Pro on a single node? (news.ycombinator.com)
Hi all, we just open-sourced a project called Orbit, which can RL post-train trillion-scale LLMs like DeepSeek V4. We found it pretty cool!
-
My apologies if anything does not make sense; I literally don't know what I am doing. I'm not a programmer, just a simple vibe coder with a Claude subscription. That said, if you have 200 GB of system RAM + VRAM and want to run DeepSeek V4 Flash…
-
Trying to figure out the right box for my team, and wanted to see if anyone had any clue which would be a better fit, or whether it is not worth our time within our budget. Situation: 5 of us doing agentic coding (lots of long context getting re-sent…
-
Looking for a working Deepseek-v4-Flash quant (www.reddit.com)
The best I have tried so far is https://huggingface.co/nsparks/DeepSeek-V4-Flash-FP4-FP8-GGUF with the custom llama.cpp fork, but it suffers from low quality and random incoherent output. vLLM wouldn't support anything other than H100s for DS4.