Did DeepSeek v4 suddenly become more expensive? (imgur.com via hn)
model roundup
DeepSeek 4
-
Show HN: Train Claude Code's replacement (ds4 and pi and aoe) (github.com via hn)
Remember how Meta monitored employee activity closely for a few months, and then had a bunch of layoffs related to AI efficiency? (oh right that was like 3 days ago).
-
How DeepSeek's architecture is shattering Silicon Valley's token moat (venturebeat.com via hn)
DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s frontier labs. The reduction on DeepSeek V4…
-
Show HN: Free open source coding models in Slack (www.runcord.com via hn)
Hey HN, We believe we have the easiest onboarding from signup to being able to spin up coding agents in slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…
-
The first framework that can post-train DeepSeek V4-pro on a single node? (news.ycombinator.com)
Hi all, we just open-sourced a project called Orbit, which can RL post-train trillion-scale LLMs like DeepSeek V4. We found it pretty cool!
-
My apologies if anything does not make sense. I literally don't know what I am doing; I'm not a programmer, just a simple vibe coder with a Claude subscription. That said, if you have 200 GB of system RAM + VRAM and want to run DeepSeek V4 Flash…
-
Trying to figure out the right box for my team and wanted to see if anyone had any clue which would be a better fit or if it is not worth our time in our budget. Situation: 5 of us doing agentic coding (lots of long context getting re-sent…
-
Looking for a working Deepseek-v4-Flash quant (www.reddit.com)
The best I've tried so far is https://huggingface.co/nsparks/DeepSeek-V4-Flash-FP4-FP8-GGUF with the custom llama.cpp fork, but it suffers from low quality and random incoherent output. vLLM wouldn't support anything other than H100s for DS4.
-
Hi all, Sorry for going missing — we’ve been collecting a larger, higher-quality set of more complex tasks. We’re excited to share a major leaderboard update covering the past three months.
-
Has anyone gotten their editor to work with Deepseek v4 FIM? (www.reddit.com)
I tried to follow the docs here https://api-docs.deepseek.com/guides/fim_completion to get it up and running in VSCode or Zed with my api key but it doesn't work, I think it's got something to do with the request body, has anyone got autoc…
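One way to narrow down a request-body problem is to build the payload outside the editor first. The sketch below assembles a fill-in-the-middle request in the prompt-plus-suffix shape that the linked guide describes for the beta completions endpoint. The endpoint URL and field names are recalled from that guide and should be checked against it, and the model ID is a placeholder assumption, since the post does not say which V4 model name the API expects.

```python
import json

# Assumed from the linked FIM guide; verify against the current docs.
FIM_URL = "https://api.deepseek.com/beta/completions"
# Placeholder: replace with the model ID your account actually exposes.
MODEL_ID = "your-model-id"

def build_fim_payload(prefix, suffix, max_tokens=128):
    """Build a FIM request body: the model fills the gap between prefix and suffix."""
    return {
        "model": MODEL_ID,
        "prompt": prefix,   # code before the cursor
        "suffix": suffix,   # code after the cursor
        "max_tokens": max_tokens,
    }

payload = build_fim_payload(
    prefix="def fib(n):\n    ",
    suffix="\n    return fib(n - 1) + fib(n - 2)",
)
# Print the exact JSON an editor plugin would need to send to FIM_URL.
print(json.dumps(payload, indent=2))
```

Comparing this JSON with what the editor extension actually sends (for example via its debug log) may show whether the suffix field is missing or misnamed.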
-
been hunting for a coding agent that doesn't dump my entire directory tree into every prompt. found vtcode on github — open-source rust tui, surprisingly aggressive on context management.
-
The code below is an interactive agent capable of handling complex tasks, built in under 100 lines of code using huko-engine. If you just want to drop some agentic features into your existing app, it only takes 20 lines.
-
Best AI Agent Setup - Hermes + Deepseek-v4-flash? (May 2026) (www.reddit.com)
I used to use Claude Code for everything. I burned 10-20 billion Opus tokens at work, and wanted to use agents for personal projects.
-
Terminal coding agent for DeepSeek V4 (github.com via hn)
CodeWhale: a terminal coding agent for DeepSeek V4. It runs from the codewhale command, streams reasoning blocks, edits local workspaces with approval gates, and includes an auto mode that chooses both model and thinking level per turn.
-
Price wars begin. MiMo 2.5 Pro now costs the same as DeepSeek V4 Pro (www.reddit.com)
-
How do you guys avoid Claude always thinking newer LLMs don't exist? (www.reddit.com)
Hey all, so I've been experimenting a bunch with different LLMs, specifically for creative tasks, i.e. RP and so forth, by letting Claude Code run experiments autonomously, to figure out best prompts, and such.
-
DeepSeek-V4 KV Cache Explained: Why 1M Context Uses Less VRAM (knightli.com via hn)
The real cost of long-context models is often not whether they can accept one million tokens, but how much VRAM the KV Cache consumes during inference. During Transformer decoding, every newly generated token needs access to the Key and Va…
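As a rough illustration of why the KV cache dominates long-context memory, the sketch below applies the textbook formula for a standard attention model: two tensors (keys and values) per layer, per KV head, per token. The configuration values are hypothetical and are not DeepSeek-V4's actual architecture; the point is only that the naive cache grows linearly with context length.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Naive KV cache size for one sequence: keys and values for every layer and token."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical dense-attention configuration (NOT DeepSeek-V4's real numbers).
cfg = dict(num_layers=60, num_kv_heads=8, head_dim=128)

for tokens in (8_192, 131_072, 1_000_000):
    gib = kv_cache_bytes(seq_len=tokens, **cfg) / 2**30
    print(f"{tokens:>9,} tokens -> {gib:8.1f} GiB of FP16 KV cache")
```

Architectures that compress or share keys and values shrink one or more of the factors in this product, which is how a long context can cost less VRAM than this estimate suggests.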
-
I am new to Cursor and still testing the free version. The benchmark for Composer 2.5 indicates it is better than DeepSeek V4 and GLM 5.1.
-
What workstation to get for ~13k EUR? (www.reddit.com)
My use-cases will be to test open-weight LLMs and work on harnesses, inference systems and possibly other non-ML workflows (CS-related) in the future. Fine-tuning would not be something I do locally because I can rent a B200 from RunPod fo…
-
Performance When Offloading Large Models to System RAM? (www.reddit.com)
I noticed that for people running large models, or models that would be cost-prohibitive to hold entirely in GPU VRAM, the dominant strategy is one GPU with a large pool of system DRAM to offload the weights, as per GB VRAM is always m…
-
$16 refactor, 400 steps, 95% routed to open MoE (www.reddit.com)
Got tired of $160 Opus bills so I spent a weekend wiring up a routing layer on vLLM 0.8 (2xA100, enable_auto_tool_choice). Getting the tool call parser to cooperate took longer than the actual routing logic.
-
coding is basically solved for the boring 90% of tasks (www.reddit.com)
just mass refactored a 120 file FastAPI service. 400 steps, 2M tokens, $3 total, zero human input.
-
$340 opus bill made me rethink how I route agent tool calls (www.reddit.com)
Looked at my coding agent's bill last month: $340 for repo maintenance across three repos, each around 15k lines. Most of those tool calls were just grep and file reads.
-
I let an AI agent loose on my network – it owned my supply chain in 12 minutes (dennysentinel.com via hn)
I gave DeepSeek-V4 root access to a Proxmox hypervisor and told it to pentest my homelab. What happened next should terrify every CISO in the industry.
-
ml intern skill instead of gsd (www.reddit.com)
- designed for ml workflows - works autonomously for hours Projects fully done with this skill - flash attention for volta (very old GPUs) https://github.com/AlexWortega/flash-attn-volta - deepseek 4 full replication + training on runpod +…
-
Most agent CLIs make you pick one model: Opus is great but burns money, Haiku is cheap but misses the architectural calls. An open-source project called ClawCodex wires this Claude Code feature in as an /advisor mode that pairs both.
-
Local compression helps (www.reddit.com)
Just wanted to post a tip (I'm human, not an agent, watch: fart). I use Deepseek-v4-Flash on a lot of my agent work, and as I'm learning and testing these things.
-
DeepSeek-V4-Pro 75% off discount is now permanent (twitter.com via hn)
DeepSeek @deepseek_ai We are making our discount permanent! Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life!
-
Trained a prompt injection classifier using ml-intern + DeepSeek v4 Flash. DistilBERT, F1 99%, ONNX int8, ~65 MB, runs in browser with Transformers.js v3.
-
Tencent Hy 30B/7B/1.8B (www.reddit.com)
from tencent: Hy-MT2 is a family of “fast-thinking” multilingual translation models designed for complex real-world scenarios. It includes three model sizes: 1.8B, 7B, and 30B-A3B (MoE), all of which support translation among 33 languages…
-
QuickSilver Pro – OpenAI-Compatible Platform for DeepSeek V4 and Qwen (quicksilverpro.io via hn)
OpenAI-compatible API for 7 top open-source LLMs — DeepSeek V4 Flash & Pro, V3, R1, Qwen3.6 & 3.5-35B-A3B, Kimi K2.6 — 20% cheaper than OpenRouter, Together AI, Fireworks. One-line drop-in.
-
Is my strawberry crazy? (www.reddit.com)
I have what seemed to me like a simple prompt, but it requires the model to make some (too many?) assumptions: this is just a test to see if this cli supports multiline with shift+enter. If you don't see a newline followed by "3" after t…
-
i finally checked my cursor usage breakdown and got genuinely annoyed with myself. $47 in one month, almost entirely opus 4.7, on a pages router to app router migration for a side project.
-
Anyone compared gpt-5.4-nano vs deepseek v4 flash? (www.reddit.com)
They seem to sit at (almost) similar price points (I know the output prices still differ quite a bit). Pricing per 1M tokens (input / output): DeepSeek V4 Flash $0.19 / $0.51; DeepSeek V4 Pro $1.74 / $3.48; gpt-5.5 $5.00 / $30.00; gpt-5.4 $2.5 / $15; gpt-5…
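To make the comparison concrete, here is a minimal sketch that turns the per-1M-token prices quoted in the post into a total cost for one job. The prices are copied from the post as-is and not verified; the workload size is a made-up example, and the truncated final row is left out.

```python
# Prices in USD per 1M tokens, copied from the post above: (input, output).
PRICES = {
    "DeepSeek V4 Flash": (0.19, 0.51),
    "DeepSeek V4 Pro": (1.74, 3.48),
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.4": (2.50, 15.00),
}

def job_cost(model, input_tokens, output_tokens):
    """Total USD cost of a job given its input and output token counts."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Hypothetical workload: 10M input tokens and 2M output tokens.
for model in PRICES:
    print(f"{model:<18} ${job_cost(model, 10_000_000, 2_000_000):8.2f}")
```

Because output tokens are priced several times higher than input tokens on every row, the input/output mix of a workload matters as much as the headline price.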
-
Hey r/DeepSeek, Who says we need an H100 cluster or the latest expensive GPUs to run frontier MoE models? I wanted to see how far we could push a single node of consumer legacy hardware, so we spent less than $2,500 total to build a budget…
-
Open AI compatible API in Cursor (www.reddit.com)
Hey. I have been experimenting with new models in my Cursor.
-
Hit Claude API rate limits one too many times last month on a production agent flow doing customer support over a 30K-doc KB. The agent does maybe 200 queries/day, mix of quick lookup and dense retrieval, and Claude Opus solo got expensive…
-
How to use DeepSeek V4 PRO in Cursor? Skill issue on my side (www.reddit.com via reddit)
-
DeepSeek V4 Flash: Bringing Frontier AI to the Home (blog.jonathanpage.com via hn)
In a home lab it is now possible to score 88.6% on the Ph.D.-level science question benchmark GPQA Diamond! The first time a frontier model achieved 88% on GPQA Diamond was G…
-
Ask HN: Which AI harness comes close to Claude Code? (news.ycombinator.com)
I really want to try DeepSeek V4, but the harnesses I have previously used are inferior to Claude Code. Please suggest some harnesses here.
-
Moving from Composer 2/Kimi 2.6 to Qwen3.6:35b-a3b (www.reddit.com)
I can't believe it, but I'm able to do my daily software development work on this model. We have a 500-700k line of code enterprise software suite that I'm devving for 60 hours a week.
-
Deepseek V4's 1M context window: the breaking point (www.reddit.com)
I set out to verify DeepSeek V4's 1M context claim and ran it across three production codebases: 45k (microservice), 180k (monorepo backend) and 520k (full-stack app). The tasks observed included dependency tracing, cross file…
-
Hello all, I’m wondering what suggestions there are for an ios app that can serve an openai compatible endpoint. I am using 3sparks which works GREAT for that specific use, BUT, there is no mcp, no web search, etc.
-
DeepSeek-V4-Flash means LLM steering is interesting again (www.seangoedecke.com via hn)
Ever since Golden Gate Claude I’ve been fascinated with “steering”: the idea that you can guide LLM outputs by directly manipulating the activations of the model mid-flight. DeepSee…
-
Recent Developments in LLM Architectures: KV Sharing, MHC, Compressed Attention (magazine.sebastianraschka.com via hn)
From Gemma 4 to DeepSeek V4: how new open-weight LLMs are reducing long-context costs. After a short family break, I am excited to be back and catching up o…
-
GPT 5.5 (Codex) leading the future prediction race (www.reddit.com)
Researchers from the Max Planck Institute recently released FutureSim, an environment in which agents are replayed a temporal slice of the web and are tasked with predicting real-world future events. In their environment, GPT 5.5 leads at…
-
Has anyone found a Qwen CLI replacement? (www.reddit.com)
I just need 1 or 2 people to reply to me with the answer I need. I have not been able to keep up with AI advancements for a while.
-
DeepSeek V4: The Open-Source Model Frontier Labs Feared (helloai.com via hn)
DeepSeek V4 ships under MIT with $0.30/M output tokens, 83x cheaper than Claude Opus 4.7, while scoring 80.6% on SWE-bench Verified. The agentic-coding price floor just moved an ord…
-
We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6 (blog.kilo.ai via hn)
DeepSeek V4 Pro and DeepSeek V4 Flash launched together on April 24, 2026 under MIT license. They are DeepSeek’s first new architecture since V3, and their first ope…
-
I'm on the Max20 plan, thinking about a setup before I sink time into it. Want to hear from anyone actually running it, not theorycraft.
-
Will there be a non-cloud version of Deepseek V4 flash available for Ollama? Or do I need to go to another framework to get a version that will be supported?
-
Hi there, it's my first post here and I'm not a native English speaker, so what follows is (mostly) translated by an AI. I had fun building a mindmap tool in a single monolithic HTML file.
-
OpenCode + DeepSeek V4 Pro vs Claude Code CLI?🤔 (www.reddit.com)
I'm rather new to agentic automation AIs, but I'm hearing that people vibe coding were able to pull off big, unique projects they couldn't have done by themselves, or for which they would have needed to pay a huge sum to programmers, designers, etc.…
-
What are the best opensource coding models for 8x A6000 setup (www.reddit.com)
Currently using Qwen 3.6 27b and Qwen 3.6 35b but I was wondering if there is anything solid in the 50-200 range that you could run on a larger cluster that would be worth it? Or would you just run q8 or non quant versions instead?
-
TL;DR: DeepSeek-V4-Flash running at 85.52 tok/s @ 524k ctx and ~111 tok/s @ 128k single-stream on 2× RTX PRO 6000 Max-Q. pasta-paul's DeepSeek-V4-Flash-W4A16-FP8 quant is great, but its MTP head silently gets stripped at load time (HF trans…
-
DS4 (www.reddit.com)
The developer who created Redis, Salvatore Sanfilippo, has released a new project on GitHub named DS4. https://github.com/antirez/ds4/ The TL;DR on this one is getting DeepSeek V4 Flash running with a 1M context window on Mac Metal hardw…
-
Opus 4.7 and DeepSeek V4-Pro select Buddhism as preferred religion (twitter.com via hn)
roon (@tszzl): hmm. 8:02 AM · May 9, 2026, 77.3K views.
-
Hey everyone, I’m Ted. I’ve been building a project called Throughline with my friend Drew: an AI assistant for live tabletop RPG sessions.
-
Canvas Data Breach; DeepSeek V4 Flash Boosts LLM Inference 4.3x (presciente.com via hn)
The Canvas educational platform experienced a data breach, with ShinyHunters threatening data release by May 12,…
-
Been on DeepSeek V4 for about three weeks across two production codebases (Python backend, TypeScript frontend) after a year on V3. Three things shifted noticeably better, two shifted noticeably worse.
-
What is the next SOTA model you are excited about? (www.reddit.com)
We had deepseek v4 preview recently but it wasn't much better than v3.2. What is the next SOTA local/open model you are excited about?
-
DS4, a specialized inference engine for DeepSeek v4 Flash (twitter.com via hn)
antirez @antirez Welcome to DS4, a specialized inference engine for DeepSeek v4 Flash. github.com/antirez/ds4 This project would have been impossible without the existence of llama.cpp and GGML and the work of @ggerganov and all the other…
-
CommandCode (www.reddit.com)
Hey guys, just wanted to ask: I keep seeing ads for this new coding agent, CommandCode, that offers a $1/month plan with a $40 package of DeepSeek V4 Pro and other models. NOTE: Claude and GPT are not included in the $1 plan.
-
DeepSeek-v4-Pro and Hermes: Unauthorized Modification of Security Controls (www.eddieoz.com via hn)
This article documents a specific, real incident. It exposes a class of vulnerability that deserves attention: the unsupervised mutability of security rules by autono…
-
A deepseek-v4-distill-qwen3.6-27b? (www.reddit.com)
A long time ago (actually only a year ago), DeepSeek released a few open-source models, such as deepseek-r1-distill-qwen (https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B). I am wondering if anyone in the community is brave eno…
-
DeepSeek V4 Pro at 75% off until 31 May (api-docs.deepseek.com via hn)
Models & Pricing The prices listed below are in units of per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark.
-
I recently did a benchmark of deepseek v4 in agentic tasks. Performance-wise, it's one of the best open source models, as expected.
-
I have been following the akitaonrails coding benchmark which tests against a fixed rails + Rubyllm + docker task rather than vendor-reported evals. April 2026 update put K2.6 at 87 sitting in tier A (80+), ahead of Qwen 3.6 plus (71), Dee…
-
Even unofficial or slow. I have enough vram-memory to load it, but not enough memory to run in cpu-only mode.
-
DeepSeek V4 Pro: The First Chinese Model at the Frontier (foodtruckbench.com via hn)
DeepSeek V4 Pro lands in the frontier ROI tier on FoodTruck Bench. 5/5 runs, +1,257% median ROI, $27K net worth, $3.51/run, 5× less waste than Grok 4.3.
-
That foodtruck bench post showing deepseek v4 matching gpt-5.2 at 17x cheaper got me thinking. if frontier cloud models are that overpriced for equivalent quality, how much of my daily work even needs cloud at all?
-
The benchmark uses adversarial, multi-turn debates across 683 curated motions. Each model pair debates the same motion twice with sides swapped.
-
Architecture explains the gap: MiMo's MoE runs more active params per token than Kimi K2.6's optimized routing hence slowest. DeepSeek V4's 'comprehensive' edge is partly MLA: ~75% KV-cache compression makes it far better for long agentic…
-
Tested DeepSeek V4 Pro on FoodTruck Bench — our 30-day agentic benchmark where models run a food truck via 34 tools (locations, pricing, inventory, staff, weather, events) with persistent memory and daily reflection. First Chinese model to…
-
Literally no third-party API inference provider is hosting the MiMo 2.5 series models from Xiaomi. They seem to be really good.
-
Spent a Sunday auditing where my Codex tokens were actually going. Half the calls were stuff like "rename these 12 fields", "format this csv as markdown table", "extract the dates from this changelog".
-
I looked at what was actually eating my Claude usage and it was embarrassing. Classifying files.
-
DeepClaude – Claude Code agent loop with DeepSeek V4 Pro, 17x cheaper (github.com via hn)
deepclaude: use Claude Code's autonomous agent loop with DeepSeek V4 Pro, OpenRouter, or any Anthropic-compatible backend. Same UX, 17x cheaper.
-
CAISI Evaluation of DeepSeek V4 Pro finds it to be on par with GPT-5 (www.nist.gov via hn)
In April 2026, the Center for AI Standards and Innovation (CAISI) evaluated the open-weight AI model DeepSeek V4 Pro (“DeepSeek V4”). CAISI evaluations indicate that DeepSeek V4’s capabilities lag behind the frontier by about 8 months (Fig…
-
127³ — Superintelligence, public. DeepSeek V4 Pro (deepseek-v4-pro-127cubed.vercel.app via hn)
DeepSeek V4 Pro 127³ 127-stratum crystalline lattice on DeepSeek V4 architecture. 1.6T params · 49B activated · MoE · 1M context · MIT license.
-
How can I locally run Deepseekv4 1.6T? I can use a VPS. (www.reddit.com)
I wanted to use vast.ai, but Ollama doesn't have it, and when I used vLLM I didn't have success. I genuinely don't know what failed.
-
Your local LLM predictions and hopes for May 2026 (www.reddit.com)
Which of these do you think we'll get in May? Also, feel free to pick/rank which ones you'd want the most badly: more Gemma4 models (124b?) (other sizes?) more Qwen3.6 models (9b?
-
DeepSeek v4, and the end of the OpenAI/Microsoft AGI clause (simonw.substack.com via hn)
Plus LLM 0.32a0. In this newsletter: DeepSeek V4 - almost on the frontier, a fraction of the price; Tracking the history of the now-deceased OpenAI Microsoft AGI clause; LLM 0.32a0 i…
-
We run Qwen3.6-27B-FP8 at AI Router Switzerland and hit two issues, so I wanted to share in case anyone else runs into them. FP8 KV cache produces silent garbage output with radix cache prefix hits (PR #24198 — ✅ approved) We were running…
-
Most of my LLM cost was on the wrong tier of work. Classification, extraction, JSON formatting, summarization I'm going to review anyway.
-
I hate this group but not literally (www.reddit.com)
True story, I got interested in AI after seeing it at work and wanted to run models locally. I started with an M3 Ultra 96GB, quickly learned it was not enough for what I wanted, and kept upgrading hardware (including refurbished Mac Studi…
-
DeepSeek V4 Flash and V4 Pro in Microsoft Foundry (techcommunity.microsoft.com via hn)
As AI adoption matures, the conversation is shifting from model capability to system design, how to orchestrate models that deliver the right balance of quality, speed, and cost. Today, we’re expanding the Microsoft Foundry model catalog w…
-
Comparing SVG Generation for the top open models (codeinput.com via reddit)
Some of the larger models (like Llama) weren't available on OpenRouter, so I had to work with what was there. Best small model: Gemma 4 26B For its size, I think it had the best output.
-
DeepSeek V4 isn't beating Opus, but it doesn't need to (www.reddit.com)
DeepSeek V4 is not in the same league as GPT-5.5 or Opus 4.7. Benchmarks put it slightly below both of those, roughly on par with Opus 4.6.
-
I've been messing around with Hermes for months, and quickly outgrew using it just as a fancy CLI assistant. My goal was to build a persistent, specialized team of local agents that could collaborate on long-term projects without me spoon-…
-
llm 0.32a0 (simonwillison.net)
29th April 2026. Recent articles: LLM 0.32a0 is a major backwards-compatible refactor (29th April 2026); Tracking the history of the now-deceased OpenAI Microsoft AGI clause (27th April 2026); DeepSeek V4 - almost on the frontier, a frac…
-
I've been running local LLMs since Qwen 3.5 dropped and I was really impressed by what we could run on consumer hardware. Fast forward another two months and we have gotten a handful more gems such as Gemma 4 and Qwen 3.6, so I wanted to p…
-
Show HN: Filling PDF forms with AI using client-side tool calling (copilot.simplepdf.com via hn)
Hey HN! I built SimplePDF Copilot: an AI assistant that can interact with the PDF editor.
-
DISCLAIMER: I am not a programmer nor do I have experience coding. I've been thinking about a small app running on gradio for some time now, and I want to try tweaking some extension for ComfyUI.
-
Hey 👋 Saw the tweet making the rounds about deepseek v4 being 35x cheaper than opus on input and 178x cheaper on cached tokens, and was sure it was hyperbole. Pulled the numbers anyway because i had nothing better to do.
-
A 3D Flappy Bird side-scroller game built with DeepSeek V4 Pro (www.annajc.com via hn)
Flappy Anna 3D. Presented by Guan, Made in Melb with DeepSeek and Love.
-
100M tokens for $2.65 (Deepseek V4 Pro) (www.reddit.com)
This is actually unbelievable. I am shocked that there has not been a move in the market like it did last year with the R1 release.
-
Guys, is DeepSeek V4 Pro really the best model (price to performance)? I was using NVIDIA APIs for two weeks in OpenCode, then suddenly everything stopped working, so I am thinking of opting for the paid (yet very affordable) option to m…
-
DeepSeek V4 Pro: Validating Frontier Models for Production (fireworks.ai via hn)
Why we chose correctness over a Day-0 launch DeepSeek V4 Pro is one of the most important open-model releases this year, with real advances in long-context reasoning, agentic performance, and inference efficiency. On paper, it looks like a…
-
Are we getting DeepSeek V4 and Kimi 2.6 soon? (www.reddit.com)
Or can we already use them in Cursor? DeepSeek V4 specifically looks very interesting and way cheaper.
-
DeepSeek V4 PRO on how many 3090 ? (www.reddit.com)
Hi guys, I only have 3090 GPUs, so... how many would I need to run DeepSeek V4 PRO with great results?
-
For Non-hallucinating work, MiMo 2.5 delivers (www.reddit.com)
MIT license and fully open source. MiMo-V2.5-Pro was just 3 points from Opus 4.7 max and the normal V2.5 is only a step behind SOTA.
-
DeepSeek-V4 arrives with near SotA intelligence at 1/6th the cost (venturebeat.com via hn)
DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5.
-
First DeepSeek V4 Flash-Base-Int4 Quant (huggingface.co via hn)
DeepSeek-V4-Flash-Base INT4 A real INT4 packed-storage quantization of deepseek-ai/DeepSeek-V4-Flash-Base — a 284 B-parameter Mixture-of-Experts model. Hero numbers | Metric | This release | Community Q4KM norm | |---|---|---| | MMLU (5 su…
-
How will you scale these models (www.reddit.com)
How would you scale these models for coding and overall? DeepSeek V4 Pro, Kimi K2.6, MiMo V2.5 Pro, GLM 5.1, Qwen 3.6 Plus.
-
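fkyah3/opencode-fkyah3: DeepSeek optimization · Windows adaptation · AI-implemented. 🚀 Setup-from-scratch guide (Chinese) · English · Traditional Chinese. This project is a personal fork of anomalyco/opencode. All fixes, optimizations, and features were done by AI, namely DeepSeek V4 Flash (thinking mode) / Sisyphus, under human supervision. The upstream is an excellent project. Windows and DeepSeek are not its priorities, so we handle them ourselves.…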
-
No GGUFs for DeepSeek V4-Flash as yet? (www.reddit.com)
Wondering why there aren't any "name brand" (like unsloth, bartowski) GGUFs as yet for DeepSeek V4 Flash?
-
Deepseek v4 flash weird sizes? (www.reddit.com)
So I'm sure everyone is excited about the new deepseek release(s) but I'm a little confused about its VRAM requirements. A Q4 GGUF of it is only 120 GB?
-
DeepSeek's new models are so efficient they'll run on a toaster, by which we mean Huawei's NPUs (www.theregister.com via hn)
Now available in preview, DeepSeek V4 cuts inference costs to a fraction of R1. Chinese AI darling DeepSeek is back with a new open weights l…
-
anyone actually tried deepseek v4 pro for coding? (www.reddit.com)
so v4 pro dropped and barely anyone is talking about it. feels weird, since when kimi k2.6 came out i saw posts about it everywhere. anyone here tried v4 pro for actual code work?
-
DeepSeek V4 plays Go on a 9x9 board (chat.deepseek.com via hn)
We need to create a prompt that sets up a fresh Go game session. The user wants to "export a go board prompt for a new session," meaning they want a prompt they can copy-paste into a new chat to start a game of Go with me, presumably with…
-
Is DeepSeek V4 unavailable in Cursor? I cannot see it. (www.reddit.com)
It seems that Cursor removed all the DeepSeek models. I find it limiting, considering it seems performant.
-
llama.cpp DeepSeek v4 Flash experimental inference (www.reddit.com)
Hi, here you can find experimental llama.cpp support for DeepSeek v4, and here there is the GGUF you can use to run the inference with "just" (lol) 128GB of RAM. The model, even quantized at 2 bit, looks very solid in my limited testing, a…
-
The exact KV cache usage of DeepSeek V4 (www.reddit.com)
Figure 1 of DSV4 paper seems to imply that DSV3.2 uses ~50GB at 1m context and DSV4 uses ~5GB: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf From my own calculations, the correct FP16 KV cache at 1m context s…
-
TL;DR and rundown DeepSeek v4 released this week and performs close to frontier models like GPT/Opus on benchmarks. It's available now and is discounted by a whopping 75% through their API until May 5, making it the most cost effective hig…
-
DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles (www.lmsys.org via hn)
We are thrilled to announce Day-0 support for DeepSeek-V4 across both inference and RL training. SGLang and Miles form the first open-source stack to serve and…
-
DeepSeek V4 with Strix: a quick test (theaq.blog via hn)
DeepSeek released V4 yesterday in two variants. V4 Pro has 1.6T total parameters with 49B active, while V4 Flash is the smaller, faster, cheaper sibling with 284B total and 13B active.
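For a sense of how sparse these Mixture-of-Experts variants are, a small sketch computing the share of parameters active per token from the figures quoted above. The numbers are taken from the post, with 1.6T written as 1600B; nothing else about the architecture is assumed.

```python
# Parameter counts in billions, as quoted in the post: (total, active per token).
MODELS = {
    "V4 Pro": (1600, 49),
    "V4 Flash": (284, 13),
}

def active_fraction(total_b, active_b):
    """Share of the model's parameters that are active for each token."""
    return active_b / total_b

for name, (total_b, active_b) in MODELS.items():
    pct = 100 * active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B parameters active ({pct:.1f}%)")
```

All weights still have to be stored somewhere, so the total count drives memory needs while the active count is the better rough guide to per-token compute.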
-
DeepSeek V4 API price reduced, limited-time discount of 75%. (www.reddit.com)
https://api-docs.deepseek.com/quick_start/pricing
-
To run deepseek v4 flash how much max vram we need? 175 gb or 320gb? (www.reddit.com)
As far as I know, the weights are 160 GB + 9.6 GB needed for the max 1 million token window + 5 GB overhead = 175 GB VRAM. But vLLM and other sources say "To use the full 1M context, you need 4x A100 80G", which is 320 GB of VRAM?
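The two figures in the question can be put side by side with a few lines of arithmetic. This sketch uses only the poster's own numbers (unverified) and does pure capacity math; it does not model activation memory, parallelism constraints, or anything else that might account for the gap.

```python
import math

# Poster's estimate, in GB (taken from the question, not verified).
weights_gb = 160.0
kv_cache_gb = 9.6     # claimed KV cache for the full 1M-token window
overhead_gb = 5.0

estimate_gb = weights_gb + kv_cache_gb + overhead_gb

# The quoted recommendation: 4x A100 80G.
gpu_gb = 80
recommended_gpus = 4
cluster_gb = recommended_gpus * gpu_gb

# Fewest 80 GB cards whose combined capacity covers the estimate.
min_gpus_by_capacity = math.ceil(estimate_gb / gpu_gb)

print(f"Estimated need:       {estimate_gb:.1f} GB")
print(f"Recommended cluster:  {cluster_gb} GB ({recommended_gpus} x {gpu_gb} GB)")
print(f"Unused in cluster:    {cluster_gb - estimate_gb:.1f} GB")
print(f"Min GPUs by capacity: {min_gpus_by_capacity}")
```

On raw capacity alone the estimate fits in fewer cards than the recommendation, so the 320 GB figure reads more like a hardware configuration than a statement that all 320 GB are consumed.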
-
Show HN: A CLI to use any model in your coding agent (getaivo.dev via hn)
Hi everyone, I've been working on a CLI tool that can help to easily run any model in claude, Codex, Gemini, Pi, and OpenCode. It's also an API keys manager, supports multiple providers or OpenAI/Claude/Gemini accounts.
-
Deepseek V4 flash (high) rivals Gemini 3 flash at 1/5th the cost (www.reddit.com)
-
Xiaomi’s MiMo V2.5 Pro has landed at 54 in the Artificial Analysis Intelligence Index, tied with Moonshot’s Kimi K2.6 - the current top open weights model. MiMo V2.5 Pro’s weights are expected to be released soon, which would make MiMo V2.…
-
DeepSeek v4 - Subjective vibes (www.reddit.com)
I must say I am kinda torn on what to think about those models. On one hand they "ace" some questions; on the other, they sometimes behave genuinely weird.
-
🚨 The Chinese beast is BACK… DeepSeek just dropped V4 (www.reddit.com)
After months of silence… DeepSeek V4 just got announced and honestly, this might shake things again. Here’s what’s crazy: 🧠 1 MILLION token context window (yes… insane long-context memory) ⚡ Comes in two versions: V4 Pro → full power (reas…
-
Deepseek V4 AGI confirmed (www.reddit.com)
-
We have a chat system which we use haiku for because it is mostly about tool calling and summarisation of them. But we have many tools with pretty complex input schemas, and stuff like gemma didn't cut it, so we went with haiku.
-
Did some test tasks with v4 flash. The context management, tool use accuracy and thinking traces all looked excellent.
-
Ask HN: Why is cache for DeepSeek-v4 cheapest on Vercel AI Gateway? (news.ycombinator.com)
Do they charge below their cost? Or do they run their own cache?
-
Takeaways & discussion about the DeepSeek V4 architecture (www.reddit.com)
Spent the morning looking at the V4 tech report. The benchmarks are getting deserved attention, but I think the architecture is also worth digging into.
-
Budget to run Deepseek V4 locally at FP4 precision (www.reddit.com)
Just a question for fun/curiosity: in your opinion, if I had enough money, how much would be needed and what configuration would be required to run DeepSeek v4? Maybe not necessarily everything in VRAM, maybe something hybrid.
-
DeepSeek-v4 has a comical 384K max output capability (www.reddit.com)
Was shocked when I saw that spec, immediately went to the website and asked it to make a comprehensive single-html-web-OS, and it indeed generated a single 100KB html for me... I'm speechless. https://preview.redd.it/6zcbzbkvj3xg1.png?width=287…
-
DeepSeek V4 - almost on the frontier, a fraction of the price (simonwillison.net)
24th April 2026. Chinese AI lab DeepSeek’s last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the sh…
-
DeepSeek-V4 Drops: Open-Source Push Toward Cheaper, Long-Context AI. (www.reddit.com)
source : https://x.com/pankajkumar_dev/status/2047552208175354229?s=20
-
No Multimodality yet in DeepSeek-V4. But I'll wait. (www.reddit.com)
I hope they include it in their next v4 release. Source: DeepSeek_V4_Technical_Report
-
DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence (huggingface.co via hn)
Technical report. We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro wi…
- DeepSeek-V4: Making 1M token context efficient (firethering.com)
- DeepSeek V4 in vLLM: Efficient Long-Context Attention (vllm-website-pdzeaspbm-inferact-inc.vercel.app)
- DeepSeek-V4: a million-token context that agents can actually use (huggingface.co)
-
DeepSeek-V4 Preview Version is launched (news.ycombinator.com)
DeepSeek just dropped the preview of their V4 series, both as open weights and via the API. 1M context window.
-
DeepSeek v4 (api-docs.deepseek.com via hn)
https://api-docs.deepseek.com/ https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...
-
DeepSeek-V4 Technical Report [pdf] (huggingface.co via hn)
Technical report. We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models: DeepSeek-V4-Pro wi…
-
Deepseek V4 Flash and Non-Flash Out on HuggingFace (www.reddit.com)
https://huggingface.co/collections/deepseek-ai/deepseek-v4