model roundup

DeepSeek 4

30 items · started 2026-05-29 · closed 2026-06-13

Fable 5 Max confidently wrong about PDF encryption status (www.reddit.com via reddit)

2w hallucination deepseek

I just ran into a bizarre hallucination with Fable 5 Max regarding file analysis. I uploaded several PDFs to Fable 5 Max, and for two of them Claude completely refused to process them, claiming the files were password-protected.
How can Deepseek v4 top the coding leaderboards and still sit 8 months behind the frontier? (www.reddit.com)

2w swe-bench gpt-5 deepseek+1

Two numbers on this model that don't sit comfortably with each other. The Pro config posts coding scores near the top of every board, 80.6 on SWE-bench Verified and 93.5 on LiveCodeBench.
FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention (www.reddit.com via reddit)

2w deepseek

Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In this report, we propose Lookahead Sparse Attention (LSA), a novel inference paradigm powered by a Ne…
Deepseek v4 pro (www.reddit.com via reddit)

2w deepseek cursor

Hello, I've run out of Pro+. Is it possible to use DS4 in Cursor IDE? Thanks
Bit of a lull or Winter is Coming? (www.reddit.com via reddit)

2w mistral mythos openai+1

It feels as though we’re at an inflection point and I was wondering what others’ take is on the current situation: On the frontier end we have OpenAI and Anthropic gearing up for their IPO, so it’s all Mythos and wow and it seems plausible…
Can I finetune Deepseek V4-flash with two rtx pro 6000s (www.reddit.com via reddit)

2w deepseek

Well, I know it may be very tight on 192GB. However, is there any framework for finetuning DS4-flash with 4-bit QLoRA?
DOA model by Cohere Labs (www.reddit.com via reddit)

2w deepseek qwen

So apparently the model gets beaten by qwen 3.6 on every benchmark reported by cohere labs. You are getting lower RAM (considering model offload) usage and slightly better performance for imo significantly less output quality.
Running DeepSeek-V4-Flash on a Raspberry Pi (twitter.com via hn)

+1 2w gpt-5 deepseek codex+2

I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe-attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…
Here are some tips on hitting nearly 200 tok/s for DeepSeek v4 Flash on Hopper (dnhkng.github.io via reddit)

2w vllm deepseek

I needed a smarter model for my local Hermes Agent setup, so I moved to DeepSeek v4 Flash. First things first: Running 4 concurrent threads on vLLM, I can hit ~400 tok/s. 400 x 60 x 60 x 24 x 30 is ~1B TOKENS per month!!!
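
The monthly figure in the excerpt above can be checked with a few lines of Python. This is only a sketch: it assumes the ~400 tok/s aggregate rate is sustained around the clock for a 30-day month, and the constant and function names are illustrative rather than taken from the post.

```python
# Assumptions (not measured here): ~400 tok/s aggregate across 4 concurrent
# requests, running continuously for a 30-day month.
TOKENS_PER_SECOND = 400
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_MONTH = 30


def tokens_per_month(rate=TOKENS_PER_SECOND, days=DAYS_PER_MONTH):
    """Return total tokens generated at a constant rate over `days` days."""
    return rate * SECONDS_PER_DAY * days


monthly = tokens_per_month()
print(f"{monthly:,} tokens/month")  # 1,036,800,000, i.e. roughly 1B
```

Any idle time or drop in throughput lowers the total proportionally, so the ~1B figure is best read as an upper bound for that rate.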
Share your agentic LLMs and average cost ($/MTokens) (www.reddit.com via reddit)

2w deepseek opus agentic
DStudio – local DeepSeek V4 with a design studio, reachable from your phone (github.com via hn)

+1 2w deepseek

A native, local-first desktop app for DeepSeek V4: chat, a coding agent and a design studio, all running on your Mac. Nothing leaves the device.
Mimo v2.5 is better deal than DeepSeek v4 flash (news.ycombinator.com)

+1 2w deepseek

So hear me out. Not only is Mimo v2.5 better than DSV4 Flash on almost all benchmarks, but its pricing is better too.
Show HN: One API Key for 45 AI Models – Pay per Token, OpenAI Compatible (modelhub-api.com via hn)

+2 2w gpt-5 deepseek openai

DeepSeek V4 math score equals GPT-5.5 (91) and trails by just 4-6 points in other categories — at 97% lower cost. Is the AI quality as good as GPT?
DeepSeek V4 Pro beats GPT-5.5 Pro on precision (runtimewire.com via hn)

+8412 2w gpt-5 deepseek

DeepSeek V4 Pro takes this matchup 38.0 to 33.0, and the margin feels earned. Across the scored tasks, the pattern is simple: Model A was tighter, more literal, and more reliable under constraints, while Model B was good but a little too w…
Command Code - confusing messages (www.reddit.com via reddit)

2w deepseek

Hi, I'm a little confused. I was doing a code review of one of my repositories, mainly just testing out different models to see what came back.
Planning with Composer 2.5, executing with DeepSeek V4 Flash (www.reddit.com via reddit)

2w deepseek

I am thinking of buying the 20-dollar Pro plan. Does this approach make sense?
Alternate to ChatGPT Pro (www.reddit.com via reddit)

2w deepseek chatgpt

I briefly used the ChatGPT Pro feature in the chat app. It was quite amazing.
DeepSeek V4 Flash is amazing! (WIP llama.cpp PR #24162) (www.reddit.com via reddit)

2w deepseek llama

In case you're not aware already, the DeepSeek V4 series is finally getting supported on llama.cpp with this PR! The PR is at a very early stage right now, so only try it if you're consciously willing to experiment out of curiosity and acc…
DeepSeek V4 managed to reverse engineer Teamspeak's Licensing System with $3.88 (old.reddit.com via hn)

+1 3w deepseek

DeepSWE Audit: DeepSeek-v4-pro results are unreliable (github.com via hn)

+3 3w deepseek

DeepSWE is a benchmark for measuring frontier coding agents on original, long-horizon software engineering tasks drawn from active open-source repositories. The benchmark includes 113 tasks across TypeScript, Go, Python, JavaScript…
DeepSeek-V4-Flash (official FP8) running across 2x DGX Spark (forums.developer.nvidia.com via hn)

+1 3w deepseek

I didn’t create this recipe, you guys did, but I was finally able to find it and get Deepseek v4 Flash working with 200k context on 2 nodes. Sharing this since I couldn’t find a confirmed end-to-end recipe for the official DeepSeek-V4-Flash…
Bringing Up DeepSeek-V4-Flash on AMD MI300X (fergusfinn.com via hn)

+8 3w deepseek

At Doubleword we are building an inference cloud designed for volume. To do that we have to reckon with the enveloping compute shortage.
Did DeepSeek v4 suddenly become more expensive? (imgur.com via hn)

+12 3w deepseek

Show HN: Train Claude Code's replacement (ds4 and pi and aoe) (github.com via hn)

+1 3w deepseek claude-code

Remember how Meta monitored employee activity closely for a few months, and then had a bunch of layoffs related to AI efficiency? (oh right that was like 3 days ago).
How DeepSeek's architecture is shattering Silicon Valley's token moat (venturebeat.com via hn)

+3 4w deepseek

DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s frontier labs. The reduction on DeepSeek V4…
Show HN: Free open source coding models in Slack (www.runcord.com via hn)

+2 4w minimax glm gemma+5

Hey HN, We believe we have the easiest onboarding from signup to being able to spin up coding agents in slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…
The first framework that can post train DeepSeek V4-pro on a single-node? (news.ycombinator.com)

+37 4w deepseek

Hi all, We just opensourced a project called Orbit, which can RL post train trillion scale LLMs like deepseek v4. We found it pretty cool!
DeepSeek V4 Flash at 8.4 tok/s on 3×3090: patching the GGUFs that won't load on cchuter's llama.cpp fork (www.reddit.com)

+18 4w deepseek llama

My apologies if anything does not make sense. I literally don't know what I am doing; I'm not a programmer, just a simple vibe coder with a Claude subscription. That said, if you have 200GB of sys RAM+VRAM and want to run deepseek v4 flash…
GH200 NVL2 or 8x RTX 6000 Blackwell for running Kimi K2.6 / DeepSeek V4 locally? (5 devs, agentic coding) (www.reddit.com)

+119 4w moe deepseek agentic

Trying to figure out the right box for my team and wanted to see if anyone had any clue which would be a better fit or if it is not worth our time in our budget. Situation: 5 of us doing agentic coding (lots of long context getting re-sent…
Looking for a working Deepseek-v4-Flash quant (www.reddit.com)

+25 4w vllm deepseek llama

Best I tried so far is https://huggingface.co/nsparks/DeepSeek-V4-Flash-FP4-FP8-GGUF with the custom llama.cpp fork, but it suffers from low quality and random incoherent output. vLLM wouldn't support anything other than H100s for DS4.
