model roundup

DeepSeek 4

16 items · started 2026-06-11 · ongoing (last activity 2026-06-17)

Using Claude Opus as planner + DeepSeek as worker in Claude Code — anyone solved the single-session routing problem? (www.reddit.com via reddit)

8h deepseek opus anthropic+1

I've been running a hybrid planner/worker setup with Claude Code and hit a tricky constraint I'm hoping the community has thoughts on. The setup: Planner: Claude Opus for architecture, planning, and review. Worker: DeepSeek V4 Pro / DeepSe…
Native Coding Agent Optimized for Local LLM and DeepSeek v4 with Vector Memory (code.intellios.ai via hn)

+1 1d deepseek openai

cwcode: A terminal coding agent built around DeepSeek V4 Pro, Qwen3.6‑27B, Kimi, Azure, and anything else that speaks OpenAI’s chat API. Written in Go.
DeepSeek V4 Pro at 5% the cost of Claude – what it takes to close the gap (howardchen.substack.com via hn)

+8 1d deepseek

Hash-anchored edits, a sticky prefix cache, and the autonomous loops we run on production code. We’ve been using DeepSeek V4 Pro as our daily-driver coding model for…
Kimi 2.7 vs. DeepSeek Coder (simpletechguides.com via hn)

+31 2d deepseek

Kimi K2.7 Code vs MiMo Code vs DeepSeek V4 Pro: Three Open-Source Coding Tools Compared. Three Chinese AI labs shipped major coding tools in the same window this spring: Moonshot AI released Kimi K2.7 Code, Xiaomi shipped MiMo Code, and Dee…
DeepSeek-V4 Can't Read Images? I Made It Read (www.dataleadsfuture.com via hn)

+2 2d deepseek

Don't wait for a multimodal model, you can use it now. Have you ever had that frustrating moment: you are coding with deepseek-v4 in OpenCode, your code throws an error, you want to…
Fable 5 Is Dead. And Honestly? We Might Be Better Off (www.reddit.com via reddit)

2d gpt-5 deepseek gemini+2

3 days after launch, the US gov forced Anthropic to pull its most powerful model — Fable 5. Then OpenRouter dropped a benchmark suggesting you might not even need it.
International Market Retention Strategy After the Fable 5 Export Ban (www.reddit.com via reddit)

3d deepseek anthropic

Like many of you, I lost access to Fable 5 on June 12. The next day, I co-authored a strategy paper with Claude addressing the core business problem: how does Anthropic retain its international market now that cloud-only deployment has bee…
Ask HN: Which cheap Chinese LLM are you using? (news.ycombinator.com)

+4 3d minimax glm deepseek

In the last month or two, starting with DeepSeek V4 Pro, quite a few low-priced Chinese models have come out. Their performance looks more or less similar to me: Mimo V2.5 Pro, MiniMax M3, the just-released GLM 5.2, etc.
Fable 5 Max confidently wrong about PDF encryption status (www.reddit.com via reddit)

6d hallucination deepseek

I just ran into a bizarre hallucination with Fable 5 Max regarding file analysis. I uploaded several PDFs to Fable 5 Max, and for two of them Claude completely refused to process the files, claiming they were password-protected.
How can Deepseek v4 top the coding leaderboards and still sit 8 months behind the frontier? (www.reddit.com via reddit)

6d swe-bench gpt-5 deepseek+1

Two numbers on this model that don't sit comfortably with each other. The Pro config posts coding scores near the top of every board, 80.6 on SWE-bench Verified and 93.5 on LiveCodeBench.
FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention (www.reddit.com via reddit)

7d deepseek

Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In this report, we propose Lookahead Sparse Attention (LSA), a novel inference paradigm powered by a Ne…
Deepseek v4 pro (www.reddit.com via reddit)

7d deepseek cursor

Hello, I've run out of Pro+. Is it possible to use DS4 in the Cursor IDE? Thanks.
Bit of a lull or Winter is Coming? (www.reddit.com via reddit)

7d mistral mythos openai+1

It feels as though we’re at an inflection point, and I was wondering what others’ take is on the current situation: On the frontier end we have OpenAI and Anthropic gearing up for their IPO, so it’s all Mythos and wow and it seems plausible…
Can I finetune Deepseek V4-flash with two rtx pro 6000s (www.reddit.com via reddit)

7d deepseek

I know it may be very tight on 192GB. However, is there any framework for fine-tuning DS4-flash with 4-bit QLoRA?
DOA model by Cohere Labs (www.reddit.com via reddit)

8d deepseek qwen

So apparently the model gets beaten by Qwen 3.6 on every benchmark reported by Cohere Labs. You get lower RAM usage (considering model offload) and slightly better performance in exchange for, IMO, significantly lower output quality.
Running DeepSeek-V4-Flash on a Raspberry Pi (twitter.com via hn)

+1 8d gpt-5 deepseek codex+2

I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe-attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…
