model roundup

DeepSeek 4

26 items · started 2026-05-29 · ongoing (last activity 2026-06-09)

  1. So apparently the model gets beaten by Qwen 3.6 on every benchmark reported by Cohere Labs. You get lower RAM usage (with model offload) and slightly better performance, in exchange for, imo, significantly lower output quality.

  2. Running DeepSeek-V4-Flash on a Raspberry Pi: I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe-attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…

  3. I needed a smarter model for my local Hermes Agent setup, so I moved to DeepSeek V4 Flash. First things first: running 4 concurrent threads on vLLM, I can hit ~400 tok/s. 400 x 60 x 60 x 24 x 30 is ~1B tokens per month!

  4. DStudio: a native, local-first desktop app for DeepSeek V4 with chat, a coding agent and a design studio, all running on your Mac. Nothing leaves the device.

  5. Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…

  6. So hear me out. Not only is MiMo v2.5 better than DeepSeek V4 Flash on almost all benchmarks, its pricing is better too.

  7. DeepSeek V4's math score equals GPT-5.5's (91), and it trails by just 4-6 points in other categories, at 97% lower cost. Is the AI quality as good as GPT?

  8. DeepSeek V4 Pro takes this matchup 38.0 to 33.0, and the margin feels earned. Across the scored tasks, the pattern is simple: Model A was tighter, more literal, and more reliable under constraints, while Model B was good but a little too w…

  9. Hi, I'm a little confused. I was doing a code review of one of my repositories, mainly just testing out different models to see what came back.

  10. I am thinking of buying the 20-dollar Pro plan. Does this approach make sense?

  11. I had briefly used the ChatGPT Pro feature in the chat app. It was quite amazing.

  12. In case you're not aware already, the DeepSeek V4 series is finally getting supported on llama.cpp with this PR! The PR is at a very early stage right now, so only try it if you're consciously willing to experiment out of curiosity and acc…

  14. DeepSWE: DeepSWE is a benchmark for measuring frontier coding agents on original, long-horizon software engineering tasks drawn from active open-source repositories. The benchmark includes 113 tasks across TypeScript, Go, Python, JavaScript…

  15. I didn’t create this recipe, you guys did, but I was finally able to find it and get DeepSeek V4 Flash working with 200k context on 2 nodes. Sharing this since I couldn’t find a confirmed end-to-end recipe for the official DeepSeek-V4-Flash…

  16. Bringing up DeepSeek-V4-Flash on AMD MI300X: At Doubleword we are building an inference cloud designed for volume. To do that we have to reckon with the enveloping compute shortage.

  18. Remember how Meta monitored employee activity closely for a few months, and then had a bunch of layoffs related to AI efficiency? (oh right that was like 3 days ago).

  19. DeepSeek’s announcement over the weekend that it has made its 75% price cut permanent on its flagship V4 Pro model is a disruptive assault on the capital-heavy business models of Silicon Valley’s frontier labs. The reduction on DeepSeek V4…

  20. Hey HN, we believe we have the easiest onboarding from signup to being able to spin up coding agents in Slack, like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…

  21. Hi all, we just open-sourced a project called Orbit, which can RL post-train trillion-scale LLMs like DeepSeek V4. We found it pretty cool!

  22. My apologies if anything does not make sense. I literally don't know what I am doing; I'm not a programmer, just a simple vibe coder with a Claude subscription. That said, if you have 200GB of system RAM + VRAM and want to run DeepSeek V4 Flash…

  23. Trying to figure out the right box for my team and wanted to see if anyone had any clue which would be a better fit or if it is not worth our time in our budget. Situation: 5 of us doing agentic coding (lots of long context getting re-sent…

  24. Best I've tried so far is https://huggingface.co/nsparks/DeepSeek-V4-Flash-FP4-FP8-GGUF with the custom llama.cpp fork, but it suffers from low quality and random incoherent output. vLLM wouldn't support anything other than H100s for DS4.
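
A quick sanity check on the throughput arithmetic in item 3. This is only a sketch under stated assumptions: the ~400 tok/s figure is taken at face value as a sustained aggregate rate, the month is 30 days, and there is no downtime. The function name `tokens_per_month` is made up for illustration.

```python
def tokens_per_month(tok_per_s: float, days: int = 30) -> int:
    """Tokens generated at a constant rate over `days` full days (assumes zero downtime)."""
    seconds = 60 * 60 * 24 * days  # seconds in the period
    return int(tok_per_s * seconds)


total = tokens_per_month(400)  # 400 tok/s is the figure quoted in item 3
print(f"{total:,} tokens (~{total / 1e9:.2f}B)")
# prints: 1,036,800,000 tokens (~1.04B)
```

So the "~1B tokens per month" claim holds as an upper bound, provided the rate really is sustained around the clock.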
