#deepseek
23 items
Why has ChatGPT become so annoying and disagreeable? (www.reddit.com via reddit) DeepSeek Updated their repo DeepGEMM testing Mega MoE (www.reddit.com via reddit) https://github.com/deepseek-ai/DeepGEMM/pull/304 https://preview.redd.it/vcmqwmvzijvg1.png?width=1014&format=png&auto=webp&s=76b1739925f0699b0763aa7814614dd40329c41e https://github.com/deepseek-ai/DeepGEMM/commit/a050d09461e86eb6bba35a8c74…
Guys we have to change the pelican test (www.reddit.com via reddit) So i have been seeing more of those pelican on a bike svg tests and while they work i feel like (and maybe you guys do too) they are getting kinda benchmaxxed so we should switch things up soon and this is my idea generate me a html svg of…
We benchmarked TranslateGemma-12b against 5 frontier LLMs on subtitle translation - it won across the board, with one significant catch (www.reddit.com via reddit) [P] Built GPT-2, Llama 3, and DeepSeek from scratch in PyTorch - open source code + book (www.reddit.com via reddit) I wrote a book that implements modern LLM architectures from scratch. The part most relevant to this sub: Chapter 3 takes GPT-2 and swaps exactly 4 things to get Llama 3.2-3B: LayerNorm → RMSNorm Learned positional encodings → RoPE GELU →…
Single question llm comparison (www.reddit.com via reddit) Best LLM for logic/ spatial reasoning on small context inputs? (www.reddit.com via reddit) My system has 32gb RAM and 8gb VRAM. I tried out DeepSeek-R1-Distill-Qwen-7B-Q6_K_L.gguf and it was vastly inadequate for what I wanted so looking for other suggestions.
Show HN: A book that builds GPT-2, Llama 3, DeepSeek from scratch in PyTorch (news.ycombinator.com via hn) I'm a software engineer who works with LLMs professionally (Forward Deployed Engineer at TrueFoundry). Over the past year I built up implementations of five LLM architectures from scratch and wrote a book around them.
Claude down? TokenMonopoly will help you find the best deals in AI subs (tokenmonopoly.com via hn) TokenMonopoly Live leaderboard of AI API deals — pricing, subscriptions, and SWE-bench scores for Claude, GPT, Gemini, Kimi, DeepSeek, Llama and more. Compare 27 benchmarked models across 96 hosts by price-per-performance, refreshed daily.
Is my 'Retry Tax' math correct for DeepSeek V3/V4 agents? (Project Feedback) (www.reddit.com via reddit) Feedback on iOS app with local AI models (www.reddit.com via reddit) Hey everyone, I just shipped an iOS app that runs local AI models. Current has 12 models: Gemma 4, Llama 3.3, Qwen3, DeepSeek R1 Distill, Phi-4, etc.
For AI agents: is per‑token pricing killing your budget? Looking for feedback on time‑based subscriptions. (www.reddit.com via reddit) Hey r/AI_Agents, I run an inference service (cheapestinference.com) and we're exploring a different pricing model that might be more predictable for agent workloads. Instead of per‑token billing, we offer **dedicated 8‑hour time windows**…
I built an MCP server that gives Claude Code image/video generation, web search, and smart multi-model routing (www.reddit.com via reddit) Claude Code with Pro subscription + OpenRouter in parallel — what's the cleanest setup? (www.reddit.com via reddit) Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…
MINISFORUM AI X1 Pro-370 (96GB) - Local Ollama Help (www.reddit.com via reddit) Hey all. This just got delivered yesterday.
Deepseek-r1 thinks for 30 minutes? (www.reddit.com via reddit) I was trying to ask a question about coding using DeepSeek-R1-0528-Qwen3-8B-Q4_K_M, and the thinking took 30 minutes??? https://preview.redd.it/kex3fgg4lgvg1.png?width=277&format=png&auto=webp&s=5f7e7cdc8502b935ea8b8fb83e0e4af60c3c4533 I h…
DeepSeek V4 reportedly drops late April. 1M context, multimodal, Claude-level coding. (www.reddit.com via reddit) Leaks point to late April release. Key specs 1M token context window Native multimodal (image/video input) Projected ~85% SWE-Bench Verified (ties or beats Claude Opus 4.6) Base model remains free.
Running a full agentic coding loop locally on a 3090. Here's what actually works in 2026. (www.reddit.com via reddit) After months of testing, I finally have a local setup that doesn't make me want to go back to the API. Hardware: RTX 3090 (24GB VRAM) Models tested: Qwen2.5-Coder 32B Q4_K_M, DeepSeek-Coder-V3 Q4, Llama 3.3 70B Q3_K_M Inference: llama.cpp…
Looking for people with different hardware to help benchmark local LLM behavioral reliability (www.reddit.com via reddit) I've been working on measuring how LLMs actually behave (not what they know) across different hardware setups. Things like: does the model cave when you push back on a correct answer?
AI lied to me about a video game existing, so I sued it in the High Court of the Internet and got 2 settlement games (www.reddit.com via reddit) TL;DR: Claude hallucinated "Champions Career Mode." I threatened to sue Anthropic. Claude admitted guilt and built me a custom HTML5 game as settlement.
4 llm Groupchat (www.reddit.com via reddit) Why most open-source models can't answer this question while most closed-source models can answer most of the time? (www.reddit.com via reddit) Is 32GB Mac enough for engineering/coding, or stick to Claude? (www.reddit.com via reddit)