I have 4x 128 GB VRAM now, what should I do? (www.reddit.com via reddit)
model roundup
Qwen 3.5
-
https://preview.redd.it/z66h627yi96h1.png?width=1080&format=png&auto=webp&s=94040bb76c0f8099b58927771c2193dd6a5019da qwen3.5 9b at 0 bit quant>>>>>>>copilot
-
I took the liberty of testing both models today on my favorite benchmark question, head to head. Device: Apple Mac M3 Max 64GB. Environment: llama.cpp, all defaults. Gemma4-12B's token generation speed: 47 tps with MTP and 2 predicted tokens 29…
-
Nex N2 has a funny "few words do trick" reasoning (www.reddit.com via reddit)
-
Preferred two LLM combo (www.reddit.com via reddit)
I’m using my MacBook Pro M1 Pro with 32GB to run Qwen3.5-35B in Q4 as my coding agent. I have a gaming PC with a 5070 Ti that I’m currently not using but would like to.
-
Dense vs MoE quantization resilience (www.reddit.com via reddit)
Which one is more resilient to quantization? Especially at 4-bit?
-
Running Hermes fully local (www.reddit.com via reddit)
Before Hermes was announced, I was working on my own fully local, personal agentic system. Now, I'm a novice when it comes to coding.
-
I can't wait for all the x250 sample distills of Mythos and GPT-5.6 (www.reddit.com via reddit)
Just kidding. Are there any distills that actually improve a model's quality?
-
It felt good to return my Asus Spark (www.reddit.com via reddit)
It's an incredible little package, but too expensive for the performance, and I simply didn't want to be part of the great "Superchip lie": it could be super, but it's super ruined by its limited memory bandwidth even thoug…
-
Launch HN: General Instinct (YC P26) – Frontier models on edge devices (news.ycombinator.com)
Hey HN, Guanming and Bill here from General Instinct (https://general-instinct.com/). After years of working in robotics, we kept running into the same problem: the best models never fit the hardware we actually had available.
-
Show HN: Hitoku Draft – Context aware local assistant (hitoku.me via hn)
Hi guys. I have been working on Hitoku Draft, an open-source, voice-first AI assistant that runs entirely locally.