model roundup

Qwen 3.5

11 items · started 2026-06-04 · ongoing (last activity 2026-06-09)

  1. https://preview.redd.it/z66h627yi96h1.png?width=1080&format=png&auto=webp&s=94040bb76c0f8099b58927771c2193dd6a5019da qwen3.5 9b at 0 bit quant>>>>>>>copilot

  2. I took the liberty to test both models today on my favorite benchmark question, head to head. Device: Apple Mac M3 Max 64GB. Environment: llama.cpp, all defaults. Gemma4-12B's token generation speed: 47 tps with MTP and 2 predicted tokens 29…

  3. I’m using my MacBook Pro M1 Pro with 32GB to run Qwen3.5-35B in Q4 as my coding agent. I have a gaming PC with a 5070 Ti that I’m currently not using but would like to.

  4. Which one is more resilient to quantization? Especially at 4-bit?

  5. Before Hermes was announced, I was working on my own fully local, personal agentic system. Now, I'm a novice when it comes to coding.

  6. Just kidding. Are there any distills that actually improve a model's quality?

  7. It's an incredible little package but too high a price to pay for the performance, and I simply didn't want to be part of the great "Superchip lie" - it could be super, but it's super ruined by its limited memory bandwidth even thoug…

  8. Hey HN, Guanming and Bill here from General Instinct (https://general-instinct.com/). After years of working in robotics, we kept running into the same problem: the best models never fit the hardware we actually had available.

  9. Hi guys. I have been working on Hitoku Draft, an open-source, voice-first AI assistant that runs entirely locally.
