Qwen3.5-35B running well on RTX 4060 Ti 16GB at 60 tok/s

reddit-localllama · www.reddit.com · 86 pts · 35 replies · 23h

Spent a bunch of time tuning llama.cpp on a Windows 11 box (i7-13700F, 64GB RAM) with an RTX 4060 Ti 16GB, trying to get unsloth Qwen3.5-35B-A3B-UD-Q4_K_L running well at 64k context. I finally got it into a pretty solid place, so I wanted to s…

moe · llama · mcp
