Qwen 3.5 122B A10B running at 50 tok/s on DGX Spark / Asus Ascent

reddit-localllama · www.reddit.com · 5 pts · 10 replies · 3d

Hello guys, wanted to share this: https://github.com/albond/DGX_Spark_Qwen3.5-122B-A10B-AR-INT4 I am running it on my DGX Spark (Int4 V2) with the max context window and getting 50 tok/s with Multi-Token Prediction. It's working great for tool…

qwen
