DFlash speculative decoding on Apple Silicon: 4.1x on Qwen3.5-9B, now open source (MLX, M5 Max)

reddit-localllama · www.reddit.com · 83 pts · 30 replies · 3d
