Tested DFlash speculative decoding on oMLX: results are mixed
I spent this evening benchmarking the new DFlash block-diffusion speculative decoding in oMLX v0.3.5-rc1 on my Mac Studio (M2 Max, 96 GB). I couldn't find much real-world data out there, so here's what I got.