model roundup

Qwen 3

3 items · started 2026-05-22 · closed 2026-05-25

I ran a quantization shootout on Qwen3-Coder and the results are... interesting (www.reddit.com)

+136 4w llama

Out of random curiosity, I ran a shootout on Qwen3-Coder-Next. I've been using the MXFP4_MOE from Unsloth for a while, as it's just really fast on my system.

Seeking resources to read about llama.cpp server and how offloading works (www.reddit.com)

+17 4w llama

SETUP INFO: AMD R9700 AI PRO. Using llama.cpp server, ROCm Docker version.

New Release of ROCm based MLX LLM Engine - lemon-mlx-engine (www.reddit.com)

+148 5w moe

Hey everyone, lemon-mlx-engine just finished integrating TheRock / ROCm 7.13, which means you get to try the latest ROCm on your local hardware with the MLX engine! This also includes various bug fixes and kernel fi…