Tuning CPU-only Qwen3-30B inference with an IBM Quantum sampling loop (github.com via hn)
model roundup
Qwen 3
-
Qwen Air QPU/MCP Lab: Quantum-enhanced autoresearch for high-performance, CPU-only Mixture-of-Experts LLM inference on legacy hardware. This repository contains the benchmark harness, MCP-style tool boundary, experiment logs, paper draft, a…
-
I kept running local models on my own hardware. They'd say something dumb, I'd sit there going "no, that's not what I meant", I'd close the chat, and the model never learned. So I built the correction loop into a desktop app.
-
My pipeline for the best speech to transcript results (www.reddit.com)
I hoped the new ASR (automatic speech recognition) models would give me accurate output, but I was disappointed, especially when the input was multilingual and noisy (all my use cases). I had to put significant effort into audio pre/post…
-
Out of random curiosity I ran a shootout on Qwen3-Coder-Next. I've been using the MXFP4_MOE from unsloth for a while as it's just really fast on my system.
-
SETUP INFO: AMD R9700 AI PRO. Using llama-cpp server, ROCm Docker version.
-
New Release of ROCm-based MLX LLM Engine - lemon-mlx-engine (www.reddit.com)
Hey everyone, we just finished integrating TheRock / ROCm 7.13 into lemon-mlx-engine, which means you get to try the latest ROCm on your local hardware with the MLX engine! This also includes various bug fixes and kernel fi…
-
Built a personal Jarvis-style AI using MCP and open models (www.reddit.com)
Still very much a work in progress, but I finally built a personal Jarvis-style AI using MCP and open models. It currently supports memory, autonomous file editing, visible tool-call tracing, confirmation before dangerous actions, persistent c…