model roundup

Qwen 3

7 items · started 2026-05-22 · closed 2026-06-01

Tuning CPU-only Qwen3-30B inference with an IBM Quantum sampling loop (github.com via hn)

+1 3w qwen mcp

Qwen Air QPU/MCP Lab: Quantum-enhanced autoresearch for high-performance, CPU-only Mixture-of-Experts LLM inference on legacy hardware. This repository contains the benchmark harness, MCP-style tool boundary, experiment logs, paper draft, a…

Turning every "no thats not what i meant" in chat into actual LoRA training data (www.reddit.com)

+31 4w jailbreak security

I kept running local models on my own hardware. They'd say something dumb, I'd sit there going "no thats not what i meant", I'd close the chat, and the model never learned. So I built the correction loop into a desktop app.

My pipeline for the best speech to transcript results (www.reddit.com)

+11 4w qwen

I hoped the new ASR (automatic speech recognition) models would give me accurate output, but I was disappointed, especially when the input was multilingual and noisy (all my use cases). I had to put significant effort into audio pre/post…

I ran a quantization shootout on Qwen3-Coder and the results are... interesting (www.reddit.com)

+136 4w llama

Out of random curiosity I ran a shootout on Qwen3-Coder-Next. I've been using the MXFP4_MOE from unsloth for a while, as it's just really fast on my system.

Seeking resources to read about llama.cpp server and how offloading works (www.reddit.com)

+17 4w llama

SETUP INFO: AMD R9700 AI PRO. Using llama.cpp server, ROCm Docker version.

New Release of ROCm based MLX LLM Engine - lemon-mlx-engine (www.reddit.com)

+148 5w moe

Hey everyone, lemon-mlx-engine just finished integrating TheRock / ROCm 7.13, which means you get to try the latest ROCm on your local hardware with the MLX engine! This also includes various bug fixes and kernel fi…

Built a personal Jarvis-style AI using MCP and open models (www.reddit.com)

5w operator mcp

Still very much a work in progress, but I finally built a personal Jarvis-style AI using MCP and open models. It currently supports memory, autonomous file editing, visible tool-call tracing, confirmation before dangerous actions, persistent c…
