How do you actually use Qwen3 72B Instruct locally? (www.reddit.com)
model roundup
Qwen 3
-
I just got Qwen3 72B Instruct running on a high RAM setup and I’m kinda confused about the proper way to use it. What’s the correct workflow for running it smoothly (like best quant, tools, or runtime)?
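The "best quant" part of the question comes down to simple arithmetic: weight memory is roughly parameters × bits-per-weight / 8, plus KV cache and runtime overhead on top. A minimal sketch, assuming typical effective bits-per-weight for common GGUF quants (the figures are rough community estimates, not official numbers):

```python
def gguf_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of the quantized weights in GB.

    params_billion * 1e9 params * (bits/8) bytes each = params_billion * bits/8 GB.
    """
    return params_billion * bits_per_weight / 8

# Assumed approximate effective bits-per-weight for common GGUF quant types
QUANTS = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

for name, bpw in QUANTS.items():
    gb = gguf_weights_gb(72, bpw)
    print(f"{name}: ~{gb:.0f} GB weights, plus KV cache and overhead")
```

By this estimate a Q4_K_M of a 72B model needs roughly 43 GB for weights alone, so on a 64 GB machine it leaves headroom for context, while Q8_0 (~77 GB) would not fit.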
-
Going through university right now, and we have massive 100-page PDFs/PPTs with so much fluff that they're annoying to go through. Until now I've been using ChatGPT for it, but realized that the output tokens are heavily limited, and it loses a L…
-
If you have low VRAM, Qwen 3 TTS is good. If you need something unique, go for Tada 3B, but it needs 28 GB of VRAM. If you want the best TTS right now and need commercial use allowed, go for MOSS TTS 8B: it's literally the best model out there. Litera…
-
LLM speed t/s (www.reddit.com)
-
7B showdown on 18GB (benchmark) (www.reddit.com)
Hey r/LocalLLaMA, I've been coding for a while but I'm new to the local AI space, and wanted to run some benchmarks on my 18GB M3 Pro. The theme of this one was "specialists vs generalists" at the 7-8B range: qwen2.5-coder:7b, deepseek-r1:7b, m…
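The tokens-per-second figure these benchmark threads compare can be measured with a small timing wrapper. A generic sketch, where the `generate` callable is a stand-in for whatever runtime is being benchmarked, not any specific tool's API:

```python
import time

def measure_tps(generate, prompt):
    """Time one generation call and return (token_count, seconds, tokens/sec).

    `generate` is any callable that takes a prompt and returns a sequence
    of tokens; swap in a call to your local runtime of choice.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens), elapsed, len(tokens) / elapsed

# Example with a dummy "generator" that just splits the prompt into words:
n_tok, secs, tps = measure_tps(lambda p: p.split(), "hello local llama world")
print(f"{n_tok} tokens in {secs:.4f}s -> {tps:.1f} t/s")
```

For real numbers, run the same prompt several times and discard the first pass, since the initial call usually includes model load and cache warm-up.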