Qwen3.5 50% expert reduction success news.ycombinator.com
Local Coding Stacks www.reddit.com
Gemma4 26b & E4B are crazy good, and replaced Qwen for me! www.reddit.com
Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s www.reddit.com
What's your favorite small-medium local model? www.reddit.com
How much faster is Gemma 4 26B-A4B during inference vs 31B? www.reddit.com
DFlash is real: x2 tg on small context with oMLX www.reddit.com
Thinking issue [Qwen3.5] www.reddit.com
Summarizing text locally, medical literature www.reddit.com
I want to run qwen3.5 27B q4_k_m on CPU, and I need help. www.reddit.com
running models bigger than physical memory capacity www.reddit.com
GRaPE 2 Model Family www.reddit.com
Llama.cpp llama-server command recommendations? www.reddit.com
Qwen 122B is AMAZING but is my config right? (128GB M4 Max) www.reddit.com
Can an LLM make small changes to a software program? www.reddit.com
Comparing Qwen3.5 27B vs Gemma 4 31B for agentic stuff www.reddit.com
Been out of the loop - Will this work for EXO/MLX? www.reddit.com
Why don't Groq (with a q) and Cerebras add new models www.reddit.com
current: 1x 16GB 5060Ti. worth a 2nd for OpenCode? www.reddit.com
Which AI model is best for real data analysis? [benchmark] www.reddit.com
DGX spark www.reddit.com
LoRA training www.reddit.com
What is the best way to deploy LLM on 3x3090? www.reddit.com
Opinion on the best fit for my hardware www.reddit.com
Speed on M5 Pro 48GB www.reddit.com