model
Qwen3.5-4B
huggingface.co/Qwen/Qwen3.5-4B
10,156,913 downloads · 608 likes · image-text-to-text · transformers
from the model card
Qwen3.5-4B
Note: This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.
Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance. Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.
Qwen3.5 Highlights
Qwen3.5 features the following enhancements:
- Unified Vision-Language Foundation: Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.
- Efficient Hybrid Architecture: Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.
- Scalable RL Generalization: Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability.
- Global Linguistic Coverage: Expanded support to 201 languages and dialects, enabling inclusive, worldwide deployment with nuance…
discussions
- Qwen 3.5 11 ongoing since 2026-06-04
recent items
I have 4x 128 GB VRAM now, what should I do. (www.reddit.com via reddit)
nice_meme (www.reddit.com via reddit) https://preview.redd.it/z66h627yi96h1.png?width=1080&format=png&auto=webp&s=94040bb76c0f8099b58927771c2193dd6a5019da qwen3.5 9b at 0 bit quant>>>>>>>copilot
[Opinion/Benchmark] Gemma4-12B's architecture change is too big of a tradeoff; A quick reasoning comparison between Gemma4-12B and Qwen 3.5-9B (www.reddit.com via reddit) I took the liberty to test both models today on my favorite benchmark question, head to head. Device: Apple Mac M3 Max 64GB Environment: llama.cpp, all defaults Gemma4-12B's token generation speed: 47 tps with MTP and 2 predicted tokens 29…
Nex N2 has a funny "few words do trick" reasoning (www.reddit.com via reddit)
Preferred two LLM combo (www.reddit.com via reddit) I’m using my MacBook Pro M1 Pro with 32GB to run Qwen3.5-35B in Q4 as my coding agent. I have a gaming PC with a 5070 Ti that I’m currently not using but would like to.
Running Hermes fully local (www.reddit.com via reddit) Before Hermes was announced, I was working on my own fully local, personal agentic system. Now, I'm a novice when it comes to coding.
Dense vs MoE quantization resilience (www.reddit.com via reddit) Which one is more resilient to quantization? Especially at 4-bit?
Launch HN: General Instinct (YC P26) – Frontier models on edge devices (news.ycombinator.com) Hey HN, Guanming and Bill here from General Instinct (https://general-instinct.com/). After years of working in robotics, we kept running into the same problem: the best models never fit the hardware we actually had available.
I can't wait for all the x250 sample distills of Mythos and GPT-5.6 (www.reddit.com via reddit) Just kidding. Are there any distills that actually improve a model's quality?
↯ Anthropic Mythos ↯ Gemma 4 ↯ Qwen 3.6 ↯ Qwen 3.5 · mythos · gpt-5 · gemma · +1
It felt good to return my Asus Spark (www.reddit.com via reddit) It's an incredible little package but too expensive of a price to pay for the performance and I simply didn't want to be part of the great "Superchip lie" - it could be super, but its super ruined by its limited memory bandwidth even thoug…
Show HN: Hitoku Draft – Context aware local assistant (hitoku.me via hn) Hi guys. I have been working on Hitoku Draft, an open-source, voice-first AI assistant that runs entirely locally.
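Several of the items above turn on whether a quantized model fits a given machine (for example, Qwen3.5-35B in Q4 on a 32GB MacBook Pro). As a rough sketch only: weight memory is approximately parameter count times bits per weight, divided by 8. The helper name `weight_memory_gb` is hypothetical, the parameter counts are the nominal sizes from the model names, and the estimate ignores KV cache, activations, and the per-block overhead that real Q4 formats add.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory in GB (1 GB = 10^9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits.
    Real quantization formats carry extra scale/zero-point data, so actual
    files are somewhat larger than this estimate.
    """
    # params_billion * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Nominal sizes taken from the model names mentioned in the feed.
    models = [("Qwen3.5-4B", 4), ("Qwen3.5-9B", 9), ("Qwen3.5-35B", 35)]
    for name, params in models:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights")
```

Under these assumptions a 35B model at 4 bits needs roughly 17.5 GB for weights alone, which is consistent with it running on a 32GB machine with room left for context.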