Hey all. This just got delivered yesterday.
model
Qwen2.5-7B-Instruct
huggingface.co/Qwen/Qwen2.5-7B-Instruct
12,506,262 downloads · 1,204 likes · text-generation · transformers
from the model card
Qwen2.5-7B-Instruct

Introduction: Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON.
- More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots.
- Long-context support up to 128K tokens, with generation up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

This repo contains the instruction-tuned 7B Qwen2.5 model, which has the following features:

- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Number of Parameters: 7.61B
- Number of Parameters (Non-Embedding): 6.53B
- Number of Layers: 28
- Number of Attention Heads (GQA): 28 for Q and 4 for KV
- Context …
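The GQA figures in the card (28 query heads, 4 KV heads, 28 layers) are what keep the 128K context affordable. A quick back-of-envelope in Python, assuming a head dimension of 128 (not stated in the excerpt above, so treat it as an assumption):

```python
# KV-cache sizing for Qwen2.5-7B-Instruct from the card's figures:
# 28 layers, GQA with 28 query heads and 4 KV heads.
num_layers = 28
num_q_heads = 28
num_kv_heads = 4
head_dim = 128          # assumed; not stated in the card excerpt
bytes_per_value = 2     # fp16/bf16

# K and V each store num_kv_heads * head_dim values per layer per token.
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
print(kv_bytes_per_token)                    # 57344 bytes = 56 KiB per token

# Even at the full 128K context, the cache stays modest thanks to GQA:
context = 128 * 1024
print(kv_bytes_per_token * context / 2**30)  # 7.0 GiB

# With plain MHA (28 KV heads) it would be 7x larger:
mha_bytes_per_token = 2 * num_layers * num_q_heads * head_dim * bytes_per_value
print(mha_bytes_per_token * context / 2**30)  # 49.0 GiB
```

Under these assumptions, sharing 4 KV heads across 28 query heads cuts the cache from ~49 GiB to ~7 GiB at full context, which is a big part of why a 7B model can advertise 128K tokens at all.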
discussions
- Qwen 2.5 (11 ongoing since 2026-04-13)
recent items
- MINISFORUM AI X1 Pro-370 (96GB) - Local Ollama Help (www.reddit.com via reddit)
- m5 pro 64gb worth it for local agents or wait? (www.reddit.com via reddit) I am currently on an M3 MBP with 24GB RAM. For regular Python and Django work the machine is perfect and I have no need to upgrade for speed.
- question about downloading an AI locally (www.reddit.com via reddit) Hi, I currently have a device running TrueNAS Scale; my device has an i5 4570, 32GB DDR3, and several SSDs for the NAS, and I recently installed an RTX 3060 12GB with the goal of running a local AI, to call Claude Code or to …
- Lower inference speed of Gemma4 26B A4B on vLLM (www.reddit.com via reddit) For my earlier use case I used to host Qwen 2.5 VL 7B GPTQ Int4. Now I was looking to switch to Gemma4 26B A4B, as it would improve performance as well as latency, considering only 4B parameters are active.
- 24/7 Headless AI Server on Xiaomi 12 Pro (Guide & Benchmarks) Gemma4 VS Qwen2.5 (www.reddit.com via reddit) Here is the build guide for my setup. While it isn't a massive textbook, it provides enough detail to replicate the steps.
- Tested 6 browser use agents for real-world tasks — here's an honest breakdown + looking for recommendations (www.reddit.com via reddit)
- Laptop has AMD Radeon + RTX 3050 — Which GPU should I use and how do I force apps to use the RTX? (www.reddit.com via reddit)
- Hardware needed for Gemma 26B MoE vs Qwen 14B for ~100–300 users (vLLM, single node?) (www.reddit.com via reddit)
- Looking for a reliable browser use agent that handles most daily tasks. (www.reddit.com via reddit)
- What's the current best code autocomplete LLM for local deployment (as of April 2026)? (www.reddit.com via reddit)
- Is 32GB Mac enough for engineering/coding, or stick to Claude? (www.reddit.com via reddit)