model

GLM-5.1

84,784 downloads · 1,206 likes · text-generation · transformers

from the model card

GLM-5.1

👋 Join our WeChat or Discord community.
📖 Check out the GLM-5.1 blog and GLM-5 Technical report.
📍 Use GLM-5.1 API services on Z.ai API Platform.
🔜 GLM-5.1 will be available on chat.z.ai in the coming days.

[Paper] [GitHub]

Introduction

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

But the most meaningful leap goes beyond first-pass performance. Previous models, including GLM-5, tend to exhaust their repertoire early: they apply familiar techniques for quick initial gains, then plateau. Giving them more time doesn't help. GLM-5.1, by contrast, is built to stay effective on agentic tasks over much longer horizons. We've found that the model handles ambiguous problems with better judgment and stays productive over longer sessions. It breaks complex problems down, runs experiments, reads results, and identifies blockers with real precision. By revisiting its reasoning and revising its strategy through repeated iteration, GLM-5.1 sustains optimization over hundreds of rounds and thousands of tool calls. The longer it runs, the better the result.

Benchmark

| | GLM-5.1 | GLM-5 | Qwen3.6-Plus | Minimax M2.7 | DeepSee…

discussions

GLM 5.1 4 2026-05-15 – 2026-05-21
GLM 5.1 7 2026-05-01 – 2026-05-09
GLM 5.1 6 2026-04-25 – 2026-05-04
GLM 5.1 11 2026-04-12 – 2026-04-25

recent items

Has anybody been able to achieve reliable agentic performance with cheap/open source models? (www.reddit.com) +11 5w

Basically the title. Recently I've been trying various open source and comparatively cheaper models like minimax m2.7, qwen models and glm5.1 in Pi agent from openrouter, and the performance on coding tasks has been moderately adequate at b…

↯ Minimax ↯ GLM 5.1 minimax qwen agentic+1
How to Find Open-Source Models / Providers that Do not Train on Data (www.reddit.com) 5 5w

A lot of people are saying just use X, just do Y, just run Z locally, but the best models cannot be run locally (GLM 5.1). No one ever talks about privacy, but for those concerned about privacy, how do we know when we use Z AI's GLM 5.1 th…

↯ Glm ↯ GLM 5.1 glm
Show HN: Chuddy, self-hosted media downloading, translation and OCR Telegram bot (github.com via hn) +2 5w

About 60% of the codebase of my latest project was written with Z.ai's GLM-5.1 model. It's basically a Telegram bot that makes embedding and downloading media easier within group chats.

↯ Glm ↯ GLM 5.1 glm
Show HN: Grunden – Frontier AI inference hosted in Sweden, OpenAI-compatible (grunden.ai via hn) +31 6w

grunden.ai is a Swedish AI service for developers, government agencies, and ordinary people. GLM 5.1 (open-weight) with EU jurisdiction, an OpenAI-compatible API, and pricing in Swedish kronor.

↯ Glm ↯ GLM 5.1 glm openai
Mac Studio local loadout - May 2026 (www.reddit.com) 2 7w

Day-to-day user vibes, not rigorous benchmarks, so YMMV. GLM 5.1 has by far been my biggest winner in the last batch of releases.

↯ Glm ↯ GLM 5.1 glm claude-code
GLM-5.1 smol-IQ2_KS at 2.3t/s or GLM-4.7 UD-Q3_K_XL at 4.42t/s, which is "better" for chats (no coding)? (www.reddit.com) 13 7w

I wonder which one is better. I tested it a little bit (too slow, of course) and I'm still unsure. Does the GLM-5.1 smol-IQ2_KS lose too much?

↯ Glm ↯ GLM 5.1 glm
Group Buys for Shared Compute or Model Hosting? Is this a thing? (www.reddit.com) +1 7w

I've been using GLM 5.1 a lot lately, and I love this model. However I don't love sending all my requests to China.

↯ Glm ↯ GLM 5.1 glm gemini
PP speed on dual RTX 6000 12c EPYC setup (www.reddit.com) +16 7w

I want to run big models like GLM 5.1 or Kimi k2.6. I can buy Mac Studio M3 Ultra with 512gb ram, but PP speed would be ofc bad.

↯ Glm ↯ GLM 5.1 glm
Built a self-hosted agent for small businesses that writes its own skills. ~$0.15 per customer booking on GLM-5.1 (www.reddit.com) +14 7w

Been working on this for a while and finally at a point where it's running in production for a couple of small businesses, so figured I'd share. The thing that kept bugging me about "AI employee" products is that none of them are something…

↯ Glm ↯ GLM 5.1 glm
Ask HN: Are there any good open-source chat apps? (news.ycombinator.com) +2 8w

Hi HN family! I've recently been messing around with open models through ollama (glm-5.1 and kimi-k2.6), and I've been impressed with just how close they are to Claude Sonnet for my needs, especially programming.

↯ Glm ↯ GLM 5.1 glm ollama sonnet
GLM 5.1 Locally: 40tps, 2000+ pp/s (www.reddit.com) +78 8w

After some sglang patching and countless experiments, managed to get reap-ed nvfp4 version running stable and FAST on 4 x RTX 6000 Pros (limited to 350W). Very happy with performance and quality.

↯ Glm ↯ GLM 5.1 glm sonnet claude-code
Capacity vs Speed trade-off: 1.1TB Mac Unified Memory vs. RTX 6000 Pros (www.reddit.com) +38 9w

I'm usually a Windows person, but I’m currently running a Mac cluster for local LLM orchestration. My setup consists of four 256GB Mac Studios plus one 96GB Mac Studio, giving me about 1.1TB of unified memory.

↯ Glm ↯ GLM 5.1 glm
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth (www.reddit.com) +418 9w

↯ Glm ↯ GLM 5.1 glm moe llama
Best app to use Nvidia Nim? (www.reddit.com) +1 9w

↯ Glm ↯ GLM 5.1 glm
What's the best GPU cluster/configuration 30k $ can buy? (www.reddit.com) +344 9w

Edit: The consensus I’m getting is that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal?

↯ Glm ↯ GLM 5.1 glm
FREE Claude Code alternative using GLM 5.1 + VS Code (tutorial) (www.reddit.com) 8 10w

https://youtu.be/tL3cOdgukt8

↯ Glm ↯ GLM 5.1 glm claude-code
Claude Code with Pro subscription + OpenRouter in parallel — what's the cleanest setup? (www.reddit.com) 3 10w

Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…

↯ Glm ↯ GLM 5.1 glm deepseek sonnet+2
Local GLM 5.1 - Parkour! (www.reddit.com) +62 10w

Some more 'sloptuber' content for those who are enjoying it :) Model: unsloth glm 5.1 @ IQ2_XXS UD Prompt 1: Task: in a single web page, build a city based parkour game. wsad controls, moving player aligned with current camera direction.

↯ Glm ↯ GLM 5.1 glm
Model API Performance (news.ycombinator.com) +1 10w

We’ve been benchmarking a few models on our API platform and got some interesting performance numbers: - MiniMax M2.5 → 0.118s time-to-first-token, 103 tokens/sec - GLM 5.1 → 120 tokens/sec throughput - Kimi K2.5 → 0.643s TTFT, 69 tokens/s…

↯ Glm ↯ Minimax ↯ GLM 5.1 minimax glm
I got better results when I made each AI tool do one job (www.reddit.com) +32 10w

I spent too much time trying to find one AI dev tool that could do everything. Planning, coding, fixing, reviewing, maybe filing my taxes too. It never really worked.

↯ Glm ↯ Minimax ↯ GLM 5.1 minimax glm sonnet+4
What Am I Doing Wrong? Models Won't Listen, At All (GLM 5.1, MiniMax M2.7, Kimi K2.5) (www.reddit.com) +114 10w

What am I doing wrong here? I can't get models to follow my instructions, pretty much at all.

↯ Glm ↯ Minimax ↯ GLM 5.1 minimax glm ollama
Ollama Cloud Pro ($20/mo) vs OpenAI Plus ($23/mo). Which gives more tokens ? (www.reddit.com) +64 10w

Hey everyone, I'm comparing these two plans side by side for running AI agents daily through OpenClaw (self-hosted AI agent platform): • Ollama Cloud Pro — $20/month • OpenAI Plus — €23/month (~$25) My setup: 3 agents running in parallel (…

↯ Glm ↯ GLM 5.1 glm ollama openclaw+1
