event

Glm

133 items · started 2026-02-14 · ongoing (last activity 2026-06-17)

GLM-5.2 Beats Fable 5 on Reasoning – 24 Hours After the U.S. Export Ban (explainx.ai via hn)

+1 1h glm

GLM-5.2 by Zhipu AI tops BridgeBench reasoning 24 hours after the U.S. banned Fable 5.
GLM-5.2 is the new leading open weights model on Artificial Analysis (artificialanalysis.ai via hn)

+9426 3h glm

June 17, 2026. Z.ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index, scoring 51, and it sits on the Pare…
GLM-5.2: Built for Long-Horizon Tasks (huggingface.co)

4h glm

GLM-5.2: Built for Long-Horizon Tasks - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance…
GLM 5.2 ranks #2 in Code Arena: Frontend (twitter.com via hn)

+21 5h minimax glm opus

Exciting news: GLM-5.2 (Max) ranks #2 in Code Arena: Frontend, with +29pt over Claude Opus 4.7 (Thinking) and only behind Fable 5! GLM-5.2 is the best open model vs Kimi-K2.6 and Minimax-M3 by a large margin.
GLM 5.2 Performance Benchmarks (artificialanalysis.ai via hn)

+2 5h glm

GLM-5.2 (max) Intelligence, Performance & Price Analysis: GLM-5.2 (max) is amongst the leading models in intelligence, but particularly expensive when comparing to othe…
[AINews] GLM-5.2: the top Frontend Coding model in the world, IndexShare for Speculative Decoding (www.latent.space)

7h glm

We have a new top open model in the world!
GLM-5.2: Frontier Intelligence, Open Weights (twitter.com via hn)

+145 15h glm agentic

Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long-horizon capabilities with a 1M context window - Two levels of reasoning effort: GLM-5.2 (max) pushes the limits,…
Z.ai GLM 5.2 (huggingface.co via hn)

+41 18h glm

GLM-5.2 👋 Join our WeChat or Discord community. 📖 Check out the GLM-5.2 blog and GLM-5 Technical report.
Ask HN: Which cheap Chinese LLM are you using? (news.ycombinator.com)

+4 3d minimax glm deepseek

In the last one or two months, starting from DeepSeek V4 Pro, there are quite many low-price Chinese models coming out. Their performance looks more or less similar to me: Mimo V2.5 Pro, MiniMax M3, and the just released GLM 5.2, etc.
GLM-5.2 is now available with 1M-context support (twitter.com via hn)

+2 4d glm

Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans.
I thought Chinese censorship didn't affect me. I was wrong. (www.reddit.com via reddit)

6d glm

I was debugging some code and the LLM crashed out: The debug_log config defaults to "debug.json" and creates a FileHandler, which appends by default. That file is a log of everything that happened, never cleared.
Suitable replacement to grok fast 4.1 (www.reddit.com via reddit)

7d glm grok
Can you really replace paid models with a local model? (www.reddit.com via reddit)

7d minimax glm deepseek+2

Long time lurker, and I say this as someone who genuinely loves this community and runs many local models myself. I’ve been using LLMs since the early GPT and LLaMA days.
Claude Fable/Mythos 5 just came out, so it will take Deepseek or Z.ai or Xiaomi or Kimi 9-12 months to release a model just as good as Fable? (www.reddit.com via reddit)

7d minimax glm deepseek+2

It should be at least 7-8 months until we have an open Fable(not just as good as Fable in benchmarks, but actually as good as Fable), probably more like 9-12 months. By the time, an open Fable model comes out, Fable 6.5-7 will be way bette…
Would you pay for Chinese AI models if the quality was close enough? (www.reddit.com via reddit)

8d glm deepseek qwen

DeepSeek, Qwen, and GLM aren't necessarily winning every benchmark. But they don't need to.
GLM-5.1 and Kimi K2.6 THE CHEAPEST WAY TO RUN (www.reddit.com via reddit)

8d glm

Guys how to run it as cheap as possible to get at least 15-20 ts? Asking for a friend!
Dynamic Workflows With External Models and Max Plan? (www.reddit.com via reddit)

9d glm haiku deepseek+1

Has anyone figured out a way to mix max plan with models from other providers (like GLM or Deepseek) while using dynamic workflows? I suppose we could create a passthrough proxy and route sonnet and haiku to other models?
Show HN: LimitPing – Keep Claude Code and Codex rate-limit windows continuous (github.com via hn)

+1 9d glm codex claude-code

CCLimitPing (limitping): Keep your Claude Code, Codex, and GLM (Zhipu / Z.ai Coding Plan) rate-limit windows back-to-back. These providers bill on a 5-hour rolling window (plus a weekly cap), and the 5h window starts on your fi…
Z.ai, we need Air! GLM GGUF wen? (www.reddit.com via reddit)

10d glm gemma qwen+1

First we never saw an upgraded Air model after 4.5. Then GLM 4.7 Turbo was great, but quickly surpassed for coding.
Fuck, sucessfully ran minecraft server on GLM AI's Agent lol. (www.reddit.com via reddit)

10d glm

I just told it, make a minecraft server and let me play and it worked lol. I just asked "host a minecraft server so I can play" and it did host it, made me a dashboard and it's crazyyyyy lol. It is hosted in Hong Kong somewhere TwT
Show HN: Free open source coding models in Slack (www.runcord.com via hn)

+2 2w minimax glm gemma+5

Hey HN, We believe we have the easiest onboarding from signup to being able to spin up coding agents in slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…
Zai replaced the network architecture running GLM-5.1 inference and the gains are pretty wild (www.reddit.com)

+29832 2w glm

Been following the infrastructure side of AI more lately and stumbled on this from Zai. They upgraded the network architecture on a thousand-GPU cluster running GLM-5.1 coding inference from the standard ROFT setup to something they built…
do you use different models for different steps in your agent, or just one for everything? (www.reddit.com)

+512 2w glm grok

Our dev team flagged last week that xAI is retiring grok 4.1 fast. We weren't using it for anything critical but it made me ask something I'd never actually asked: how did we pick the models we're running?
DeepSeek's 10T USD grand strategy (twitter.com via hn)

+5 2w minimax glm deepseek

Have you ever wondered, how DeepSeek may make money, and lot of it? They didn't come up with competitive coding plans like GLM, MoonShot and MiniMax.
Noob here, curious about roughly how advanced of a video game a model like Qwen3.6 27b could create, if kept fully offline, and got unlimited attempts/revisions (maybe ~1 month project time limit). Like, could it make something equivalent to Pokemon Red? Doom? Doom II? What if using GLM 5.1? (www.reddit.com)

+125 2w glm

So, I got interested in local LLMs a few months ago, but, I don't have a background in coding, and I don't know how to code, and I am not good with computers or anything. So far I mainly just was having fun with comparing different local L…
Went to the monthly AI dev meetup (www.reddit.com)

21 2w glm llama codex+1

Usual crowd. Everyone's on Claude or Codex, nobody's really sure how any of it actually works, and that's fine, that's the vibe.
I ran GLM-5.1 on a 16GB RAM machine (github.com via hn)

+3 3w glm moe

🧠 MoE-on-a-Potato Running a 754-Billion Parameter LLM on a 16GB RAM Consumer PC "Saying it's impossible is not engineering. Saying we don't know how yet is science." MoE-on-a-Potato is an experimental project dedicated to testing the extre…
Is Qwen3.6 current king for local agentic use? (www.reddit.com)

+1122 3w glm moe agentic

I've been testing other models but it seems like nothing even comes close to Qwen3.6 35B A3B for agentic use. The worst I'd get is a loop sometimes, while Gemma4 produced broken tool calls occasionally and I couldn't even get GLM 4.7 Flash…
Is Composer 2.5 better than Glm 5.1 and DeepSeek v4 pro in real world tasks? (www.reddit.com)

+1 3w glm deepseek codex+1

I am new to Cursor and still testing the free version. Benchmark for Composer 2.5 indicates it is better than DeepSeek v4 and Glm 5.1.
I built a local GUI for the TradingAgents framework — works with Ollama (www.reddit.com)

+4 3w minimax glm ollama+4

A while back I came across TradingAgents, a really cool multi-agent LLM stock analysis framework where like a dozen "agen…
Some tests with qwen3.6 27b + 35b a3b about MTP vs ngram-mod (www.reddit.com)

14 3w glm

I will try to keep this short ;) I used GLM 5.1 to vibecode a vague prompt on my vibecoded react web app and have GLM 5.1 rank the plans made with each other and the one it made itself. Test strategy: - use starter prompt as always - add v…
Open weights GLM and Mimo are better than Gemini 3.5 flash according to arena (www.reddit.com)

+33 4w glm gemini

While we are weathering the gemini 3.5 flash hype, keep in mind that according to arena, GLM and Mimo are better. https://arena.ai/leaderboard/text/coding-no-style-control #7 GLM #9 Mimo #12 Gemini 3.5 Flash
The pacman benchmark: finally a viable local agentic coding agent with Qwen 3.6 27b (www.reddit.com)

+178 4w glm qwen chatgpt+2

One way I like to test new models, is by one-shoting (with a good prompt) a single webpage clone of the classic arcade game pacman. I usually do 3 attempts and keep the best one.
cdesktop — open-source Claude Code Desktop alternative, runs locally via npx, supports any provider (www.reddit.com)

+35 4w glm deepseek gemini+3

I built cdesktop with Claude Code — it's an open-source alternative to Anthropic's Claude Code Desktop, running locally on your machine via npx cdesktop. Free, Apache 2.0.
I expanded DystopiaBench to 42 models and 6 dystopia types. Claude is still the only one I'd trust with nuclear codes. (www.reddit.com)

+86 4w glm grok gpt-5+2

Since the last post I've added: Huxley module (Brave New World style behavioral conditioning); Baudrillard module (synthetic intimacy, trust collapse, simulation); 30 more models including Grok 4.3, GPT-5.5, Gemini 3.1 Pro, GLM-5.1; Multi-jud…
When configuring a third-party AI large model on the MacBook Claude Code desktop client, an error message appears. How can this be resolved? (www.reddit.com)

+12 4w glm claude-code

This is my GLM-4.6 model API configuration, and this error is really confusing me. I'm not sure which step went wrong.
OCR: what is the best way to extract data in JSON format from this old French book? (www.reddit.com)

10 4w glm

As some of you may have guessed, what we have here is an old Bible. I would like to extract the following information from the page: { verse: number, verse_content: string, comments: string[] } I've played around with PaddleOCR a bit; I co…
How to Find Open-Source Models / Providers that Do not Train on Data (www.reddit.com)

5 4w glm

A lot of people are saying just use X, just do Y, just run Z locally, but the best models cannot be run locally (GLM 5.1). No one ever talks about privacy, but for those concerned about privacy, how do we know when we use Z AI's GLM 5.1 th…
GPT 5.5 (Codex) leading the future prediction race (www.reddit.com)

+61 4w glm deepseek codex+2

Researchers from the Max Planck Institute recently released FutureSim, an environment in which agents are replayed a temporal slice of the web and are tasked with predicting real-world future events. In their environment, GPT 5.5 leads at…
I built a 24h TPS + Intelligence Index table for Ollama Cloud models (www.reddit.com)

4w glm ollama

I recently made ollamatps.com for my own model-selection workflow and thought it might be useful here too. It shows 39 Ollama cloud models sorted by average TPS over the last 24 hours, and I added the Artificial Analysis Intelligence Index…
Show HN: Chuddy, self-hosted media downloading, translation and OCR Telegram bot (github.com via hn)

+2 4w glm

My latest project, about 60% of the codebase was written with Z.ai's GLM-5.1 model. It's basically a Telegram bot that allows for embedding/downloading media easier within group chats.
Reliable Open Source LLM as a Service (www.reddit.com)

+12 4w glm qwen gemini+1

Has anyone figured out a provider whose open source models (Kimi, Qwen, GLM, etc.) can be used reliably in production? I have tested some well known providers and they all suffer from high latency and poor uptime rendering them mostly usel…
Multi-LLM AI trading agent harness (github.com via hn)

+1 4w glm deepseek gemini+2

1rok 1rok is a standalone harness for running portfolio-construction agents across OpenAI, Anthropic, Gemini, xAI, DeepSeek, GLM, and OpenRouter against the same financial tool surface. Agents query Alpaca, Yahoo Finance, FRED, and Tavily…
Vertex MaaS GLM-5 prompt cache telemetry seems inconsistent. Anyone else seeing this? (www.reddit.com)

+11 4w glm openai

I'm testing prompt-cache behavior for GLM models on Vertex AI MaaS and I'm seeing inconsistent telemetry. I reproduced it with a synthetic long prompt and repeated identical requests.
Open source battle: GLM vs Kimi vs MiMo vs DeepSeek (www.youtube.com via reddit)

+31 4w glm deepseek

What’s going on with GLM? Are they scamming or what? (www.reddit.com)

+21 4w glm claude-code

I have a GLM subscription that’s marketed as offering 3× higher usage than Claude Pro. I primarily use it through Claude Code CLI as a backup coding model.
Show HN: Grunden – Frontier AI inference hosted in Sweden, OpenAI-compatible (grunden.ai via hn)

+31 5w glm openai

grunden.ai is a Swedish AI service for developers, government agencies, and ordinary people. GLM 5.1 (open-weight) with EU jurisdiction, an OpenAI-compatible API, and pricing in Swedish kronor.
Tips for using Composer 2? New to Cursor (www.reddit.com)

+53 5w glm cursor claude-code

Hi. I'm new to using Cursor - coming from Claude Code, Antigravity and most recently GLM coding plan.
Which Chinese Model is best for planning and which is best for implementation? I'm currently using Opencode with an Openrouter API Key, mostly wanna decide between Kimi, GLM, DeepSeek, Qwen, Minimax and Mimo (www.reddit.com)

+11 5w minimax glm deepseek+1

Original plan was to use Kimi/GLM for planning and DeepSeek for implementation, but seeing a lot of love for MiMo and Minimax lately. Anyone running a planner + coder split on Opencode?
Best AI coding plan alternative to Claude and ChatGPT (news.ycombinator.com)

+43 5w minimax glm haiku+2

With the lowering usage limit in Claude, I am thinking of jumping ship to Chinese AI, since the benchmarks are already very close to Sonnet or Haiku 4.5, but for a fraction of the price. I am not worried about where is my data endin…
Chinese AI Coding Plan (www.reddit.com)

+25 5w minimax glm haiku+1

With the lowering usage limit in Claude, I am thinking of jumping ship to Chinese AI, since the benchmarks are already very close to Sonnet or Haiku 4.5, but for a fraction of the price. I am not worried about where is my data endin…
We built Irene — an AI agent platform that actually remembers you, builds its own tools , adapts and improve as you use it (www.reddit.com)

14 5w minimax glm ollama+2

Hey r/AI_Agents — we're launching Irene today, and I want to be straight about what it is, why we built it, and where it's going. What makes Irene different Affordable with massive token limits and the latest open-source models We have gen…
Mac Studio local loadout - May 2026 (www.reddit.com)

2 5w glm claude-code

Day-to-day user vibes, not rigorous benchmarks, so YMMV. GLM 5.1 has by far been my biggest winner in the last batch of releases.
GLM-5.1 smol-IQ2_KS at 2.3t/s or GLM-4.7 UD-Q3_K_XL at 4.42t/s, which is "better" for chats (no coding)? (www.reddit.com)

13 5w glm

I wonder which one is better. I tested it a little bit (too slow, of course) and I'm still unsure. Does the GLM-5.1 smol-IQ2_KS lose too much?
Which model has less restrictions now? (www.reddit.com)

+12 5w glm qwen opus

GPT and Opus block on certain requests. This didn't use to be the case 2 months ago: I made significant progress with Opus, then after a 2-week break a single prompt to continue the work resulted in a refusal.
Group Buys for Shared Compute or Model Hosting? Is this a thing? (www.reddit.com)

+1 6w glm gemini

I've been using GLM 5.1 a lot lately, and I love this model. However I don't love sending all my requests to China.
Ran K2.6 through a third-party coding benchmark: heres how the figures stand up (www.reddit.com)

+31 6w glm deepseek qwen+1

I have been following the akitaonrails coding benchmark which tests against a fixed rails + Rubyllm + docker task rather than vendor-reported evals. April 2026 update put K2.6 at 87 sitting in tier A (80+), ahead of Qwen 3.6 plus (71), Dee…
I plan to use a chinese AI model through API for coding through a harness, I'm a uni student so nothing prod related for now. should i go deepseek, minimax, kimi or glm? kinda confused (www.reddit.com)

+11 6w minimax glm deepseek+2

Just cancelled my claude subscription due to poor rate limits, gemini cli doesn't really excel in coding from my personal experience, and my local hardware isn't that powerful to run local AI models, and while codex is good, I wanna try so…
Best local model for MBP 48GB UM (www.reddit.com)

2 6w glm openclaw qwen

I have been toying with GLM 4.7 flash mlx a while ago using lmstudio. I had integrated it successfully with openclaw and it was kinda stable in tool calling.
Update to the LLM Debate Benchmark: GPT-5.5, Grok 4.3, DeepSeek V4 Pro, GLM-5.1, Kimi K2.6, Qwen 3.6 Max Preview, Xiaomi MiMo V2.5 Pro, Tencent Hy3 Preview, and Mistral Medium 3.5 High Reasoning added (www.reddit.com)

+4 6w mistral glm grok+4

The benchmark uses adversarial, multi-turn debates across 683 curated motions. Each model pair debates the same motion twice with sides swapped.
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents (arxiv.org via hn)

+385 6w glm agentic

We present GLM-5V-Turbo, a step toward native foundation models for multimodal agents. As foundation models are increasingly deployed in real environments, agentic capability depends not only on language reasoning, but also on the ability…
tested four newest open source Kimi K2.6 is the fastest, GLM 5.1 the fanciest, DeepSeek V4 is the most comprehensive, and Xiaomi MiMo is the slowest (www.reddit.com)

+21 6w glm moe deepseek+1

Architecture explains the gap: MiMo's MoE runs more active params per token than Kimi K2.6's optimized routing hence slowest. DeepSeek V4's 'comprehensive' edge is partly MLA: ~75% KV-cache compression makes it far better for long agentic…
PP speed on dual RTX 6000 12c EPYC setup (www.reddit.com)

+16 6w glm

I want to run big models like GLM 5.1 or Kimi k2.6. I can buy Mac Studio M3 Ultra with 512gb ram, but PP speed would be ofc bad.
Why is no open weight model inference provider hosting Mimo-v2.5 or Mimo-v2.5-pro? (www.reddit.com)

+23 6w glm deepseek

Literally no 3rd party API inference provider is hosting the mimo-2.5 series models from Xiaomi. They seem to be really good.
Running 7 autonomous AI agents for 14 days. Here's what actually happens when they need to find customers. (www.reddit.com)

5 6w glm gpt-5 deepseek+2

I set up 7 AI coding agents on a VPS with automated cron sessions (2-8 per day depending on the agent). Each uses a different model: Claude Sonnet, GPT-5.4, Gemini 2.5 Pro, DeepSeek V4 Pro, Kimi K2.6, MiMo V2.5 Pro, GLM-5.1.
Does running a model (like qwen3.6-27b) on vllm or transformers use less VRAM than llama.cpp? (www.reddit.com)

5 6w glm vllm qwen+1

I have been using llama.cpp to run some models recently. For example, I've been running GLM-4.7-Flash with this command .\llama-server.exe -hf unsloth/GLM-4.7-Flash-GGUF:Q6_K_XL --alias "GLM-4.7-Flash" --host 127.0.0.1 --port 10000 --ctx-s…
Anyone tried +- 100B models locally with foreign languages? (www.reddit.com)

+56 6w glm gemma qwen

I am quite curious as I tried Gemma 4 31B, Qwen 3.6 27B, GLM 4.7 30B and some others in my native language (czech). Gemma performs "best" and considering the fact its "just" 18GB model - it actually blows my mind how well it can respond in…
Local LLM Benchmark about Backend Generation by Function Calling (GLM vs Qwen vs DeepSeek) (www.reddit.com)

+1 6w function-calling glm gpt-5+3

Detailed Article: https://autobe.dev/articles/local-llm-benchmark-about-backend-generation.html Five months ago I posted the "Hardcore function calling benchmark in backend coding agent" thread here. As I wrote in that post, it was an unco…
Built a self-hosted agent for small businesses that writes its own skills. ~$0.15 per customer booking on GLM-5.1 (www.reddit.com)

+14 6w glm

Been working on this for a while and finally at a point where it's running in production for a couple of small businesses, so figured I'd share. The thing that kept bugging me about "AI employee" products is that none of them are something…
Your local LLM predictions and hopes for May 2026 (www.reddit.com)

+620 6w mistral minimax glm+2

Which of these do you think we'll get in May? Also, feel free to pick/rank which ones you'd want the most badly: more Gemma4 models (124b?) (other sizes?) more Qwen3.6 models (9b?
Who else thinks AI is reaching a plateau (www.reddit.com)

+213 6w glm mythos opus+1

I must say that I almost feel no difference in all of the latest models that are coming out. Opus 4.7 is almost equal to 4.6 and 4.5, same about the other GPT models, the Kimi K models and the GLM models they all I feel they’re almost all…
Should I replace stored models? (www.reddit.com)

7 6w glm deepseek qwen

Hello everyone, the question is easy, with the new models of deepseek, kimi, GLM and qwen, should you replace the old models with the new version? Do I lose some quality, information or performance in the process?
Received a message from Z.AI about occasional garbled outputs and unexpected behavior (www.reddit.com)

+12 6w glm

I received this mail: "Hi developers, Some of you flagged occasional garbled outputs and unexpected behavior when building with the GLM-5 series, especially under heavy workloads. We heard you, reproduced the issues, and the fixes are now…
Did anyone of you already make the "doomsday" or "offgrid" knowledge based? (ofc powered with LLM) (www.reddit.com)

8 6w glm gemma qwen

Basically, I’m really into the idea of a fully offline setup. (Another way to say it: I’m a data hoarder.) For LLMs, I’m using uncensored models from both Western (Gemma, GPT-OSS) and Eastern ones (GLM 4.7 Flash, Qwen 35B).
Comparing SVG Generation for the top open models (codeinput.com via reddit)

+1 6w minimax glm gemma+2

Some of the larger models (like Llama) weren't available on OpenRouter, so I had to work with what was there. Best small model: Gemma 4 26B For its size, I think it had the best output.
Scaling Pain of Coding Agent Serving: Lessons from Debugging GLM-5 at Scale (z.ai via hn)

+51 6w glm

Our belief in Scaling Laws has not only driven continuous breakthroughs in model parameters and data scale, but has also pushed infrastructure engineering toward its limits. This process inevitably comes with growing pains, which we refer…
Ask HN: Are there any good open-source chat apps? (news.ycombinator.com)

+2 6w glm ollama sonnet

Hi HN family! I've recently been messing around with open models through ollama (glm-5.1 and kimi-k2.6), and I've been impressed with just how close they are to Claude Sonnet for my needs, especially programming.
3 of TIME's top 10 AI companies are Chinese and I only knew one by name (www.reddit.com)

+24 7w glm gemini openai+1

I code for a living, close to 7 years now, and I read way too much tech news. TIME dropped their 2026 most influential AI companies list and going through it I see OpenAI, Anthropic, Google, Meta, Amazon, then Zhipu AI sitting right there…
Open Source Company Coding Plans (www.reddit.com)

+23 7w glm qwen

I’ve been looking to buy a coding plan from one of the major open source contributors to give my meager support to them and transition away from Claude. I would love to hear some feedback from the community of their experience with some of…
I'm Not a Dev But I Use Qwen 3.6 35b to Code (www.reddit.com)

+247 7w glm qwen

Full disclosure: I used to program a bit, but I was garbage at it so I found a new career. This was eons ago so I'm not a dev, obviously.
Abliterlitics: Benchmarks and Tensor Comparison for Heretic, Abliterlix, Huiui, HauhauCS for GLM 4.7 Flash (www.reddit.com)

+193 7w mixture-of-experts glm qwen

This is a follow up to the previous benchmark and tensor analysis of abliteration techniques across the Qwen model family. Same approach, same toolkit, new model family.
Qwen 3.6 27b S2 Opus + GLM + Kimi (huggingface.co via reddit)

1 7w glm qwen opus

My first time releasing a fine-tune publicly! If anyone wants to independently eval against base, that’d be awesome.
Just got a beast. (www.reddit.com)

+316 7w glm

1.5 tb ram with 128gb vram and a 28 core processor. Mac Pro 2019.
Best value in the 20$ range coding agents? I want the best quality and high-usage-limit I can get at that price. (www.reddit.com)

+11 7w windsurf glm copilot+3

I'm a compsci student and I've been using the 10$ copilot plan for about 2 years now, and it was fine for me since I did a good model distribution taking into account the complexity of the task, I was able to get through the month always u…
Used a Claude Code skill to fine-tune Qwen3-1.7B from 327 noisy traces, matches GLM-5 (www.reddit.com)

+5 7w glm claude-code

Had 327 production traces from a restaurant-reservation agent I wanted to retrain. The plan was to fine-tune a smaller self-hostable model so I could ditch the frontier-API bill.
How will you scale these models (www.reddit.com)

5 7w glm deepseek qwen

How will you scale these models, coding and overall: Deepseek v4 pro, Kimi k2.6, Mimo v2.5 pro, Glm 5.1, Qwen 3.6 plus
Anthropic's Claude remote uses GLM-4.7 (www.reddit.com)

4 7w glm anthropic claude-code

I just noticed this after a bug wasn't getting fixed. If you start a Claude Code remote environment, the default model (hidden on mobile) is glm 4.7. I assumed anthropic only used their own models for everything so it was interesting to me t…
anyone actually tried deepseek v4 pro for coding? (www.reddit.com)

+12 7w glm deepseek

so v4 pro dropped and barely anyone is talking about it. feels weird since when kimi k2.6 came out i seen post about it everywhere anyone here tried v4 pro for actual code work?
GLM 5.1 Locally: 40tps, 2000+ pp/s (www.reddit.com)

+78 7w glm sonnet claude-code

After some sglang patching and countless experiments, managed to get reap-ed nvfp4 version running stable and FAST on 4 x RTX 6000 Pros (limited to 350W). Very happy with performance and quality.
QClaw-4B — a 4B agent model fine-tuned for tool use and agentic workflows (www.reddit.com)

3 7w tool-use glm openclaw+1

QClaw-4B is a 4-billion parameter language model fine-tuned for agentic tasks and tool use, designed for use with OpenClaw-compatible agent frameworks. Despite its compact size, QClaw-4B achieves state-of-the-art results in the 4B class, m…
I'm glad we have deepseek (www.reddit.com)

+17532 7w minimax glm gemma+2

other companies are slowly going away from open weight, not releasing base models, delaying open weight distribution, not releasing top models (this one I think is fair, but still), and I also noticed they stopped publishing research (old…
Capacity vs Speed trade-off: 1.1TB Mac Unified Memory vs. RTX 6000 Pros (www.reddit.com)

+38 7w glm

I'm usually a Windows person, but I’m currently running a Mac cluster for local LLM orchestration. My setup consists of four 256GB Mac Studios plus one 96GB Mac Studio, giving me about 1.1TB of unified memory.
Recent Open models from last 6 Months - Nov 2025 - Apr 2026 (www.reddit.com)

+11628 8w mistral glm gemma+1

I created this chart with recent open models from last 6 months. Few might be older than that possibly.
Qwen 3.5 397b and GLM 5.1 Opus fine tune (www.reddit.com)

+12 8w glm qwen opus

Hi all. Many models on hugging face have been fine tuned with that 3000x opus dataset, but the two I mentioned in the title are missing it.
Current state of open-source ? (www.reddit.com)

+416 8w mistral minimax glm+2

I’m trying to understand the current open-source LLM landscape beyond surface-level hype. We all got used to the nerfed products of Claude/Gemini, so I really believe in open source as a solution.
Tested how OpenCode Works with SelfHosted LLMS: Qwen 3.5, 3.6, Gemma 4, Nemotron 3, GLM-4.7 Flash - v2 (www.reddit.com)

+1732 8w glm gemma qwen

I have run two tests on each LLM with OpenCode to check their basic readiness and convenience: - Create IndexNow CLI in Golang (Easy Task) and - Create Migration Map for a website following SiteStructure Strategy. (Complex Task) Tested Qwe…
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth (www.reddit.com)

+418 8w glm moe llama
(Interactive)OpenCode Racing Game Comparison Qwen3.6 35B vs Qwen3.5 122B vs Qwen3.5 27B vs Qwen3.5 4B vs Gemma 4 31B vs Gemma 4 26B vs Qwen3 Coder Next vs GLM 4.7 Flash (www.reddit.com)

+6629 8w glm gemma mcp
2x 512gb ram M3 Ultra mac studios (www.reddit.com)

+346106 8w glm deepseek
Best app to use Nvidia Nim? (www.reddit.com)

+1 8w glm
Comparing GPT-5.4, Opus 4.6, GLM-5.1, Kimi K2.5, MiMo V2 Pro and MiniMax M2.7 (www.codejam.info via hn)

+62 8w minimax glm gpt-5+1
Best open source LLM for planning ? (www.reddit.com)

4 8w glm sonnet opus
The quality of GPT-5.4 is infuriatingly POOR (www.reddit.com)

2 8w glm gpt-5 codex

I got a Codex membership when GPT-5.4 launched and was getting by well enough for a while. Then I started using Claude and GLM 5.1, and my production quality improved significantly.
Show HN: RepoGauge – save token costs and compare agents on your own repos (repogauge.org via hn)

+1 8w glm sonnet opus

I've grown increasingly skeptical that public coding benchmarks tell me much about which model is actually worth paying for and worried that as demand continues to spike model providers will silently drop performance. I did a few manual an…
What's the best GPU cluster/configuration 30k $ can buy? (www.reddit.com)

+344 8w glm

Edit: I’m getting the consensus is that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal?
FREE Claude Code alternative using GLM 5.1 + VS Code (tutorial) (www.reddit.com)

8 8w glm claude-code

https://youtu.be/tL3cOdgukt8
What’s your LLM routing strategy for personal agents? (www.reddit.com)

8w mistral minimax glm+3

TL;DR I try to keep most traffic on very cheap models (Nano / GLM‑Flash / Qwen / MiniMax) and only escalate to stronger models for genuinely complex or reasoning‑heavy queries. I’m still actively testing this and tweaking it several times…
Minimax vs Qwen vs Kimi vs Mimo(Omni) vs Glm (via reddit)

+1 8w minimax glm qwen

Kimi K2.6-Code-Preview, Opus 4.7, GLM 5.1, Minimax M2.7 and more tested in coding (www.reddit.com)

+91 8w minimax glm opus

Hi everyone. It's been a while since I posted (was a lil burned out), but some of you may have seen my older SanityHarness posts.
Cursor 3 eating GLM 5.1 usage (www.reddit.com)

+21 8w glm cursor

Hello all just as it sounds. I recently started using GLM 5.1 in cursor 3 but unlike in the past, GLM 5.1 ran through my entire daily budget from summarizing chat context and running commands.
Claude Code with Pro subscription + OpenRouter in parallel — what's the cleanest setup? (www.reddit.com)

3 8w glm deepseek sonnet+2

Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…
Long context prompt help (www.reddit.com)

3 8w glm

Hi all, I'm running GLM 4.7 flash uncensored (Q8) on a 5090. I'm trying to get it to edit a short story (about 8.5k tokens, added via PDF) to add a scene.
Upgrade paths for my 256g ddr4 ram + 4x24g vram system (www.reddit.com)

+110 9w glm llama

So I was just about to give up playing with local models, until I realised I can actually run GLM 5.1 at not too horrible speeds, using this quant https://huggingface.co/ubergarm/GLM-5.1-GGUF/tree/main/IQ2_KL in ik llama. Getting around 6.…
Major drop in intelligence across most major models. (www.reddit.com)

+510319 9w glm grok sonnet+3

As of mid Apr 2026, I have noticed every model has had a major intelligence drop. And no I'm not talking about just ChatGPT.
Guys we have to change the pelican test (www.reddit.com)

+4864 9w minimax glm deepseek+3

So i have been seeing more of those pelican on a bike svg tests and while they work i feel like (and maybe you guys do too) they are getting kinda benchmaxxed so we should switch things up soon and this is my idea generate me a html svg of…
do GLM-4.7 Flash Q4_K_M have problem with claude or agent? (www.reddit.com)

+36 9w glm ollama

I'm brand new to local LLMs and started with GLM-4.7 Flash q4_K_M. When I run it directly: ollama run glm-4.7-flash:q4_K_M it works pretty decently — nothing amazing, but usable and responsive.
ZAI might stop open-weighting their models? (www.reddit.com)

+3448 9w glm openai anthropic

Ever since the company went public, they’ve been making a lot of changes that clearly seem to be prioritizing profit without regard to their customers. For example, with their coding plans: - They promised/advertised that the Lite coding p…
Local GLM 5.1 - Parkour! (www.reddit.com)

+62 9w glm

Some more 'sloptuber' content for those who are enjoying it :) Model: unsloth glm 5.1 @ IQ2_XXS UD Prompt 1: Task: in a single web page, build a city based parkour game. wsad controls, moving player aligned with current camera direction.
Running gpt and glm-5.1 side by side. Honestly can’t tell the difference (www.reddit.com)

+2418 9w swe-bench glm gpt-5+1

So I have been running gpt and glm-5.1 side by side lately and tbh the gap is way smaller than what im paying for. On SWE-Bench Pro glm-5.1 actually took the top spot globally, beat gpt-5.4 and opus 4.6. Overall coding score is like 55 vs g…
Which AI model is best for real data analysis? [benchmark] (www.reddit.com)

+1 9w glm ollama gpt-5

I created and run a benchmark for AI models on data analysis tasks. Unlike other benchmarks, it is not a one-prompt benchmark; I tried to simulate the real work of a data analyst.
Model API Performance (news.ycombinator.com)

+1 9w minimax glm

We’ve been benchmarking a few models on our API platform and got some interesting performance numbers: - MiniMax M2.5 → 0.118s time-to-first-token, 103 tokens/sec - GLM 5.1 → 120 tokens/sec throughput - Kimi K2.5 → 0.643s TTFT, 69 tokens/s…
I got better results when I made each AI tool do one job (www.reddit.com)

+32 9w minimax glm sonnet+4

I spent too much time trying to find one AI dev tool that could do everything. Planning, coding, fixing, reviewing, maybe filing my taxes too It never really worked.
Speed on m5 pro 48Gb (www.reddit.com)

9w glm gemma qwen

Hey guys! How do you reckon a 30-50B model would run on a 48 GB M5 Pro?
What Am I Doing Wrong? Models Won't Listen, At All (GLM 5.1, MiniMax M2.7, Kimi K2.5) (www.reddit.com)

+114 9w minimax glm ollama

What am I doing wrong here? I can't get models to follow my instructions, pretty much at all.
Why most open-source models can't answer this question while most closed-source models can answer most of the time? (www.reddit.com)

30 9w minimax glm grok+4

WEB SEARCH WAS ALWAYS ON!!!! Question Calculate the precise VRAM requirement for the **KV Cache only** at the maximum context window for **DeepSeek V3.2** and **MiniMax M2.5**.
GLM OCR for Arabic (www.reddit.com)

2 9w glm rag

So, I have been testing GLM OCR for my RAG app, but it is not working well for Arabic. It is unable to extract data from text pages, scanned pages, or even images.
What's the current best code autocomplete LLM for local deployment (as of April 2026)? (www.reddit.com)

+34 9w glm

I know this question has already been asked a thousand times, probably, but... what's the best or close-to-best model I can use with Continue for local IDE-like code autocomplete?
Ollama Cloud Pro ($20/mo) vs OpenAI Plus ($23/mo). Which gives more tokens ? (www.reddit.com)

+64 9w glm ollama openclaw+1

Hey everyone, I'm comparing these two plans side by side for running AI agents daily through OpenClaw (self-hosted AI agent platform): • Ollama Cloud Pro — $20/month • OpenAI Plus — €23/month (~$25) My setup: 3 agents running in parallel (…
Do you guys think there’s a high chance of Singularity being open source? (www.reddit.com)

+7467 9w glm gemma qwen+1

GLM 5.1 is dominant in almost every aspect in Design Arena, surpassing Opus 4.6 in many tasks. Although user experiences vary depending on the subscription plan for each, one of them is open source.
Chinese AI companies are shipping faster and cheaper than anyone expected and I'm not sure the west has a good answer for it (www.reddit.com)

+528275 9w glm opus

Something keeps nagging at me about the Chinese AI space lately. Every few months a new Chinese model drops that closes the gap with US frontier models a little more (not by throwing more compute at it, just genuinely clever engineering at…
Single question llm comparison (www.reddit.com)

+101 17w minimax glm haiku+6
Minimax M2.5 vs. GLM-5 vs. Kimi k2.5: How do they compare to Codex and Claude for coding? (www.reddit.com)

+5742 17w minimax glm codex+1
Stop donating your salary to OpenAI: Why Minimax M2.5 is making GPT-5.2 Thinking look like an overpriced dinosaur for coding plans. (www.reddit.com)

10 17w swe-bench minimax glm+5
