event

Glm

133 items · started 2026-02-14 · ongoing (last activity 2026-06-17)

  1. GLM-5.2 by Zhipu AI tops BridgeBench reasoning 24 hours after the U.S. banned Fable 5.

  2. June 17, 2026 GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index Z ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index scoring 51 and it sits on the Pare…

  3. GLM-5.2: Built for Long-Horizon Tasks - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance…

  4. Exciting news: GLM-5.2 (Max) ranks #2 in Code Arena: Frontend, with +29pt over Claude Opus 4.7 (Thinking) and only behind Fable 5! GLM-5.2 is the best open model vs Kimi-K2.6 and Minimax-M3 by a large margin.

  5. GLM-5.2 (max) Intelligence, Performance & Price Analysis Model summary IntelligenceUpdated Speed Price Cache Hit Price Verbosity GLM-5.2 (max) is amongst the leading models in intelligence, but particularly expensive when comparing to othe…

  6. [AINews] GLM-5.2: the top Frontend Coding model in the world, IndexShare for Speculative Decoding We have a new top open model in the world! Last 6 days before regular tickets sell out at AI Engineer World’s Fair - this is the single bigge…

  7. Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long-horizon capabilities with a 1M context window - Two levels of reasoning effort: GLM-5.2 (max) pushes the limits,…

  8. GLM-5.2 👋 Join our WeChat or Discord community. 📖 Check out the GLM-5.2 blog and GLM-5 Technical report.

  9. In the last one or two months, starting from DeepSeek V4 Pro, there are quite many low-price Chinese models coming out. Their performance looks more or less similar to me: Mimo V2.5 Pro, MiniMax M3, and the just released GLM 5.2, etc.

  10. Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans.

  11. I was debugging some code and LLM crashed out: ``` The debug_log config defaults to "debug.json" and creates a FileHandler — which appends by default. That file is a log of everything that happened, never cleared.

  12. Long time lurker, and I say this as someone who genuinely loves this community and runs many local models myself. I’ve been using LLMs since the early GPT and LLaMA days.

  13. It should be at least 7-8 months until we have an open Fable(not just as good as Fable in benchmarks, but actually as good as Fable), probably more like 9-12 months. By the time, an open Fable model comes out, Fable 6.5-7 will be way bette…

  14. DeepSeek, Qwen, and GLM aren't necessarily winning every benchmark. But they don't need to.

  15. Guys how to run it as cheap as possible to get at least 15-20 ts? Asking for a friend!

  16. Has anyone figured out a way to mix max plan with models from other providers (like GLM or Deepseek) while using dynamic workflows? I suppose we could create a passthrough proxy and route sonnet and haiku to other models?

  17. CCLimitPing (limitping) English | 中文 Keep your Claude Code, Codex, and GLM (Zhipu / Z.ai Coding Plan) rate-limit windows back-to-back. These providers bill on a 5-hour rolling window (plus a weekly cap), and the 5h window starts on your fi…

  18. First we never saw an upgraded Air model after 4.5. Then GLM 4.7 Turbo was great, but quickly surpassed for coding.

  19. I just told it, make a minecraft server and let me play and it worked lol. I just asked "host a minecraft server so I can play" and it did host it, made me a dashboard ands its crazyyyyy lol, It is hosted in hongkong somewere TwT

  20. Hey HN, We believe we have the easiest onboarding from signup to being able to spin up coding agents in slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep Every signup get…

  21. Been following the infrastructure side of AI more lately and stumbled on this from Zai. They upgraded the network architecture on a thousand-GPU cluster running GLM-5.1 coding inference from the standard ROFT setup to something they built…

  22. Our dev team flagged last week that xAI is retiring grok 4.1 fast. We weren't using it for anything critical but it made me ask something I'd never actually asked: how did we pick the models we're running?

  23. Have you ever wondered, how DeepSeek may make money, and lot of it? They didn't come up with competitive coding plans like GLM, MoonShot and MiniMax.

  24. So, I got interested in local LLMs a few months ago, but, I don't have a background in coding, and I don't know how to code, and I am not good with computers or anything. So far I mainly just was having fun with comparing different local L…

  25. Usual crowd. Everyone's on Claude or Codex, nobody's really sure how any of it actually works, and that's fine, that's the vibe.

  26. 🧠 MoE-on-a-Potato Running a 754-Billion Parameter LLM on a 16GB RAM Consumer PC "Saying it's impossible is not engineering. Saying we don't know how yet is science." MoE-on-a-Potato is an experimental project dedicated to testing the extre…

  27. I've been testing other models but it seems like nothing even come close to Qwen3.6 35B A3B for agentic use. The worse I'd get is a loop sometimes, while Gemma4 produced broken tool calls occasionally and I couldn't even get GLM 4.7 Flash…

  28. I am new to Cursor and still testing the free version. Benchmark for Composer 2.5 indicates it is better than DeepSeek v4 and Glm 5.1.

  29. https://preview.redd.it/i90oxxk7n03h1.png?width=1898&format=png&auto=webp&s=7d219c804fda7dfe122b84fcdb6d0d6883818c68 A while back I came across TradingAgents — a really cool multi-agent LLM stock analysis framework where like a dozen "agen…

  30. I will try to keep this short ;) I used GLM 5.1 to vibecode a vague prompt on my vibecoded react web app and have GLM 5.1 rank the plans made with each other and the one it made itself. Test strategy: - use starter prompt as always - add v…

  31. While we are weathering the gemini 3.5 flash hype, keep in mind that according to arena, GLM and Mimo are better. https://arena.ai/leaderboard/text/coding-no-style-control #7 GLM #9 Mimo #12 Gemini 3.5 Flash

  32. One way I like to test new models, is by one-shoting (with a good prompt) a single webpage clone of the classic arcade game pacman. I usually do 3 attempts and keep the best one.

  33. I built cdesktop with Claude Code — it's an open-source alternative to Anthropic's Claude Code Desktop, running locally on your machine via npx cdesktop. Free, Apache 2.0.

  34. Since the last post I've added: Huxley module (Brave New World style behavioral conditioning) Baudrillard module (synthetic intimacy, trust collapse, simulation) 30 more models including Grok 4.3, GPT-5.5, Gemini 3.1 Pro, GLM-5.1 Multi-jud…

  35. This is my GLM-4.6 model API configuration, and this error is really confusing me. I'm not sure which step went wrong.

  36. As some of you may have guessed, what we have here is an old Bible. I would like to extract the following information from the page: { verse: number, verse_content: string, comments: string[] } I've played around with PaddleOCR a bit; I co…

  37. A lot of people are saying just use X, just do Y, just run Z locally, but the best models cannot be run locally (GLM 5.1). No one ever talks about privacy, but for those concerned about privacy, how do we know when we use Z AI's GLM 5.1 th…

  38. Researchers from the Max Planck Institute recently released FutureSim, an environment in which agents are replayed a temporal slice of the web and are tasked with predicting real-world future events. In their environment, GPT 5.5 leads at…

  39. I recently made ollamatps.com for my own model-selection workflow and thought it might be useful here too. It shows 39 Ollama cloud models sorted by average TPS over the last 24 hours, and I added the Artificial Analysis Intelligence Index…

  40. My latest project, about 60% of the codebase was written with Z.ai's GLM-5.1 model. It's basically a Telegram bot that allows for embedding/downloading media easier within group chats.

  41. Has anyone figured out a provider whose open source models (Kimi, Qwen, GLM e.t.c) can be used reliably in production. I have tested some well known providers and they all suffer from high latency and poor uptime rendering them mostly usel…

  42. 1rok 1rok is a standalone harness for running portfolio-construction agents across OpenAI, Anthropic, Gemini, xAI, DeepSeek, GLM, and OpenRouter against the same financial tool surface. Agents query Alpaca, Yahoo Finance, FRED, and Tavily…

  43. I'm testing prompt-cache behavior for GLM models on Vertex AI MaaS and I'm seeing inconsistent telemetry. I reproduced it with a synthetic long prompt and repeated identical requests.

  44. About Press Copyright Contact us Creators Advertise Developers Terms Privacy Policy & Safety How YouTube works Test new features NFL Sunday Ticket © 2026 Google LLC

  45. I have a GLM subscription that’s marketed as offering 3× higher usage than Claude Pro. I primarily use it through Claude Code CLI as a backup coding model.

  46. grunden.ai är en svensk AI-tjänst för utvecklare, myndigheter och helt vanliga människor. GLM 5.1 (open-weight) med EU-jurisdiktion, ett OpenAI-kompatibelt API och prissättning i kronor.

  47. Hi. I new to using Cursor - coming from Claude Code, Antigravity and most recently GLM coding plan.

  48. Original plan was to use Kimi/GLM for planning and DeepSeek for implementation, but seeing a lot of love for MiMo and Minimax lately. Anyone running a planner + coder split on Opencode?

  49. With the lowering usage limit in Claude, I am thinking of jumping ship to Chinese AI, since the benchmark is already very near compared to Sonnet or Haiku 4.5 , but for a fraction of the price. I am not worried about where is my data endin…

  50. With the lowering usage limit in Claude, I am thinking of jumping ship to Chinese AI, since the benchmark is already very near compared to Sonnet or Haiku 4.5 , but for a fraction of the price. I am not worried about where is my data endin…

  51. Hey r/AI_Agents — we're launching Irene today, and I want to be straight about what it is, why we built it, and where it's going. What makes Irene different Affordable with massive token limits and the latest open-source models We have gen…

  52. Day-to-day user vibes, not rigorous benchmarks, so YMMV. GLM 5.1 has by far been my biggest winner in the last batch of releases.

  53. I wonder which one is better, I tested it a little bit (too slow, of course) and I'm still unsure. Does the GLM-5.1 smol-IQ2_KS loses too much?

  54. GPT and Opus block on certain requests. This didnt use to be the case 2 months ago and I made signficant progress with Opus and then one day I had a 2 week break and then a single prompt to continue the work resulted in refusal.

  55. I've been using GLM 5.1 a lot lately, and I love this model. However I don't love sending all my requests to China.

  56. I have been following the akitaonrails coding benchmark which tests against a fixed rails + Rubyllm + docker task rather than vendor-reported evals. April 2026 update put K2.6 at 87 sitting in tier A (80+), ahead of Qwen 3.6 plus (71), Dee…

  57. Just cancelled my claude subscription due to poor rate limits, gemini cli doesn't really excel in coding from my personal experience, and my local hardware isn't that powerful to run local AI models, and while codex is good, I wanna try so…

  58. I have been toying with GLM 4.7 flash mlx a while ago using lmstudio. I had integrated it successfully with openclaw and it was kinda stable in tool calling.

  59. The benchmark uses adversarial, multi-turn debates across 683 curated motions. Each model pair debates the same motion twice with sides swapped.

  60. We present GLM-5V-Turbo, a step toward native foundation models for multimodal agents. As foundation models are increasingly deployed in real environments, agentic capability depends not only on language reasoning, but also on the ability…

  61. Architecture explains the gap: MiMo's MoE runs more active params per token than Kimi K2.6's optimized routing hence slowest. DeepSeek V4's 'comprehensive' edge is partly MLA: ~75% KV-cache compression makes it far better for long agentic…

  62. I want to run big models like GLM 5.1 or Kimi k2.6. I can buy Mac Studio M3 Ultra with 512gb ram, but PP speed would be ofc bad.

  63. Literally no 3rd party api inference provider is hosting the mimo-2.5 series models from Xiaomi. They seem to be reallly good.

  64. I set up 7 AI coding agents on a VPS with automated cron sessions (2-8 per day depending on the agent). Each uses a different model: Claude Sonnet, GPT-5.4, Gemini 2.5 Pro, DeepSeek V4 Pro, Kimi K2.6, MiMo V2.5 Pro, GLM-5.1.

  65. I have been using llama.cpp to run some models recently. For example, I've been running GLM-4.7-Flash with this command .\llama-server.exe -hf unsloth/GLM-4.7-Flash-GGUF:Q6_K_XL --alias "GLM-4.7-Flash" --host 127.0.0.1 --port 10000 --ctx-s…

  66. I am quite curious as I tried Gemma 4 31B, Qwen 3.6 27B, GLM 4.7 30B and some others in my native language (czech). Gemma performs "best" and considering the fact its "just" 18GB model - it actually blows my mind how well it can respond in…

  67. Detailed Article: https://autobe.dev/articles/local-llm-benchmark-about-backend-generation.html Five months ago I posted the "Hardcore function calling benchmark in backend coding agent" thread here. As I wrote in that post, it was an unco…

  68. Been working on this for a while and finally at a point where it's running in production for a couple of small businesses, so figured I'd share. The thing that kept bugging me about "AI employee" products is that none of them are something…

  69. Which of these do you think we'll get in May? Also, feel free to pick/rank which ones you'd want the most badly: more Gemma4 models (124b?) (other sizes?) more Qwen3.6 models (9b?

  70. I must say that I almost feel no difference in all of the latest models that are coming out. Opus 4.7 is almost equal to 4.6 and 4.5, same about the other GPT models, the Kimi K models and the GLM models they all I feel they’re almost all…

  71. Hello everyone, the question is easy, with the new models of deepseek, kimi, GLM and qwen, should you replace the old models with the new version? Do I lose some quality, information or performance in the process?

  72. I received this mail: "Hi developers, Some of you flagged occasional garbled outputs and unexpected behavior when building with the GLM-5 series, especially under heavy workloads. We heard you, reproduced the issues, and the fixes are now…

  73. Basically, I’m really into the idea of a fully offline setup. (Another way to say it: I’m a data hoarder.) For LLMs, I’m using uncensored models from both Western (Gemma, GPT-OSS) and Eastern ones (GLM 4.7 Flash, Qwen 35B).

  74. Some of the larger models (like Llama) weren't available on OpenRouter, so I had to work with what was there. Best small model: Gemma 4 26B For its size, I think it had the best output.

  75. Our belief in Scaling Laws has not only driven continuous breakthroughs in model parameters and data scale, but has also pushed infrastructure engineering toward its limits. This process inevitably comes with growing pains, which we refer…

  76. Hi HN family! I've recently been messing around with open models through ollama (glm-5.1 and kimi-k2.6), and I've been impressed with just how close they are to Claude Sonnet for my needs, especially programming.

  77. I code for a living, close to 7 years now, and I read way too much tech news. TIME dropped their 2026 most influential AI companies list and going through it I see OpenAI, Anthropic, Google, Meta, Amazon, then Zhipu AI sitting right there…

  78. I’ve been looking to buy a coding plan from one of the major open source contributors to give my meager support to them and transition away from Claude. I would love to hear some feedback from the community of their experience with some of…

  79. Full disclosure: I used to program a bit, but I was garbage at it so I found a new career. This was eons ago so I'm not a dev, obviously.

  80. This is a follow up to the previous benchmark and tensor analysis of abliteration techniques across the Qwen model family. Same approach, same toolkit, new model family.

  81. My first time releasing a fine-tune publicly! If anyone wants to independently eval against base, that’d be awesome.

  82. 1.5 tb ram with 128gb vram and a 28 core processor. Mac Pro 2019.

  83. I'm a compsci student and I've been using the 10$ copilot plan for about 2 years now, and it was fine for me since I did a good model distribution taking into account the complexity of the task, I was able to get through the month always u…

  84. Had 327 production traces from a restaurant-reservation agent I wanted to retrain. The plan was to fine-tune a smaller self-hostable model so I could ditch the frontier-API bill.

  85. How will you scale these models coding and overall. Deepseek v4 pro Kimi k2.6 Mimo v2.5 pro Glm 5.1 Qwen 3.6 plus

  86. I just noticed this after a bug wasn't getting fixed. If you start a Claude code remote environment the default model (hidden on mobile) is glm 4.7 I assumed anthropic only used their own models for everything so it was interesting to me t…

  87. so v4 pro dropped and barely anyone is talking about it. feels weird since when kimi k2.6 came out i seen post about it everywhere anyone here tried v4 pro for actual code work?

  88. After some sglang patching and countless experiments, managed to get reap-ed nvfp4 version running stable and FAST on 4 x RTX 6000 Pros (limited to 350W). Very happy with performance and quality.

  89. QClaw-4B is a 4-billion parameter language model fine-tuned for agentic tasks and tool use, designed for use with OpenClaw-compatible agent frameworks. Despite its compact size, QClaw-4B achieves state-of-the-art results in the 4B class, m…

  90. other companies are slowly going away from open weight, not releasing base models, delaying open weight distribution, not releasing top models (this one I think is fair, but still), and I also noticed they stopped publishing research (old…

  91. I'm usually a Windows person, but I’m currently running a Mac cluster for local LLM orchestration. My setup consists of four 256GB Mac Studios plus one 96GB Mac Studio, giving me about 1.1TB of unified memory.

  92. I created this chart with recent open models from last 6 months. Few might be older than that possibly.

  93. Hi all. Many models on hugging face have been fine tuned with that 3000x opus dataset, but the two I mentioned in the title are missing it.

  94. I’m trying to understand the current open-source LLM landscape beyond surface-level hype. We all got used to the nerfed products of Claude/Geminj so I believe really in opensource as a solution.

  95. I have run two tests on each LLM with OpenCode to check their basic readiness and convenience: - Create IndexNow CLI in Golang (Easy Task) and - Create Migration Map for a website following SiteStructure Strategy. (Complex Task) Tested Qwe…

  96. I got a Codex membership when GPT-5.4 launched and was getting by well enough for a while. Then I started using Claude and GLM 5.1, and my production quality improved significantly.

  97. I've grown increasingly skeptical that public coding benchmarks tell me much about which model is actually worth paying for and worried that as demand continues to spike model providers will silently drop performance. I did a few manual an…

  98. Edit: I’m getting the consensus is that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal?

  99. https://youtu.be/tL3cOdgukt8

  100. TL;DR I try to keep most traffic on very cheap models (Nano / GLM‑Flash / Qwen / MiniMax) and only escalate to stronger models for genuinely complex or reasoning‑heavy queries. I’m still actively testing this and tweaking it several times…

  101. could not extract summary

  102. Hi everyone. It's been a while since I posted (was a lil burned out), but some of you may have seen my older SanityHarness posts.

  103. Hello all just as it sounds. I recently started using GLM 5.1 in cursor 3 but unlike in the past, GLM 5.1 ran through my entire daily budget from summarizing chat context and running commands.

  104. Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…

  105. Hi all, I'm running GLM 4.7 flash uncensored (Q8) on a 5090. I'm trying to get it to edit a short story (about 8.5k tokens, added via PDF) to add a scene.

  106. So I was just about to give up playing with local models, until I realised I can actually run GLM 5.1 at not too horrible speeds, using this quant https://huggingface.co/ubergarm/GLM-5.1-GGUF/tree/main/IQ2_KL in ik llama. Getting around 6.…

  107. As of mid Apr 2026, I have noticed every model has had a major intelligence drop. And no I'm not talking about just ChatGPT.

  108. So i have been seeing more of those pelican on a bike svg tests and while they work i feel like (and maybe you guys do too) they are getting kinda benchmaxxed so we should switch things up soon and this is my idea generate me a html svg of…

  109. I'm brand new to local LLMs and started with GLM-4.7 Flash q4_K_M. When I run it directly: ollama run glm-4.7-flash:q4_K_M it works pretty decently — nothing amazing, but usable and responsive.

  110. Ever since the company went public, they’ve been making a lot of changes that clearly seem to be prioritizing profit without regard to their customers. For example, with their coding plans: - They promised/advertised that the Lite coding p…

  111. Some more 'sloptuber' content for those who are enjoying it :) Model: unsloth glm 5.1 @ IQ2_XXS UD Prompt 1: Task: in a single web page, build a city based parkour game. wsad controls, moving player aligned with current camera direction.

  112. So I have been running gpt and glm-5.1 side by side lately and tbh the gap is way smaller than what im paying for On SWE-Bench Pro glm-5.1 actually took the top spot globally, beat gpt-5.4 and opus 4.6. overall coding score is like 55 vs g…

  113. I created and run a benchmark for AI models in data analysis tasks. In contrary to other benchmarks, it is not one-prompt benchmark, but I tried to simulate the real work of data analyst.

  114. We’ve been benchmarking a few models on our API platform and got some interesting performance numbers: - MiniMax M2.5 → 0.118s time-to-first-token, 103 tokens/sec - GLM 5.1 → 120 tokens/sec throughput - Kimi K2.5 → 0.643s TTFT, 69 tokens/s…

  115. I spent too much time trying to find one AI dev tool that could do everything. Planning, coding, fixing, reviewing, maybe filing my taxes too It never really worked.

  116. Hey guys! How would you reckon a 30-50b model would run on a 48 GBs m5 pro?

  117. What am I doing wrong here? I can't get models to follow my instructions, pretty much at all.

  118. WEB SEARCH WAS ALWAYS ON!!!! Question Calculate the precise VRAM requirement for the **KV Cache only** at the maximum context window for **DeepSeek V3.2** and **MiniMax M2.5**.

  119. So, I have been testing GLM OCR for my rag app, but it is not working good for Arabic. It is unable to extract data either on textual page, scanned pages or even images.

  120. I know this question has already been asked a thousand times, probably, but... what's the best or close-to-best model I can use with Continue for local IDE-like code autocomplete?

  121. Hey everyone, I'm comparing these two plans side by side for running AI agents daily through OpenClaw (self-hosted AI agent platform): • Ollama Cloud Pro — $20/month • OpenAI Plus — €23/month (~$25) My setup: 3 agents running in parallel (…

  122. GLM 5.1 is dominant in almost every aspect in Design arena, surpassing Opus 4.6 in many tasks. Although user experiences vary dependent on subscription plans for both of those one of them is open source.

  123. Something keeps nagging at me about the Chinese AI space lately. Every few months a new Chinese model drops that closes the gap with US frontier models a little more(not by throwing more compute at it, just genuinely clever engineering at…

← all threads