model roundup

GLM 5.1

11 items · started 2026-04-12 · closed 2026-04-25

Capacity vs Speed trade-off: 1.1TB Mac Unified Memory vs. RTX 6000 Pros (www.reddit.com)

+38 9w glm

I'm usually a Windows person, but I’m currently running a Mac cluster for local LLM orchestration. My setup consists of four 256GB Mac Studios plus one 96GB Mac Studio, giving me about 1.1TB of unified memory.
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth (www.reddit.com)

+418 9w glm moe llama
Best app to use Nvidia Nim? (www.reddit.com)

+1 9w glm
What's the best GPU cluster/configuration 30k $ can buy? (www.reddit.com)

+344 9w glm

Edit: The consensus seems to be that the budget I suggested is not enough for my lil ambitious project. I’d like to reshape the question for the upcoming comments: what’s the minimal budget to achieve my goal?
FREE Claude Code alternative using GLM 5.1 + VS Code (tutorial) (www.reddit.com)

8 10w glm claude-code

https://youtu.be/tL3cOdgukt8
Claude Code with Pro subscription + OpenRouter in parallel — what's the cleanest setup? (www.reddit.com)

3 10w glm deepseek sonnet+2

Hi there, I have a Claude Pro subscription and use Claude Code daily. I'd also like to use Claude Code routed through my OpenRouter API key so I can experiment with other models (GLM-5.1, DeepSeek, Kimi, Gemini, etc.) — without giving up m…
Local GLM 5.1 - Parkour! (www.reddit.com)

+62 10w glm

Some more 'sloptuber' content for those who are enjoying it :)
Model: unsloth glm 5.1 @ IQ2_XXS UD
Prompt 1: Task: in a single web page, build a city based parkour game. wsad controls, moving player aligned with current camera direction.
Model API Performance (news.ycombinator.com)

+1 10w minimax glm

We’ve been benchmarking a few models on our API platform and got some interesting performance numbers:
- MiniMax M2.5 → 0.118s time-to-first-token, 103 tokens/sec
- GLM 5.1 → 120 tokens/sec throughput
- Kimi K2.5 → 0.643s TTFT, 69 tokens/s…
I got better results when I made each AI tool do one job (www.reddit.com)

+32 10w minimax glm sonnet+4

I spent too much time trying to find one AI dev tool that could do everything. Planning, coding, fixing, reviewing, maybe filing my taxes too. It never really worked.
What Am I Doing Wrong? Models Won't Listen, At All (GLM 5.1, MiniMax M2.7, Kimi K2.5) (www.reddit.com)

+114 10w minimax glm ollama

What am I doing wrong here? I can't get models to follow my instructions, pretty much at all.
Ollama Cloud Pro ($20/mo) vs OpenAI Plus ($23/mo). Which gives more tokens ? (www.reddit.com)

+64 10w glm ollama openclaw+1

Hey everyone, I'm comparing these two plans side by side for running AI agents daily through OpenClaw (self-hosted AI agent platform):
• Ollama Cloud Pro: $20/month
• OpenAI Plus: €23/month (~$25)
My setup: 3 agents running in parallel (…
