model roundup

Qwen 3

11 items · started 2026-04-13 · closed 2026-04-21

  1. Google published Ironwood inference benchmarks in their AI-Hypercomputer/tpu-recipes repo. Nvidia has InferenceMAX numbers for B200.

  2. Hi everyone, I am testing the LocalLLaMA. I have a laptop with an RTX 5000 Ada Generation, with Ollama and Open WebUI.

  3. I was trying to ask a question about coding using DeepSeek-R1-0528-Qwen3-8B-Q4_K_M, and the thinking took 30 minutes??? https://preview.redd.it/kex3fgg4lgvg1.png?width=277&format=png&auto=webp&s=5f7e7cdc8502b935ea8b8fb83e0e4af60c3c4533 I h…

  4. Frontier LLMs have very little output diversity, even for open-ended queries. We built Flint to see if we could reverse this.

  5. We built a system where a neural compiler takes a plain-English function description and produces a "neural program" (a combination of a continuous LoRA adapter and a discrete pseudo-program). At inference time, these adapt a fixed interpr…

  6. Hey y'all. So I've stumbled upon a really specific and esoteric "bug" where an LLM can't comprehend a URL in, like, 90% of scenarios.

  7. Hello guys, I want to run inference on Qwen3-Coder-480B-A35B-Instruct. I have a 4xH200 machine.

  8. openrouter/elephant-alpha is 99% Chinese, likely Qwen 3 Next. Prompt: "Write a complex algorithm in Python for time series analysis, using methods from Chinese scientific papers on econometrics.

  9. Was fine-tuning a Japanese ASR model (based on Qwen3-ASR) to handle technical terminology better. The model clearly improved — "Next.js" comes out as "Next.js" instead of "ネクストジェイズ", punctuation works, etc.

  10. ClaudeCodeCLI vs OpenCode vs Cline vs QwenCode Local coding LLM - Qwen3-coder-next-80b-nvfp4 Which "tool" can you recommend for it, and with which "Skills/Plugins/MCPs"?
