model roundup
Qwen 3
-
Heya guys and gals, around a year ago I released and posted about Persona Engine as a fun side project, trying to get the whole ASR -> LLM -> TTS pipeline running fully locally with a real-time, lip-synced avatar (think VTuber)…
-
TPU v7x Ironwood vs Nvidia B200 (www.reddit.com)
Google published Ironwood inference benchmarks in their AI-Hypercomputer/tpu-recipes repo. Nvidia has InferenceMAX numbers for B200.
-
Tokens per second - RTX 5000 Ada generation (www.reddit.com)
Hi everyone, I am testing the LocalLLaMA. I have a laptop with an RTX 5000 Ada generation GPU, running Ollama and Open WebUI.