model roundup

Qwen 3.6

10 items · started 2026-05-27 · closed 2026-05-30

  1. There's been talk of late about using HTML rather than Markdown in Claude Code. I was curious how this worked with a local model, so I loaded up Qwen3.6 35B A3B at Q8 with an F16 KV cache.

  2. EDIT - IGNORE. I MADE A MISTAKE.

  3. I'm using llama.cpp, and I've tried Bartowski's and my own quants. When using Qwen3.5-122B or Qwen3.6-27B, I'm seeing really low draft acceptance in chats with interleaved code snippets (chatting with the LLM about programming / a code pro…

  4. I'm posting this because it may help you squeeze more out of the 12GB of VRAM on the 3060. All credit goes to spiritbuun's fork (github.com/spiritbuun/buun-llama-cpp) and mudler's APEX quantizations (huggingface.co/mudler).

  5. Context: Krasis is an LLM runtime for running models that don't fit into VRAM. Krasis streams the model through VRAM from system RAM efficiently and handles prefill and decode as separate architectures and optimised use cases.

  6. Model Description: Q-Judger is a vision-language model fine-tuned specifically for automated evaluation of text-to-image generated images. Given a text prompt and a generated image, the model evaluates the image on fine-grained quality crit…

  7. Used the vLLM version of https://github.com/noonghunna/club-3090. It worked fine for maybe 20-40k context; haven't tried the new one. Has anyone used the new patched llama.cpp one on a single 3090?

  8. Hoping the community can guide me on this one. I'm on the fence about the following purchase: Refurbished 16-inch MacBook Pro Apple M4 Max Chip with 16‑Core CPU and 40‑Core GPU, 64GB RAM for $3,479.00 vs. the new 16-inch MacBook Pro Apple M…

  9. I’m considering building a local machine for AI inference using a Dell Precision T5820 and two Intel Arc A770s. From this I could get 32GB DDR4 RAM, a 1TB SSD and 32GB VRAM, all for like $1000.

  10. Hi all, I'm somewhat new to the scene (been lurking for maybe 4-5 months now), but I think I have all the basics figured out. My setup: 9800X3D with 64GB of RAM, 6900 XT with 16GB VRAM.
