model roundup

Llama 3.2

5 items · started 2026-05-18 · closed 2026-05-23

Open-source LLMs are still weak against long reasoning jailbreaks, even with lightweight defenses (www.reddit.com)

1 2w jailbreak mistral prompt-injection+5

Found this ACM paper on prompt injection and jailbreak attacks against open-source LLMs. The authors tested 10 open-source models across 94 prompt injection and 73 jailbreak scenarios, including Phi, Mistral, DeepSeek-R1, Llama 3.2, Qwen,…

Sapient Intelligence releases HRM-Text 1B: 40B tokens, ~$1k pretrain, beats Llama3.2 3B on MATH and DROP (www.reddit.com)

+54 3w

Sapient Intelligence (the HRM/hierarchical reasoning folks) dropped HRM-Text 1B today. Posting because the benchmark chart is interesting enough to be worth a look even if you're skeptical of the marketing.

🧬 flux-genotype: A self-evolving AI kernel that runs on CPU with Ollama — mutates its own architecture (www.reddit.com)

+13 3w ollama deepseek llama

`🧬 Flux‑Genotype – A CPU LLM that rewrites itself` I've been working on an open-source kernel called **flux-genotype**. It orchestrates local models (TinyLlama, Llama 3.2, Hermes 3, DeepSeek-Coder) into a self-modifying ecosystem.

Big new memory tool with local benchmarks (www.reddit.com)

+2 3w mistral

NOT MINE: https://github.com/rtk-ai/icm Knowledge retention: Agent recalls specific facts from a dense technical document across sessions. Session 1 reads and memorizes; sessions 2+ answer 10 factual questions without the source text.

Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update (www.reddit.com)

+1 3w llama

In standard AWQ, per-channel scales and quantization ranges are picked in separate steps: scales first, then the quantization parameters. But they're not independent, i.e., the rounding error from one depends on the choice of the other, so…
