Trained a 125M LM from scratch (custom tokenizer) + released instruct checkpoint and SFT framework so others can fine-tune their own variants
I’ve been experimenting with training small language models fully from scratch (no GPT-2 init, no…
#fine-tuning
11 items
Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on (www.reddit.com via reddit)
[D] Released a 100k-sample dataset on Hugging Face (www.reddit.com via reddit) We’ve released a 100,000-sample Chain-of-Thought (CoT) dataset for fine-tuning local reasoning models. Each sample includes explicit intermediate reasoning traces, rather than answer-only supervision.
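The dataset item above contrasts reasoning-trace supervision with answer-only supervision. A minimal sketch of what one such record and its training target might look like (the field names here are assumptions for illustration, not the released schema):

```python
import json

# Hypothetical record shape for a CoT SFT dataset: the key point is that
# the supervision includes intermediate reasoning, not just the final answer.
sample = {
    "prompt": "A train travels 120 km in 2 hours. What is its average speed?",
    "reasoning": [
        "Average speed is distance divided by time.",
        "120 km / 2 h = 60 km/h.",
    ],
    "answer": "60 km/h",
}

# Answer-only supervision would train on prompt -> answer alone; CoT
# supervision serializes the trace into the training target as well.
target = "\n".join(sample["reasoning"]) + "\nFinal answer: " + sample["answer"]
print(json.dumps(sample, indent=2))
print(target)
```

The fine-tuned model then learns to emit the trace before the answer, rather than jumping straight to it.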
Built a Japanese ASR benchmark because existing ones can't measure quality differences properly (www.reddit.com via reddit)
Fine-tuning and deploying Gemma 4 is not that easy (ghost.oxen.ai via hn) Fine-tune and deploy Gemma 4 on Oxen.ai. Google's Gemma 4 dropped in April 2026 with multimodal support (text, image, video, audio), a novel hybrid KV…
Show HN: Rollquation – A Rolling-Ball Math Puzzle Game for Android (Solo Dev) (play.google.com via hn) Hey HN! I'm a solo dev and I just wanted to share my latest Android game — Rollquation.
Friday, self-evolving assistant, only CC $100 plan, no agent framework (github.com via hn) Friday — A 24/7 AI Assistant Built Entirely on Claude Code An always-on personal AI system using only Claude Code CLI ($100/month) and Telegram — no custom AI, no cloud VMs, no fine-tuning. Live page: missingus3r.github.io/friday-showcase…
A guide to model quantization in fine-tuning (and how to pick the right GGUF) (www.siquick.com via hn) A guide to model quantization in fine-tuning (and how to pick the right GGUF) About this post Fine-tuning with Unsloth and Axolotl is, on the whole, a well thought-out experience where a lot of the complexity is handled for you. However on…
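Picking a GGUF quant mostly comes down to trading bits-per-weight against file size and quality. A rough back-of-envelope estimator (the bits-per-weight figures below are approximate averages I'm assuming, not exact llama.cpp numbers; real files also carry metadata and mixed-precision layers):

```python
# Approximate average bits-per-weight for common GGUF quant types.
# These are ballpark assumptions for sizing, not authoritative values.
APPROX_BPW = {
    "Q4_0": 4.5,   # 4-bit blocks plus a per-block scale
    "Q4_K_M": 4.8,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GB for n_params weights at a given quant."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# Sizing a hypothetical 7B model at each quant level:
for q in APPROX_BPW:
    print(f"7B at {q}: ~{est_size_gb(7e9, q):.1f} GB")
```

The useful takeaway is the ratio: a Q4-class quant is roughly 3.3x smaller than F16, which is usually what decides whether a model fits in VRAM at all.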
Gemopus: A Gemma fine-tune that prioritizes stability over long chain-of-thought (huggingface.co via hn) 🌟 Gemopus-4-26B-A4B-it Gemopus is an attempt at fine-tuning Gemma 4 with a core philosophy of "stability first". While preserving the original reasoning order of Gemma 4 as much as possible, we conducted targeted refinements for an…
DGX Spark users: What's the easiest way to do multi-node vLLM clustering with a browser UI and training? (www.reddit.com via reddit) Hey r/LocalLLaMA, I've been running a small 4-node DGX Spark cluster on a 400µT fabric switch and got frustrated with the usual raw Ray/vLLM scripts and EXO basically ignoring pure NVIDIA paths. I started from the solid foundation in [eugr…
Curiosity about Chatterbox's architecture led me to fine-tune it for 8 Indian languages by LoRA, using 1.4% params (www.reddit.com via reddit) TL;DR: Fine-tuned Chatterbox-Multilingual for Telugu, Kannada, Bengali, Tamil, Malayalam, Marathi, Gujarati, and Hindi using LoRA adapters + tokenizer extension. Only 7.8M / 544M parameters trained.
I open-sourced media-tsunami — a tool that extracts your brand voice into a CLAUDE.md any LLM can load (www.reddit.com via reddit)