#fine-tuning

11 items

Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on (www.reddit.com via reddit) 44 pts·16 replies· 3d

Trained a 125M LM from scratch (custom tokenizer) and released an instruct checkpoint and SFT framework so others can fine-tune their own variants. I’ve been experimenting with training small language models fully from scratch (no GPT-2 init, no…

fine-tuning
[D] Released a 100k-sample dataset on Hugging Face (www.reddit.com via reddit) 19 pts·6 replies· 12h

We’ve released a 100,000-sample Chain-of-Thought (CoT) dataset for fine-tuning local reasoning models. Each sample includes explicit intermediate reasoning traces, rather than answer-only supervision.

fine-tuning
Built a Japanese ASR benchmark because existing ones can't measure quality differences properly (www.reddit.com via reddit) 9 pts· 2d

↯ Qwen 3 fine-tuning
Fine-tuning and deploying Gemma 4 is not that easy (ghost.oxen.ai via hn) 4 pts· 4h

Writing a fine-tuning and deployment pipeline isn't as easy as it looks (Gemma 4 Version). Fine-tune and deploy Gemma 4 on Oxen.ai. Google's Gemma 4 dropped in April 2026 with multimodal support (text, image, video, audio), a novel hybrid KV…

↯ Gemma 4 fine-tuning gemma
Show HN: Rollquation – A Rolling-Ball Math Puzzle Game for Android (Solo Dev) (play.google.com via hn) 2 pts· 9h

Hey HN! I'm a solo dev and I just wanted to share my latest Android game — Rollquation.

operator fine-tuning
Friday, self-evolving assistant, only CC $100 plan, no agent framework (github.com via hn) 1 pt·1 reply· 1d

Friday: A 24/7 AI Assistant Built Entirely on Claude Code. An always-on personal AI system using only Claude Code CLI ($100/month) and Telegram, with no custom AI, no cloud VMs, and no fine-tuning. Live page: missingus3r.github.io/friday-showcase…

fine-tuning claude-code claude
A guide to model quantization in fine-tuning (and how to pick the right GGUF) (www.siquick.com via hn) 1 pt· 1d

Fine-tuning with Unsloth and Axolotl is, on the whole, a well thought-out experience where a lot of the complexity is handled for you. However on…

fine-tuning
Gemopus: A Gemma fine-tune that prioritizes stability over long chain-of-thought (huggingface.co via hn) 1 pt· 2d

🌟 Gemopus-4-26B-A4B-it: Gemopus is an attempt at fine-tuning Gemma 4 with a core philosophy of "stability first". While preserving the original reasoning order of Gemma 4 as much as possible, we conducted targeted refinements for an…

↯ Gemma 4 fine-tuning gemma
DGX Spark users: What's the easiest way to do multi-node vLLM clustering with a browser UI and training? (www.reddit.com via reddit) 3 replies· 10h

Hey r/LocalLLaMA, I've been running a small 4-node DGX Spark cluster on a 400µT fabric switch and got frustrated with the usual raw Ray/vLLM scripts and EXO basically ignoring pure NVIDIA paths. I started from the solid foundation in [eugr…

fine-tuning vllm openai
Curiosity about Chatterbox's architecture led me to fine-tune it for 8 Indian languages by LoRA, using 1.4% params (www.reddit.com via reddit) 1d

TL;DR: Fine-tuned Chatterbox-Multilingual for Telugu, Kannada, Bengali, Tamil, Malayalam, Marathi, Gujarati, and Hindi using LoRA adapters + tokenizer extension. Only 7.8M / 544M parameters trained.

fine-tuning
I open-sourced media-tsunami — a tool that extracts your brand voice into a CLAUDE.md any LLM can load (www.reddit.com via reddit) 2 replies· 2d

fine-tuning chatgpt claude
