Import AI 455: AI systems are about to start building themselves. Jack Clark thinks there’s a ~30% chance by the end of 2027 and a ~60%+ chance by the end of 2028 that AI research becomes automated, with models eventually helping train the…
#fine-tuning
123 items
Anthropic co-founder Jack Clark says AI is nearing the point where it can automate AI research (www.reddit.com)
I built a 103B-token Usenet corpus (1980–2013) — pre-web, human-only, zero AI contamination. Got strong traction on r/ML, thought this community would find it useful. (www.reddit.com) Posted this to r/MachineLearning a couple weeks ago (30K views, 100+ upvotes) and have been meaning to share it here where the fine-tuning angle is more directly relevant. I spent years building and processing a complete Usenet corpus from…
Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on (www.reddit.com) Trained a 125M LM from scratch (custom tokenizer) + released instruct checkpoint and SFT framework so others can fine-tune their own variants I’ve been experimenting with training small language models fully from scratch (no GPT-2 init, no…
GPT 5.5 "secret sauce" is just having the thinking be some stupid caveman mode? (www.reddit.com) I think I had GPT-5.5 leak its trace during a normal conversation, and it really reads like the caveman mode fad from a few months back. Maybe we can achieve better token efficiency by taking some high-quality thinking trace from an open m…
ServiceNow-AI/SuperApriel-15B-Instruct · Hugging Face (huggingface.co via reddit) A 15B-parameter token-mixer supernet with 8 optimized deployment presets spanning 1.0× to 10.7× decode throughput at 32K sequence length, all from a single checkpoint. Derived from Apriel-1.6 through stochastic distillation and targeted su…
[D] Released a 100k-sample dataset on Hugging Face (www.reddit.com) We’ve released a 100,000-sample Chain-of-Thought (CoT) dataset for fine-tuning local reasoning models. Each sample includes explicit intermediate reasoning traces, rather than answer-only supervision.
End-2-end tutorial on fine-tuning, the whole journey (docs.liquid.ai via reddit) I put together a hands-on tutorial that takes you from problem framing to fine-tuning, step by step. I decided to build a wildfire prevention system that uses satellite images and a Small Vision-Language Model (LFM2.5-VL-450M) to extract r…
Jackrong/Qwopus3.5-9B-Coder-GGUF · Hugging Face (huggingface.co via reddit) Qwopus3.5-9B-coder is specially optimized and fine-tuned for high-performance 🤖 Agentic Coding, complex Tool Calling, and logical reasoning. 💡 Why the 9B Dense Model?
OpenAI has announced they will be winding down fine tuning. (www.reddit.com) Got an email today about the announcement. > OpenAI is winding down the fine-tuning API and platform.
Built a Japanese ASR benchmark because existing ones can't measure quality differences properly (www.reddit.com) Was fine-tuning a Japanese ASR model (based on Qwen3-ASR) to handle technical terminology better. The model clearly improved — "Next.js" comes out as "Next.js" instead of "ネクストジェイズ", punctuation works, etc.
How Unsloth and Nvidia made LLM training 25% faster on consumer GPUs (unsloth.ai via hn) Fine-tuning is one of today's most computationally intensive workloads, and it continues to push hardware to its limits. NVIDIA GPUs are purpose-built for these workloads: they break complex problems into pieces and process them in paralle…
Show HN: MemFactory: Unified Inference and Training Framework for Agent Memory (arxiv.org via hn) Memory-augmented Large Language Models (LLMs) are essential for developing capable, long-term AI agents. Recently, applying Reinforcement Learning (RL) to optimize memory operations, such as extraction, updating, and retrieval, has emerged…
Number-aware embeddings (www.reddit.com) If you look at the cosine sim between the embeddings of "a 500 hp car", "a 1,200 hp car" and "a 73 hp car", you'll soon see that embedding models have no sense of number ordering at all. (I tested Qwen and ModernBERT-based embeddings) It m…
How to Fine-Tune LLMs on AMD Strix Halo and Other Exotic AMD Hardware (www.reddit.com) After the first general fine-tuning tutorial i posted here (https://www.promptinjection.net/p/the-ultimate-llm-ai-fine-tuning-guide-tutorial) some people asked if i can't make the same for AMD Strix Halo because approach here is qu…
Fine-Tuning TranslateGemma-4B to improve bi-directional English & Welsh translations on an H200 GPU! (metalglot.com via reddit) Open source repo: https://github.com/grctest/finetuned-gemmatranslate-cy 5% of the fine-tuning took 40 minutes and cost a couple dollars to prove the process works. Looking forwards to Flash Attention v4 to leave beta, to test fine-tuning…
Llama models: still valuable for finetuning or surpassed by everything new? (www.reddit.com) Hello there people. So I have noticed that people are pretty much ignoring Llama 3 plus 3.1, 3.2, and 3.3 these days.
Finetuning Dataset: Claude Opus 4.6/4.7 - 8.7k Chats (www.reddit.com) https://huggingface.co/datasets/angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k A synthetic fine-tuning dataset created from Claude 4.6/4.7. 8,706 total examples all with reasoning.
losing my mind fine-tuning jina-v5 for a legal corpus (www.reddit.com) For the last month i've been trying to fine-tune jina-v5 (which has performed best on my corpus out of the box) on slovak law chunks, time and time again no matter what i do I can't get the model to learn nuance of slovak syntax. here's th…
Dropping learning rate fixed my Qlora fine-tune more than anything else i tried (www.reddit.com) Been fine-tuning llama 3.1 8b with Qlora for a classification task using about 8k samples. I was getting bad eval results for a while and kept thinking something was wrong with my data.
Show HN: LLM post-training to speak like GenZ, costing less than a cup of coffee (github.com via hn) GenZ LLM A post-trained language model that responds in GenZ slang, built on top of Qwen2.5-0.5B-Instruct using Supervised Fine-Tuning (SFT) followed by Reinforcement Learning with GRPO. The fine-tuned model is available on Hugging Face: a…
I have practically unlimited access to Opus and every other frontier model. I'd like to help contribute to a dataset. (www.reddit.com) No, I won't tell you how. No this is not for anyone who is not already a proven contributor to the fine-tuning space.
Show HN: ShadowPEFT – Centralized and Detachable Parameter-Efficient Fine-Tuning (github.com via hn) Unlike LoRA and its variants, which inject trainable parameters directly into the weights of the Transformer, requiring tight coupling with the backbone. ShadowPEFT instead enhances the frozen large base model by adding a lightweight, cent…
RTX PRO 5000 (48GB) vs MacBook Pro M5 MAX (128GB RAM) - The choice for fine-tuning & agentic coding (www.reddit.com)
I Kept a Diary for Seven Years. An LLM Finally Read It. (www.reddit.com) I've kept a personal diary since 2019. Last week I fed 200+ entries to an LLM and asked it how I've changed over 7 years.
Liquid AI releases fine-tuning harness for AI agents (lqh.ai via hn) Liquid Harness is an autonomous agent by Liquid AI that takes a plain-English spec and ships a fine-tuned Liquid Foundation Model. Spec, data, eval, training, deployment — all in one run.
Learn, run and test Agentic AI on your browser for free! (Built with Claude Opus 4.7 in 2 days) (www.reddit.com) Hey Everyone, Over the last few months, I noticed a massive gap in how we learn about Agentic AI. There are a million theoretical blog posts and dense whitepapers on RAG, tool calling, and swarms, but almost nowhere to just sit down, run a…
Fine-tuning and deploying Gemma 4 is not that easy (ghost.oxen.ai via hn) Writing a fine-tuning and deployment pipeline isn't as easy as it looks (Gemma 4 Version) Fine-tune and deploy Gemma 4 on Oxen.ai Google's Gemma 4 dropped in April 2026 with multimodal support (text, image, video, audio), a novel hybrid KV…
AI content detector based on Qwen 0.8b fine-tuned on Pangram dataset (www.reddit.com) I've fine-tuned Qwen 3.5 0.8B on the dataset provided by Pangram with their EditLens paper. It's available via a Chrome extension; you can just click selected text and it's going to give you the probability distribution of how likely it is…
Demo of fine-tuning Orpheus 3B on a TTS dataset in Transformer Lab (open source) (www.reddit.com) I'm part of the team building Transformer Lab, an open source ML research platform. We put together a short demo of how to run text to speech training, which you can do on your own hardware using a Local provider.
I Built a desktop app for generating LLM fine-tuning datasets — started it a week ago while learning FT (www.reddit.com) Hey, I've been building side projects with Claude Code for a few months, but I'm completely new to fine-tuning — started experimenting maybe a week ago. From day one I wanted a GUI for the dataset side of the workflow, so this desktop app…
Converting XQuery to SQL with Local LLMs: Do I Need Fine-Tuning or a Better Approach? (www.reddit.com)
Show HN: We're open sourcing Superlog (YC P26), an autonomous monitoring tool (github.com via hn) Hi HN! This is Arseniy from Superlog (YC P26).
SkillOpt – Executive Strategy for Self-Evolving Agent Skills (microsoft.github.io via hn) A skill is external state for an agent. Instead of fine-tuning a model or hand-maintaining prompts, SkillOpt runs the frozen agent on scored batches, asks a separate optimizer model to propose structured edits, and accepts a candidate only…
What workstation to get for ~13k EUR? (www.reddit.com) My use-cases will be to test open-weight LLMs and work on harnesses, inference systems and possibly other non-ML workflows (CS-related) in the future. Fine-tuning would not be something I do locally because I can rent a B200 from RunPod fo…
RAG vs. Fine-Tuning – The Question Every AI Builder Gets Wrong (thingswithai.org via hn) RAG vs. Fine-Tuning — The Question Every AI Builder Gets Wrong AI models don't know your private data.
RAG vs. Fine-Tuning: Which AI Strategy Saves Your Team Time and Budget (lightrains.com via hn) Two weeks before a Fortune 500 product launch, we told a client to scrap their fine-tuned model and rebuild with RAG instead. They lost eight weeks and $180K.
Best open-weight model to run locally on 8x A100 80GB for generating teacher data? (www.reddit.com) I have (free) access to a SLURM cluster with 8x NVIDIA A100 80GB GPUs (=640 GB VRAM) on a single task, and I want to run an open-weight model locally with llama.cpp for data generation, not coding. My use case is generating teacher data fo…
Show HN: I built a 2nd-order PyTorch optimizer for LLMs that runs on 16GB GPUs (news.ycombinator.com) Hi HN, I'm Danilo. I've been struggling with the limitations of AdamW when fine-tuning LLMs locally.
Findings: Gemma4 26B-A4B fine-tuning on a single RTX 4090 — 10 patches, benchmark, PCIELink path #1 (www.reddit.com) Summary of Findings This issue documents what we learned making Gemma4 26B-A4B-it train on consumer hardware (RTX 4090, 24GB VRAM). No A100.
Show HN: Rollquation – A Rolling-Ball Math Puzzle Game for Android (Solo Dev) (play.google.com via hn) Hey HN! I'm a solo dev and I just wanted to share my latest Android game — Rollquation.
Fine-tuning LLMs on 30M academic papers from ScholarAPI (scholarapi.net via hn) Global academic literature at your fingertips. Reliable Google Scholar alternative for large-scale access to academic PDFs and metadata, with full-text search and bulk download.
Learn from Your Mistakes: Tree-Like Self-Play for Secure Code LLMs (arxiv.org via hn) While Large Language Models (LLMs) excel in code generation, they remain prone to replicating subtle yet critical vulnerabilities endemic to their training data. Current alignment techniques, such as Supervised Fine-Tuning (SFT) and Reinfo…
Fine-tuning an LLM to write docs like it's 1995 (passo.uno via hn) In my predictions for 2030 I wrote that tech writers would be using specialized LLMs, running locally on powerful hardware. I see hints of this move to “local first” among engineering pundits, but we’re not there yet, in part because of ho…
Fine-Tuning for Engagement (robertdruska.com via hn) May 29, 2026 It’s been quite some time since major LLM providers introduced the behaviour that the chatbots often end their response with a question. The motivation is clear: more engagement, more data to train on.
Show HN: Hyper, the self driving company brain (heyhyper.ai via hn) Hey HN -- I'm Shalin, one of the cofounders of Hyper. My cofounder Kanyes and I have been power users of a lot of second brain type software like Notion and Obsidian for years, and tried fine-tuning GPT-2 back in 2020 to solve this exact p…
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models (www.computer.org via hn) I.Introduction Transformer-based PLMs [1],[2],[3],[4],[5] have demonstrated remarkable performance across a wide range of NLP tasks. To fully harness the potential of PLMs, fine-tuning is commonly employed to adapt them to task-specific da…
The LLM Fine-Tuning Guide (www.promptinjection.net via hn) The Ultimate LLM Fine-Tuning Guide From dataset to GGUF - every parameter explained, every step runnable Fine-tuning is a direct intervention into how a language model behaves. Not prompting, not system instructions, not RAG - actual weigh…
Personal continual learning for LLMs without GPU — position paper [OC] (www.reddit.com) I proposed two architectures for enabling LLMs to learn daily from personal interactions: Internal KV-Sphere Architecture (IKSA) Background Micro Fine-Tuning (BMFT) Both work with zero GPU and zero catastrophic forgetting. Full paper: in c…
About to start fine-tuning on RunPod. What should I know to not waste money? (www.reddit.com) I was MLOps lead at an AI company managing 5000+ GPUs across GCP and CoreWeave. Left to start my own thing and now I'm back to renting GPUs like everyone else.
Open-sourced our MCP server for GPU workload execution looking for feedback (www.reddit.com) Hey everyone I’m Jaguar, building Jungle Grid. We just open-sourced our MCP server for agentic GPU workload execution.
Introducing AI finetuner, Source available and free Claude skill to fine tune your vibe coded UI with live preview (www.reddit.com) Fine-tuning UI with AI right now: "Make the shadow softer." "Stronger." "No, less." "Go back." "A bit more." 17 messages later, you've spent more tokens than the shadow is soft. I built something that breaks the loop.
Model Spec Midtraining: Improving How Alignment Training Generalizes (alignment.anthropic.com via hn) We introduce model spec midtraining (MSM): after pre-training but before alignment fine-tuning, we train models on synthetic documents discussing their Model Spec. This shapes how models generalize from subsequent alignment training.
Anthropic just published new alignment research that could fix "alignment faking" in AI agents here's what it actually means (www.reddit.com) Anthropic's alignment team published a paper this week called Model Spec Midtraining (MSM) and I think it's one of the more practically interesting alignment results I've seen in a while. The core problem they're solving: Current alignment…
HELP - How to fine-tune an LLM to match academic writing style (www.reddit.com) I've been using LLMs to help write my thesis, but the output feels dry and uses awkward phrasing (especially in translation). I'm looking to fine-tune an accessible LLM to better match natural academic writing in my language.
Can LLMs create lasting flashcards from readers' highlights? (memory-machines.com via hn) We tested prompting, fine-tuning, RL, and grounded evaluation across ~1,500 labeled flashcards—and found models catch obvious misses but not plausible failures.
Run, Learn and test Agentic AI for free, on your browser! (Open AI Models are included) (www.reddit.com) Hey Everyone, Over the last few months, I noticed a massive gap in how we learn about Agentic AI. There are a million theoretical blog posts and dense whitepapers on RAG, tool calling, and swarms, but almost nowhere to just sit down, run a…
Research note: Fine-tuning experiments on CoT controllability (metr.org via hn) We find that a small amount of fine-tuning on instruction following in the CoT generalizes to meaningful increases in CoT controllability on an out-of-distribution set of tasks. We fine-tune four reasoning models on small datasets of instr…
A weekend with LoRA on Gemma 4 E2B: instrumenting what fine-tuning changes (aiexplr.com via hn) Spent a week doing LoRA fine-tuning on Gemma 4 E2B (~5.1B total params, ~2B active in text decoder) for a narrow Python code-generation task. Bad outputs went from ~5% to 0% (greedy) and 1.5% (sampled) across 134 tests.
Show HN: ClickMVP – Deterministic full-stack code generation (no LLMs) (app.clickmvp.com via hn) I've built software for clients for 38 years and kept hitting the same wall: weeks spent scaffolding the data layer and the Clean Architecture around it before any real work begins. I asked Claude to estimate how long it would take to gene…
Pioneer: Vibetune Your LLMs (pioneer.ai via hn) +30% avg accuracy lift on classification & extraction tasks vs. base Gemma ~7 days until your first auto-improvement run lands in production 0 lines of fine-tuning code you have to write, ever $0/retrain starting price.
LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (www.gilesthomas.com via hn)
Which kind of base/fine-tunes have you done? And which data did you use? (www.reddit.com)
[Release] Swedish Construction FAQ — 503 bilingual (SV+EN) Q&As for fine-tuning, CC BY 4.0, now on HF / PyPI / Kaggle / Zenodo (www.reddit.com) I've been building an open Q&A dataset for the Swedish construction industry (byggbransch) over the last few weeks — something that's been a gap in Swedish-language domain-specific datasets. Finally hit a milestone worth sharing.
Friday, self-evolving assistant, only CC $100 plan, no agent framework (github.com via hn) Friday — A 24/7 AI Assistant Built Entirely on Claude Code An always-on personal AI system using only Claude Code CLI ($100/month) and Telegram — no custom AI, no cloud VMs, no fine-tuning. Live page: missingus3r.github.io/friday-showcase…
A guide to model quantization in fine-tuning (and how to pick the right GGUF) (www.siquick.com via hn) A guide to model quantization in fine-tuning (and how to pick the right GGUF) About this post Fine-tuning with Unsloth and Axolotl is, on the whole, a well thought-out experience where a lot of the complexity is handled for you. However on…
Gemopus: A Gemma fine-tune that prioritizes stability over long chain-of-thought (huggingface.co via hn) 🌟 Gemopus-4-26B-A4B-it [!NOTE] Gemopus is an attempt at fine-tuning Gemma 4 with a core philosophy of "stability first". While preserving the original reasoning order of Gemma 4 as much as possible, we conducted targeted refinements for an…
How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions (arxiv.org) Financial transaction processing requires extracting structured merchant information from noisy, abbreviated bank transaction strings at scale. Our current production system, a LoRA-fine-tuned LLaMA 3.1-8B, achieves 96.95% F1 on this task,…
Phantom transitions in language model fine-tuning (arxiv.org)
A Mechanistic Analysis of Adversarial Fine-tuning of Vision Transformers (arxiv.org)
Subtitle-Aligned Fine-Tuning of Whisper for Swiss German ASR: Benchmark Contamination, Convention Mismatch, and an Honest Baseline at 25.6% WER (13.8% cWER) (arxiv.org)
Single-Cell Cross-Modal Transfer by Adversarial Fine-Tuning of Foundation Models (arxiv.org)
Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data (arxiv.org)
FiberTune: Preserving Action-Fiber Visual Residuals in Vision-Language-Action Fine-Tuning (arxiv.org)
Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan (arxiv.org)
Self-Mined Hardness for Safety Fine-Tuning (arxiv.org)
AlignFed: Alignment-Aware Asynchronous Federated Fine-Tuning for Large Language Models in Heterogeneous Edge Environments (arxiv.org)
PriFT: Prior-Support Guided Supervised Fine-Tuning (arxiv.org)
AutoTail-BSFGM: Class-Balance-Aware Fine-Tuning for Chinese Scholarly Text Classification (arxiv.org)
Shortcuts in the Tail: Debiasing via Post-Hoc Spectral Compression of Fine-Tuning Updates (arxiv.org)
Curvature-Guided LoRA: Matching Full Fine-Tuning in Function Space (arxiv.org)
Domain-Adapted Small Language Models with Hybrid Post-Processing: Achieving Cost-Efficient, Low-Latency Multi-Label Structured Prediction via LoRA Fine-Tuning on Scarce Data (arxiv.org)
SafeGene: Reusable Adapters for Transferable Safety Alignment (arxiv.org) Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make models more vulnerable to malicious prompts, even when the training data is not intentionally harmful.…
The Fine-Tuning Trap: Evaluating Negative Transfer and the Role of PEFT in Sub-1B Mathematical Reasoning (arxiv.org) Deploying Small Language Models (SLMs) on edge devices requires efficient fine-tuning strategies that adapt models to new tasks without degrading their general capabilities. In this study, we benchmark five sub-1B models (135M-1B) on mathe…
Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines (arxiv.org)
RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning (arxiv.org)
SERNF: Sample-Efficient Real-World Dexterous Policy Fine-Tuning via Action-Chunked Critics and Normalizing Flows (arxiv.org)
Multilingual Fine-Tuning via Localized Gradient Conflict Resolution (arxiv.org) The rapid evolution of Large Language Models (LLMs) has established cross-lingual versatility as a defining feature of modern systems. However, fine-tuning these models frequently induces negative interference across languages.
Emotion-Aware Image Generation from Korean Diary Text via LLM-based Prompt Translation and LoRA Fine-Tuning (arxiv.org) T2I models cannot effectively capture sentiment from various types of text, including diaries, as they primarily focus on visual object-related patterns rather than contextual emotional understanding. This paper proposes an emotion-aware t…
ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models (arxiv.org)
(Mis)generalization of Helpful-only Fine-tuning (arxiv.org)
Few Tokens, Big Leverage: Preserving Safety Alignment by Constraining Safety Tokens during Fine-tuning (arxiv.org)
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation (huggingface.co) Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation Motivation NVIDIA Cosmos Predict 2.5 is a large-scale world model capable of generating physically plausible videos conditioned on text, images, or video clips…
I drew the entire AI stack on one page... and it's mostly not models. (www.reddit.com) Most "AI progress" talk lives on one layer: models. Bigger model, smaller model, new benchmark, repeat.
Realistically, what is the best use of consumer hardware for AI? (www.reddit.com) I want to move past the "democratization" slogans. What is the most practical contribution consumer-grade hardware can make to the ecosystem right now?
MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required (huggingface.co) MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required The Idea Medical question answering is one of those tasks where the stakes are genuinely high. A model that confidently picks the wrong answer on a clinical MCQ isn't just wro…
Three lessons from fine-tuning a 5B code assistant — bad outputs from 5% → 0% (www.reddit.com) Spent a week doing LoRA fine-tuning on Gemma 4 E2B (gemma-4-e2b-it, ~5.1B total params, ~2B active in the text decoder) for a narrow Python code-generation task. Setup: Model: Gemma 4 E2B, bf16, language_model only (vision + audio towers f…
Hardware choice (www.reddit.com) We want to set up the following: A Local LLM environment for AI development, used by multiple software developers Infrastructure for training Vision AI models Capabilities for AI model fine-tuning I’m currently struggling to decide between…
We open-sourced Chaperone-Thinking-LQ-1.0 — a 4-bit GPTQ + QLoRA fine-tuned DeepSeek-R1-32B that hits 84% on MedQA in ~20GB (www.reddit.com) Hey everyone, We just open-sourced our reasoning model, Chaperone-Thinking-LQ-1.0, on Hugging Face. It's built on DeepSeek-R1-Distill-Qwen-32B but goes well beyond a simple quantization — here's what we actually did: The pipeline: 4-bit GP…
An Alignment Experiment: Native LLM vs. Custom Engine on Classical Naming. The statistical inertia is real. (www.reddit.com)
[Project] I benchmarked my custom 2nd-order optimizer against AdamW across 1M, 5M, and 10M parameters. Here are the raw test results and scaling laws. (www.reddit.com)
Distilled my AI Agents and Skills definitions (www.reddit.com) I have significantly distilled my AI Agents and Skills definitions. My goal is to reduce the context size and token usage without impacting the quality of my development team.
DGX Spark users: What's the easiest way to do multi-node vLLM clustering with a browser UI and training? (www.reddit.com) Hey r/LocalLLaMA, I've been running a small 4-node DGX Spark cluster on a 400µT fabric switch and got frustrated with the usual raw Ray/vLLM scripts and EXO basically ignoring pure NVIDIA paths. I started from the solid foundation in [eugr…
Curiosity about Chatterbox's architecture led me to fine-tune it for 8 Indian languages by LoRA, using 1.4% params (www.reddit.com) TL;DR: Fine-tuned Chatterbox-Multilingual for Telugu, Kannada, Bengali, Tamil, Malayalam, Marathi, Gujarati, and Hindi using LoRA adapters + tokenizer extension. Only 7.8M / 544M parameters trained.
I open-sourced media-tsunami — a tool that extracts your brand voice into a CLAUDE.md any LLM can load (www.reddit.com) Your brand voice is probably a PDF nobody reads, or it's trapped in one founder's head, or it's scattered across a thousand ChatGPT histories. I wanted to treat it like code instead — a file you can version, share, diff, and plug into any…
20x Faster TRL Fine-tuning with RapidFire AI (huggingface.co)
(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware (huggingface.co)
Building smarter maps with GPT-4o vision fine-tuning (openai.com)
Argilla 2.4: Easily Build Fine-Tuning and Evaluation Datasets on the Hub — No Code Required (huggingface.co)
Introducing vision to the fine-tuning API (openai.com)
Fine-tuning LLMs to 1.58bit: extreme quantization made easy (huggingface.co)
Fine-tuning GPT-4o webinar (openai.com)
LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning? (huggingface.co)
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models (huggingface.co)
Introducing improvements to the fine-tuning API and expanding our custom models program (openai.com)
Fine-Tuning Gemma Models in Hugging Face (huggingface.co)
Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL (huggingface.co)
Fine-tuning Llama 2 70B using PyTorch FSDP (huggingface.co)
OpenAI partners with Scale to provide support for enterprises fine-tuning models (openai.com)
GPT-3.5 Turbo fine-tuning and API updates (openai.com)
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU (huggingface.co)
Parameter-Efficient Fine-Tuning using 🤗 PEFT (huggingface.co)
Fine-tuning GPT-3 to scale video creation (openai.com)
Accelerating PyTorch distributed fine-tuning with Intel technologies (huggingface.co)