Show HN: A book that builds GPT-2, Llama 3, DeepSeek from scratch in PyTorch

hn · news.ycombinator.com · 2 pts · 1 reply · 1d

I'm a software engineer who works with LLMs professionally (Forward Deployed Engineer at TrueFoundry). Over the past year I built up implementations of five LLM architectures from scratch and wrote a book around them.

deepseek · llama · openai
