How I Use LLMs

#AI

tl;dr: “The example-driven, practical walkthrough of Large Language Models and their growing list of related features, as a new entry to my general audience series on LLMs. In this more practical followup, I take you through the many ways I use LLMs in my own life.”

featured in #596

Let's Reproduce GPT-2 (124M)

#GPT
#LLM

tl;dr: “We reproduce the GPT-2 (124M) from scratch. This video covers the whole process: First we build the GPT-2 network, then we optimize its training to be really fast, then we set up the training run following the GPT-2 and GPT-3 paper and their hyperparameters, then we hit run, and come back the next morning to see our results, and enjoy some amusing model generations.”

featured in #523

Let's Build The GPT Tokenizer

#GPT
#Video

tl;dr: “In this lecture we build from scratch the Tokenizer used in the GPT series from OpenAI. In the process, we will see that a lot of weird behaviors and problems of LLMs actually trace back to tokenization. We'll go through a number of these issues, discuss why tokenization is at fault, and why someone out there ideally finds a way to delete this stage entirely.”

featured in #491

Let's Build GPT: From Scratch, In Code, Spelled Out

#AI
#GPT
#Video

tl;dr: “We build a GPT, following the paper ‘Attention is All You Need’ and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write a GPT.”

featured in #382

/Andrej Karpathy