Engineering Practices For LLM Application Development
- David Tan, Jesse Wang tl;dr: “LLM engineering involves much more than just prompt design or prompt engineering. In this article, we share a set of engineering practices that helped us deliver a prototype LLM application rapidly and reliably in a recent project. We'll share techniques for automated testing and adversarial testing of LLM applications, refactoring, as well as considerations for architecting LLM applications and responsible AI.” featured in #489
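As a flavor of what automated testing of nondeterministic LLM output can look like, here is a minimal Python sketch. The generate_summary wrapper and the sample text are illustrative assumptions, not taken from the article; the point is to assert on properties of the response rather than on exact strings.

    import re

    def generate_summary(text: str) -> str:
        """Hypothetical wrapper around your LLM client; replace with a real call."""
        raise NotImplementedError

    def test_summary_properties():
        source = "The 2019 report covers revenue, churn, and headcount across EMEA."
        summary = generate_summary(source)

        # Exact-match assertions are brittle against nondeterministic output,
        # so check properties of the response instead.
        assert len(summary) < len(source)              # it should actually condense the input
        assert "2019" in summary                       # key facts should survive
        assert not re.search(r"(?i)as an ai language model", summary)  # no boilerplate leakage

Adversarial tests can follow the same shape, swapping the source text for prompt-injection attempts and asserting the response ignores or refuses them.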
The Pain Points Of Building A Copilot
- Austin Henley tl;dr: What are the pain points of building copilots, and where are the opportunities for tooling? “We conducted semi-structured interviews with 26 developers from a variety of companies that are working on copilots. We analyzed their responses to identify themes. Then we conducted two focus group sessions with tool builders that involved reviewing our interview findings and brainstorming possible solutions.” Austin shares the results here. featured in #486
Navigating The Chaos: Why You Don’t Need Another MLOps Tool
tl;dr: AI/ML development lacks systematic processes, leading to errors and biases in deployed models. The MLOps landscape is fragmented, and teams need to glue together a ton of bespoke and third-party tools to meet basic needs. We don’t think you should have to, so we're building Openlayer to condense and simplify AI evaluation. featured in #469
Effortless Engineering: Quick Tips for Crafting Prompts
- Michael Sickles tl;dr: "This blog will walk you through building out different prompts, exploring the outputs, and optimizing them for better results. Even though we can't guarantee outputs, we can still measure how the prompt is doing in various ways." featured in #461
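One simple way to measure how a prompt is doing, sketched here under assumptions of my own (the rubric and sample text are illustrative, not from the post), is to score each response against a small checklist and compare prompt variants by score.

    def score_response(response: str, required_terms: list[str], max_words: int) -> float:
        """Toy rubric: fraction of required terms mentioned, zeroed if the answer runs long."""
        if len(response.split()) > max_words:
            return 0.0
        hits = sum(term.lower() in response.lower() for term in required_terms)
        return hits / len(required_terms)

    # Score a model response against the rubric; rerun this across prompt variants
    # and keep whichever prompt scores highest most consistently.
    response = "Refunds are available within 30 days of purchase with proof of receipt."
    print(score_response(response, ["refund", "30 days", "receipt"], max_words=120))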
Embeddings: What They Are And Why They Matter
- Simon Willison tl;dr: “Embeddings are based around one trick: take a piece of content — in this case a blog entry — and turn that piece of content into an array of floating point numbers.” Simon shows us what this looks like and argues that we can learn interesting things about the content this way: “it might capture colors, shapes, concepts or all sorts of other characteristics of the content that has been embedded.” Simon also walks through practical use cases of where embeddings show up. featured in #459
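To make the “array of floating point numbers” idea concrete, here is a small sketch of comparing content by cosine similarity. The toy_embed function is a deliberately naive letter-frequency stand-in so the snippet runs on its own; a real embedding comes from a model and captures meaning rather than spelling.

    import math
    from collections import Counter

    def toy_embed(text: str) -> list[float]:
        # Stand-in vectorizer: letter frequencies, purely so we have floats to compare.
        counts = Counter(c for c in text.lower() if c.isalpha())
        return [counts.get(chr(i), 0) / max(len(text), 1) for i in range(ord("a"), ord("z") + 1)]

    def cosine_similarity(a: list[float], b: list[float]) -> float:
        # Content that is "close" in embedding space scores near 1.0.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    entries = ["Shipping a static blog with Python", "Baking sourdough at home"]
    query = toy_embed("python site generators")
    print(sorted(entries, key=lambda e: cosine_similarity(query, toy_embed(e)), reverse=True))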
So We Shipped An AI Product. Did it Work?
- Phillip Carter tl;dr: “Like many companies, earlier this year we saw an opportunity with LLMs and quickly but thoughtfully started building a capability. About a month later, we released Query Assistant to all customers as an experimental feature. We then iterated on it, using data from production to inform a multitude of additional enhancements, and ultimately took Query Assistant out of experimentation and turned it into a core product offering. However, getting Query Assistant from concept to feature diverted R&D and marketing resources, forcing the question: did investing in LLMs do what we wanted it to do?” featured in #454
LLMs Demand Observability-Driven Development
- Charity Majors tl;dr: “Many software engineers are encountering LLMs for the very first time, while many ML engineers are being exposed directly to production systems for the very first time. Both types of engineers are finding themselves plunged into a disorienting new world—one where a particular flavor of production problem they may have encountered occasionally in their careers is now front and center. Namely, that LLMs are black boxes that produce nondeterministic outputs and cannot be debugged or tested using traditional software engineering techniques. Hooking these black boxes up to production introduces reliability and predictability problems that can be terrifying.” Charity believes that the integration of LLMs will necessitate a shift in development practices, particularly towards Observability-Driven Development, to handle the nondeterministic nature of these models. featured in #450
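As one concrete flavor of observability-driven development around an LLM call, here is a minimal sketch using OpenTelemetry tracing. The span name, attribute names, and the call_model wrapper are assumptions of mine, not taken from the article or any vendor's instrumentation; the idea is that every production request records its inputs, settings, and output so nondeterministic behavior can be queried after the fact.

    from opentelemetry import trace

    tracer = trace.get_tracer("llm-app")
    MODEL_NAME = "example-model"  # illustrative placeholder

    def call_model(prompt: str) -> str:
        """Hypothetical wrapper around your LLM client; replace with a real call."""
        raise NotImplementedError

    def answer_question(question: str) -> str:
        # Wrap the nondeterministic call in a span so we can later ask
        # "what did the model see, and what did it say?" for any production request.
        with tracer.start_as_current_span("llm.completion") as span:
            span.set_attribute("llm.model", MODEL_NAME)
            span.set_attribute("llm.prompt", question)
            completion = call_model(question)
            span.set_attribute("llm.completion", completion)
            return completion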
Build And Keep Your Context Window
- Vicki Boykis tl;dr: “When humans have no external context as guardrails, we end up recreating what’s already been done or, on the other hand, throwing away things that work and glomming onto hype without substance. This is a real problem in production data systems. In order to do this, we need to understand how to build one.” Vicki believes that we must understand the historical context of our engineering decisions if we are to be successful in this brave new LLM world. featured in #448
Lessons From Building A Domain-Specific AI Assistant
- Eric Liu tl;dr: Eric Liu, Engineer at Airplane, discusses how the Airplane team built a domain-specific AI assistant, the lessons they learned along the way, and what's next for the future of AI assistants. featured in #447