Scaling The Instagram Explore Recommendations System
- Vladislav Vorotilov, Ilnur Shugaepov tl;dr: Instagram has introduced a multi-stage approach to ranking: retrieval, first-stage ranking, second-stage ranking, and final re-ranking. The system leverages caching and pre-computation with a Two Tower neural network, making it more flexible and scalable. Techniques such as Two Tower retrieval, user interaction history, and parameter tuning - including Bayesian optimization and offline tuning - are employed. The article emphasizes how clever use of caching and pre-computation allows for heavier models in the later ranking stages, and concludes with a note on the system's ongoing complexity and future improvements.
featured in #439
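The core idea behind Two Tower retrieval can be sketched briefly. This is not Instagram's implementation - all layer sizes, weights, and the single-layer tower are illustrative assumptions - but it shows why the approach caches well: item embeddings can be pre-computed offline, leaving only a cheap top-k dot-product search at request time.

```python
# Minimal Two Tower retrieval sketch (hypothetical shapes and weights):
# each tower maps raw features to an embedding; retrieval scores items
# by dot product against the user embedding and keeps the top k.
import numpy as np

rng = np.random.default_rng(0)

def tower(features, weights):
    """One-layer 'tower': linear projection + ReLU, then L2-normalize."""
    h = np.maximum(features @ weights, 0.0)
    return h / (np.linalg.norm(h, axis=-1, keepdims=True) + 1e-9)

# Hypothetical dimensions: 16 raw features -> 8-dim embedding.
W_user = rng.normal(size=(16, 8))
W_item = rng.normal(size=(16, 8))

user_emb = tower(rng.normal(size=(16,)), W_user)        # computed per request
item_embs = tower(rng.normal(size=(1000, 16)), W_item)  # pre-computable offline

# Retrieval: score all candidate items, keep the 5 highest-scoring.
scores = item_embs @ user_emb
top_k = np.argsort(scores)[::-1][:5]
```

Because the two towers only interact through the final dot product, the item tower's outputs never depend on the current user and can be refreshed in batch, which is what makes the heavier models in later stages affordable.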
Leveraging Real-Time User Actions To Personalize Etsy Ads
- Alaa Awad, Denisa Roberts tl;dr: Etsy has introduced a novel approach to personalizing machine learning models by encoding and learning from short-term sequences of user actions. This is achieved through a three-component deep learning module known as the adSformer Diversifiable Personalization Module (ADPM). The module aims to improve the relevance of sponsored listings to the user's intent and is applied to the clickthrough rate and post-click conversion rate prediction models.
featured in #434
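The general idea of encoding a short-term action sequence into a feature a downstream model can consume might look like the following. This is a deliberate simplification, not Etsy's ADPM architecture: the vocabulary size, embedding dimension, and mean-pooling encoder are all illustrative assumptions.

```python
# Hedged sketch: turn a user's few most recent item interactions into a
# single personalization vector via an embedding lookup and mean pooling.
# All sizes here are hypothetical, chosen only for the example.
import numpy as np

rng = np.random.default_rng(7)

VOCAB, DIM = 1000, 32              # hypothetical item vocabulary / embedding size
item_table = rng.normal(size=(VOCAB, DIM))

def encode_recent_actions(item_ids):
    """Embed each recently acted-on item and mean-pool into one vector."""
    vecs = item_table[np.asarray(item_ids)]
    return vecs.mean(axis=0)

# e.g. the user's last three clicks/views, as item IDs
user_vec = encode_recent_actions([12, 407, 53])
```

A CTR model would then take `user_vec` alongside its other features, letting recent in-session behavior shift predictions without retraining the whole model.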
RLHF: Reinforcement Learning From Human Feedback
- Chip Huyen tl;dr: “How exactly does RLHF work? Why does it work?” Chip discusses the answers to these questions. “RL has been notoriously difficult to work with, and therefore, mostly confined to gaming and simulated environments. Just five years ago, both RL and NLP were progressing pretty much orthogonally – different stacks, different techniques, and different experimentation setups. It’s impressive to see it work in a new domain at a massive scale.”
featured in #414
Real World Recommendation System – Part 1
- Nikhil Garg tl;dr: “The goal of this publication is to start from the basics, explain nuances of all the moving layers, and describe this universal recommendation system architecture.”
featured in #411
Twitter's Recommendation Algorithm
tl;dr: Twitter's recommendation algorithm distills the roughly 500 million tweets posted daily down to a handful of top tweets shown on your device, selected specifically for you. This blog post is an introduction to how the algorithm works.
featured in #403
Demand And ETR Forecasting At Airports
tl;dr: The engineering team at Uber discusses how it tackles the undersupply / oversupply issue at airports by forecasting supply balance and optimizing resource allocation. The team built new models for demand forecasting and effective queue length on top of the Michelangelo platform, and integrated them with the current Driver app.
featured in #400
Online Gradient Descent Written In SQL
- Max Halford tl;dr: Max implements an ML algorithm inside a relational database, using SQL. Some databases allow inference with an already trained model; training the model in the database as well would remove the need for a separate inference / training service altogether. Max attempts this with the Online Gradient Descent algorithm.
featured in #398
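Online Gradient Descent itself is small enough to sketch. The version below is in Python rather than SQL (the post's actual medium), and the learning rate, feature count, and synthetic data are assumptions for illustration, but the per-row update is the same idea: one weight update per incoming record, so training can run as a stream rather than a batch job.

```python
# Online gradient descent for least-squares regression: the model sees
# each (features, target) row once and updates its weights immediately.
import random

def ogd(rows, lr=0.1, n_features=2):
    w = [0.0] * n_features
    for x, y in rows:
        y_hat = sum(wi * xi for wi, xi in zip(w, x))
        err = y_hat - y
        # Gradient of 0.5 * (y_hat - y)^2 w.r.t. w is err * x.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

random.seed(42)
# Synthetic noiseless stream: y = 3*x0 - 2*x1, revealed one row at a time.
stream = []
for _ in range(500):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    stream.append((x, 3 * x[0] - 2 * x[1]))

w = ogd(stream)  # weights approach [3, -2]
```

Since each update touches only the current row and the running weight vector, the same logic can be expressed as a recursive or windowed SQL query over the table of training rows, which is the trick the article explores.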
Scaling Media Machine Learning At Netflix
tl;dr: Netflix’s goal in building ML infrastructure is to reduce the time from ideation to productization for the company. The team built infrastructure to (1) access and process media data (e.g. video, image, audio, and text), (2) train large-scale models efficiently, (3) productize models in a self-serve fashion, and (4) store and serve model outputs for consumption.
featured in #396
What Is ChatGPT Doing … and Why Does It Work?
- Stephen Wolfram tl;dr: “My purpose here is to give a rough outline of what’s going on inside ChatGPT—and then to explore why it is that it can do so well in producing what we might consider to be meaningful text. I should say at the outset that I’m going to focus on the big picture of what’s going on—and while I’ll mention some engineering details, I won’t get deeply into them.”
featured in #390
Accelerating Our A/B Experiments With Machine Learning
- Michael Wilson tl;dr: "Dropbox runs experiments that compare two product versions — A and B — against each other to understand what works best for our users. When a company generates revenue from selling advertisements, analyzing these A/B experiments can be done promptly; did a user click on an ad or not? However, at Dropbox we sell subscriptions, which makes analysis more complex. What is the best way to analyze A/B experiments when a user’s experience over several months can affect their decision to subscribe?"
featured in #385