/ML

How Uber Optimizes The Timing Of Push Notifications Using ML And Linear Programming

tl;dr: "We introduced a system we call the Consumer Communication Gateway: a centralized intelligence layer to manage the quality, ranking, timing, and frequency of push notifications on a user level."

featured in #383


How GPT3 Works - Visualizations And Animations

- Jay Alammar tl;dr: "The dataset of 300 billion tokens of text is used to generate training examples for the model. For example, these are three training examples generated from the one sentence at the top. You can see how you can slide a window across all the text and make lots of examples."

featured in #378
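
The sliding-window idea quoted in the tl;dr above can be sketched in a few lines. This is only an illustration: the whitespace-split tokens, the `context_size` parameter, and the function name are assumptions made here, not details from the article (GPT-3 itself works on subword tokens).

```python
def sliding_window_examples(tokens, context_size):
    """Build (context, next_token) pairs by sliding a window over the tokens."""
    examples = []
    for end in range(1, len(tokens)):
        # Keep at most `context_size` tokens of left context.
        start = max(0, end - context_size)
        examples.append((tokens[start:end], tokens[end]))
    return examples


# Hypothetical toy input: whitespace tokens stand in for real subword tokens.
tokens = ["a", "robot", "must", "obey", "orders"]
examples = sliding_window_examples(tokens, context_size=3)

for context, target in examples:
    print(context, "->", target)
```

Each position in the text yields one example, which is how a single sentence turns into several training pairs.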


Copilot Internals

- Parth Thakkar tl;dr: "In this post, I try to answer specific questions about the internals of Copilot, while also describing some interesting observations I made as I combed through the code. I will provide pointers to the relevant code for almost everything I talk about, so that interested folks can take a look at the code themselves."

featured in #376


Improving Instagram Notification Management With Machine Learning And Causal Inference

- Nailong Zhang tl;dr: "The key to solving this problem is figuring out the incremental value of sending a daily digest notification compared to not sending... For some cohorts, they would be active without receiving the daily digest notifications and thus the incremental values would be small; selecting these cohorts to send the digest notifications is inefficient and may even spam these users."

featured in #366
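
One simple way to read "incremental value" in the tl;dr above is the difference in activity rate between users who received the digest and comparable users who did not. The sketch below uses that difference-in-means view with made-up cohort counts and a hypothetical 0.05 threshold; it is not the causal model described in the article.

```python
def incremental_value(treated_active, treated_total, control_active, control_total):
    """Activity rate with the notification minus activity rate without it."""
    return treated_active / treated_total - control_active / control_total


# Hypothetical counts: (treated_active, treated_total, control_active, control_total).
cohorts = {
    "already_active": (900, 1000, 890, 1000),  # active either way, small lift
    "occasional": (400, 1000, 250, 1000),      # noticeably more active when notified
}

uplift = {name: incremental_value(*counts) for name, counts in cohorts.items()}

# Send only to cohorts whose estimated lift clears the (assumed) threshold.
selected = [name for name, value in uplift.items() if value > 0.05]
print(selected)
```

Under these invented numbers only the "occasional" cohort is selected, which matches the quoted point that notifying already-active cohorts is inefficient.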


RecSysOps: Best Practices for Operating a Large-Scale Recommender System

- Ehsan Saberian, Justin Basilico tl;dr: "In this blog post, we introduce RecSysOps a set of best practices and lessons that we learned while operating large-scale recommendation systems at Netflix. These practices helped us to keep our system healthy while: (1) reducing our firefighting time, (2) focusing on innovations and (3) building trust with our stakeholders."

featured in #360


What I Learned Building Platforms At Stitch Fix

tl;dr: "I was lucky enough to spend the last six years focusing on “engineering for data science” and learning to build great platforms." Stefan guides us through five lessons he learned: (1) Focus on adoption, not completeness. (2) Your users are not all equal. (3) Abstract away the internals of your system. (4) Live your users’ life cycle. (5) The two-layer API trick.

featured in #359


Machine Learning For Fraud Detection in Streaming Services

tl;dr: "Many users across many platforms make for a uniquely large attack surface that includes content fraud, account fraud, and abuse of terms of service. Detection of fraud and abuse at scale and in real-time is highly challenging."

featured in #355


How The New York Times Uses Machine Learning To Make Its Paywall Smarter

- Rohit Supekar tl;dr: "When the paywall was launched, the meter limit was the same for all users. However, as The Times has transformed into a data-driven digital company, we are now successfully using a causal machine learning model called the Dynamic Meter to set personalized meter limits and to make the paywall smarter."

featured in #345


Introducing Natural Language Search For Podcast Episodes

- Alexandre Tamborrino tl;dr: "To enable users to find more relevant content with less effort, we started investigating a technique called Natural Language Search, also known as Semantic Search. In a nutshell, Natural Language Search matches a query and a textual document that are semantically correlated instead of needing exact word matches. It matches synonyms, paraphrases, etc., and any variation of natural language that express the same meaning."  

featured in #336
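
The idea of matching by meaning rather than exact words, as described in the tl;dr above, is commonly implemented by comparing embedding vectors. The sketch below assumes that approach and uses tiny hand-written vectors in place of real model embeddings; the episode titles, the query, and all numbers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical 3-dimensional "embeddings"; a real system would get these from a model.
episodes = {
    "Tips for sleeping better": [0.9, 0.1, 0.0],
    "History of electric cars": [0.0, 0.2, 0.9],
}

# Stand-in vector for the query "how to fix insomnia", which shares no words with either title.
query_vector = [0.8, 0.2, 0.1]

best = max(episodes, key=lambda title: cosine(query_vector, episodes[title]))
print(best)
```

The query is matched to the sleep episode purely because the vectors point in a similar direction, not because any word overlaps.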


The Berkeley Crossword Solver

tl;dr: "The BCS uses a two-step process to solve crossword puzzles. First, it generates a probability distribution over possible answers to each clue using a question answering (QA) model; second, it uses probabilistic inference, combined with local search and a generative language model, to handle conflicts between proposed intersecting answers."

featured in #331