featured in #405
The Data Science Interview Book
tl;dr: "This book does not cover the topics in depth, it covers just enough to get you ready for the interview. The assumption here is that the person using it is already familiar with the topic and is here to brush up on the same. Additional resources for someone eager to explore the topic in depth is added. In short, don’t use this as text book, use it as a revision note."
featured in #364
Introduction To Streaming For Data Scientists
- Chip Huyen tl;dr: "With luck you shouldn’t have to build or maintain a streaming system yourself. Your company should have infrastructure to help you with this. However, understanding where streaming is useful and why streaming is hard could help you evaluate the right tools and allocate sufficient resources for your needs."
featured in #342
Data Mesh — A Data Movement and Processing Platform @ Netflix
tl;dr: "As the system evolves to solve more and more use cases, we have expanded its scope to handle not only the CDC use cases but also more general data movement and processing use cases:" (1) Events can be sourced from more generic applications. (2) The catalog of available DB connectors is growing. (3) More processing patterns, such as filter, projection, union, join, etc.
featured in #341
Stop Aggregating Away The Signal In Your Data
- Zan Armstrong tl;dr: "Aggregation is the standard best practice for analyzing time series data, but it can create problems by stripping away crucial context so that you’re not even aware of how much potential insight you’ve lost. In this article, I’ll start by discussing how aggregation can be problematic, before walking through three specific alternatives to aggregation with before / after examples."
featured in #339
Organizing And Scaling An Effective Data Team
- Rob Dearborn tl;dr: The scope of a data team should include: (1) Ensuring focus on the right hierarchy of input & output metrics. (2) Steering the roadmap through insightful analysis & research. (3) Driving optimization through experimentation and ML. (4) Developing and maintaining data infrastructure. Rob outlines how the data team, and its function within a startup, should evolve as the startup grows.
featured in #302
Algorithms For Decision Making
- Mykel Kochenderfer, Tim Wheeler, Kyle Wray tl;dr: "This book provides a broad introduction to algorithms for decision making under uncertainty. We cover a wide variety of topics related to decision making, introducing the underlying mathematical problem formulations and the algorithms for solving them."
featured in #299
featured in #293
Data To Engineers Ratio: US vs Europe
- Mikkel Dengsøe tl;dr: "The median data to engineers ratio for the US companies I looked at is 1:7 compared to 1:4 for European companies. And the design to engineers ratio is 1:9 for both groups. This post gives some answers to why this is but also leaves some questions unanswered."
featured in #282
What Is The Right Level Of Specialization? For Data Teams And Anyone Else
- Erik Bernhardsson tl;dr: The specialization of data teams into many different roles (e.g. data scientist, data engineer, analytics engineer, ML engineer) is "generally a bad thing driven by the fact that tools are bad and too hard to use." He elaborates on this stance in the post.
featured in #255