/Data

Introducing Netflix’s Key-Value Data Abstraction Layer

tl;dr: “In this post, we dive deep into how Netflix’s KV abstraction works, the architectural principles guiding its design, the challenges we faced in scaling diverse use cases, and the technical innovations that have allowed us to achieve the performance and reliability required by Netflix’s global operations.”

featured in #552


Building And Scaling Notion’s Data Lake

tl;dr: “In the past three years Notion’s data has expanded 10x due to user and content growth, with a doubling rate of 6-12 months. Managing this rapid growth while meeting the ever-increasing data demands of critical product and analytics use cases, especially our recent Notion AI features, meant building and scaling Notion’s data lake. Here’s how we did it.”

featured in #533


Data Loaders For The Win

- Allison Horst tl;dr: Slow data apps hinder data exploration by viewers and developers, leaving insights on the table. See how data loaders can help you speed up data apps by pushing bulky data access, wrangling and analysis “behind the scenes” on build instead of on page load.

featured in #516


Data Fetching Patterns In Single-Page Applications

- Juntao Qiu tl;dr: “When a single-page application needs to fetch data from a remote source, it needs to do so while remaining responsive and providing feedback to the user during an often slow query. Five patterns help with this. Asynchronous State Handler wraps these queries with meta-queries for the state of the query. Parallel Data Fetching minimizes wait time. Fallback Markup specifies fallback displays in markup. Code Splitting loads only code that's needed. Prefetching gathers data before it may be needed to reduce latency when it is.”

featured in #515


Building A Weather Data Warehouse Part I: Loading A Trillion Rows Of Weather Data Into TimescaleDB

- Ali Ramadhan tl;dr: “I think it would be cool to have historical weather data from around the world to analyze for signals of climate change we’ve already had rather than think about potential future change.” Ali discusses the implementation of this analysis tool. 

featured in #510


Struggling with Snowflake Costs? Try our Cost Optimization Calculator

tl;dr: Snowflake costs skyrocket for SaaS providers because the need to deliver real-time, interactive analytics is always on. If your Snowflake bill is spiraling, try our cost optimization calculator to discover your potential savings when using a Snowflake warehouse for ad-hoc queries. (No form required)

featured in #501


Top 5 Challenges of Designing Your Data Warehouse for Multi-Tenant Analytics

tl;dr: Data warehouses are built to store large volumes of data from numerous sources, not for SaaS platforms working with multi-tenant analytics where data security is vital. This guide helps you avoid the headaches that come with that architecture mismatch, featuring solutions from our analytics experts.

featured in #499


Custom Data Models: The Key to Unlocking Powerful Embedded Analytics

- Brian Dreyer tl;dr: Without custom data models, even the most advanced analytics fail to deliver value, leading to customer churn. If you’re a SaaS leader, learn why custom data models are imperative for multi-tenant software platforms and four features of conventional data warehousing that are limiting your growth.

featured in #495


How DoorDash Used A Service Mesh To Manage Data Transfer, Reducing Hops And Cloud Spend

- Levon Stepanian, Hochuen Wong tl;dr: DoorDash has gained many benefits from its evolution from a monolithic application architecture to one based on microservices. The new architecture has reduced the time required for development, testing, and deployment, and has improved scalability and resiliency. However, DoorDash observed an uptick in data transfer costs, which prompted the engineering team to investigate alternative ways to provide the same level of service more efficiently.

featured in #483


Data Quality Score: The Next Chapter Of Data Quality At Airbnb

- Clark Wright tl;dr: "With 1.4 billion cumulative guest arrivals as of year-end 2022, Airbnb’s growth pushed us to an inflection point where diminishing data quality began to hinder our data practitioners. Weekly metric reports were difficult to land on time. Seemingly basic metrics like “Active Listings” relied on a web of upstream dependencies. Conducting meaningful data work required significant institutional knowledge to overcome hidden caveats in our data." Clark discusses the implementation of a Data Quality Score.

featured in #471