/Scale

Database Sharding Explained

- Mahdi Yusuf tl;dr: Mahdi discusses why we shard data stores, when to use sharding, how it can be set up, and the options you have before resorting to it.

featured in #401


How Discord Stores Trillions Of Messages

- Bo Ingram tl;dr: “Our Cassandra cluster exhibited serious performance issues that required increasing amounts of effort to just maintain, not improve.” Bo discusses the troubles with Cassandra and the migration to ScyllaDB, a Cassandra-compatible database written in C++.

featured in #396


From Postgres To Amazon DynamoDB

tl;dr: From the engineering team at Instacart, who must efficiently store, manage, and query hundreds of terabytes of data. The primary datastore of choice was Postgres, but once specific use cases began to outpace the largest Amazon EC2 instance size AWS offers, they chose Amazon DynamoDB. Here they discuss migrating existing tables from Postgres to DynamoDB.

featured in #394


What's Identity-Native Infrastructure Access?

tl;dr: Unlock all Teleport Connect sessions to learn about infrastructure access from DoorDash, Dropbox, Discord, Vonage, and others when you RSVP for the Feb 9th event.

featured in #386


Scaling PostgresML To 1 Million Requests Per Second

- Lev Kokotov tl;dr: "In this post, we'll discuss how we horizontally scale PostgresML to achieve more than 1 million XGBoost predictions per second on commodity hardware."

featured in #367


Atomic Commitment: The Unscalability Protocol

- Marc Brooker tl;dr: Marc describes the classic CS problem of atomic commitment. "The classic solution to this classic problem is Two-phase commit, maybe the most famous of all distributed protocols. There's a lot we could say about atomic commitment, or even just about two-phase commit. In this post, I'm going to focus on just one aspect: Atomic Commitment has weird scaling behavior."

featured in #360


9 Enablement Practices To Achieve DevOps At Enterprise Scale

tl;dr: Christian Oestreich, a senior software engineering leader with experience at multiple Fortune 500 companies, shares how to adopt a well-planned, metrics-driven strategy that yields better-quality code and lowers support costs.

featured in #353


Supercharging A/B Testing At Uber

tl;dr: "While the statistical underpinnings of A/B testing are a century old, building a correct and reliable A/B testing platform and culture at a large scale is still a massive challenge... Uber went through a similar journey and this blog post describes why and how we rebuilt the A/B testing platform we had at Uber."

featured in #337


What Happens When You Swipe A Credit Card?

- Alex Xu tl;dr: "Visa, Mastercard, and American Express act as card networks for clearing and settling funds. The card acquiring bank and the card issuing bank can be – and often are – different. If banks were to settle transactions one by one without an intermediary, each bank would have to settle the transactions with all the other banks. This is quite inefficient."

featured in #335


Data Teams Are Getting Larger, Faster

tl;dr: "But something happens when a data team grows past 10 people. You no longer know if the data you use is reliable, the lineage is too large to make sense of and end-users start complaining about data issues every other day." Mikkel discusses how to deal with scaling teams.

featured in #334