/Migration

How To Actually Migrate Complex Systems In Infrastructure

- Kyle Cascade tl;dr: “Whenever doing a big migration from system to system2, front-load as much work as possible into building migration scaffolding to system2. That way you can migrate to system2 as seamlessly as possible, without a huge manual migration effort, one piece of functionality at a time. Then you can undo the scaffolding from system. This “Strangler Fig Pattern” is the key to making large migrations successful without putting the burden on your customers.”

featured in #574
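
A minimal sketch of the Strangler Fig idea described above, assuming a single routing facade sits in front of both systems. The names `StranglerRouter`, `legacy_system`, and `new_system` are hypothetical and not taken from the article; real scaffolding would also have to handle data sync, auth, and observability.

```python
from typing import Callable, Set

# A handler takes a request payload and returns a response (hypothetical shape).
Handler = Callable[[str], str]


def legacy_system(payload: str) -> str:
    """Stand-in for the old system ("system" in the tl;dr)."""
    return "system:" + payload


def new_system(payload: str) -> str:
    """Stand-in for the replacement ("system2" in the tl;dr)."""
    return "system2:" + payload


class StranglerRouter:
    """Routes each feature to the old or new system, one feature at a time."""

    def __init__(self, legacy: Handler, replacement: Handler) -> None:
        self._legacy = legacy
        self._replacement = replacement
        self._migrated: Set[str] = set()

    def migrate(self, feature: str) -> None:
        # Flip a single piece of functionality over to the new system.
        self._migrated.add(feature)

    def rollback(self, feature: str) -> None:
        # Send the feature back to the old system if the new path misbehaves.
        self._migrated.discard(feature)

    def handle(self, feature: str, payload: str) -> str:
        target = self._replacement if feature in self._migrated else self._legacy
        return target(payload)
```

Callers only ever talk to the router, so moving a feature is a call to `migrate("billing")` rather than a change in every client. Once every feature is migrated, the router and the legacy handler can be removed.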


Migrating Billions Of Records: Moving Our Active DNS Database While It’s In Use

- Alex Fattouche, Corey Horton tl;dr: “When initially measured in 2022, DNS data took up approximately 40% of the storage capacity in Cloudflare’s main database cluster (cfdb). This database cluster, consisting of a primary system and multiple replicas, is responsible for storing DNS zones, propagated to our data centers in over 330 cities via our distributed KV store.”

featured in #565


The Perils Of Migrating A Large-Scale Service At Uber

tl;dr: Details of Uber's journey in migrating its invoice generation service, highlighting challenges and lessons learned. The initial service was written in Python and faced scalability issues due to early design choices, accumulated technical debt and a legacy software stack. The new service was developed in Go, chosen for its speed and flexibility. The migration strategy adopted was component-based, focusing on individual system components rather than entire flows. The migration led to a 97% reduction in computing requirements and enhanced self-serve capabilities, reducing engineers' support work from 60% to under 20%.

featured in #442


Migrating Critical Traffic At Scale With No Downtime

tl;dr: From the team at Netflix: “when undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. This blog series will examine the tools, techniques, and strategies we utilized to achieve this goal.”

featured in #415


How Discord Stores Trillions Of Messages

- Bo Ingram tl;dr: “Our Cassandra cluster exhibited serious performance issues that required increasing amounts of effort to just maintain, not improve.” Bo discusses the troubles with Cassandra and the migration to ScyllaDB, a Cassandra-compatible database written in C++.

featured in #396


From Postgres To Amazon DynamoDB

tl;dr: From the engineering team at Instacart, who have to efficiently store and query hundreds of terabytes of data. Postgres was the primary datastore of choice, but once specific use cases began to outpace the largest Amazon EC2 instance size AWS offers, they chose Amazon DynamoDB. Here they discuss migrating existing tables from Postgres to DynamoDB.

featured in #394


Strategies And Tools For Performing Migrations On Platform

- Mariana Ardoino, Raul Herbster tl;dr: The authors present the following challenges, or scenarios, faced during the project: (1) Defining the scope of the project. (2) Scaling up. (3) Competing priorities. Each scenario comes with symptoms (“when”), what to avoid when facing the situation (“Don’t”), and what the authors suggest doing (“Do”).

featured in #371


Real-World Engineering Challenges #6: Migrations

- Gergely Orosz tl;dr: Gergely covers examples of companies that have carried out large-scale migrations, including: (1) Box: a zero-downtime data migration using a 6-step plan. (2) Pinterest: data migration using double writes. (3) LinkedIn: navigating the migration chaos when 100+ engineers were needed to write code and 600+ use cases needed to be moved. And more.

featured in #359


How We Reduced Our Annual Server Costs By 80% — From $1M To $200k — By Moving Away From AWS

- Trey Huffine tl;dr: Prerender saved $800k a year by removing their reliance on AWS and building in-house infrastructure to handle traffic and cached data. This post discusses the 3-phase approach used for the migration: testing, technical set-up, implementation and scaling.

featured in #356


Changing Tires At 100mph: A Guide To Zero Downtime Migrations

- Kiran Rao tl;dr: (1) Create the new empty table. (2) Write to both the old and new tables. (3) Copy data (in chunks) from old to new. (4) Validate consistency. (5) Switch reads to the new table. (6) Stop writes to the old table. (7) Clean up the old table. This guide goes through the step-by-step process of migrating tables in PostgreSQL.

featured in #315
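
The seven steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the guide's own code: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for PostgreSQL, and the table names (`users_old`, `users_new`), the helpers, and the chunk size are all hypothetical.

```python
import sqlite3


def save_user(conn, user_id, email, *, to_old=True, to_new=True):
    """Application write path; the flags model dual writes (steps 2 and 6)."""
    if to_old:
        conn.execute("INSERT OR REPLACE INTO users_old (id, email) VALUES (?, ?)",
                     (user_id, email))
    if to_new:
        conn.execute("INSERT OR REPLACE INTO users_new (id, email) VALUES (?, ?)",
                     (user_id, email))
    conn.commit()


def backfill(conn, chunk_size):
    """Step 3: copy rows in id-ordered chunks, never overwriting newer dual writes."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, email FROM users_old WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            break
        # OR IGNORE keeps any row the dual-write path already placed in users_new.
        conn.executemany(
            "INSERT OR IGNORE INTO users_new (id, email) VALUES (?, ?)", rows)
        conn.commit()
        last_id = rows[-1][0]


def tables_match(conn):
    """Step 4: naive full comparison; a real system would compare checksums or samples."""
    old = conn.execute("SELECT id, email FROM users_old ORDER BY id").fetchall()
    new = conn.execute("SELECT id, email FROM users_new ORDER BY id").fetchall()
    return old == new


def migrate_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users_old (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO users_old VALUES (?, ?)",
                     [(i, f"u{i}@example.com") for i in range(1, 6)])

    # (1) Create the new empty table.
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT)")
    # (2) Writes now go to both tables while the migration is in flight.
    save_user(conn, 6, "u6@example.com")
    save_user(conn, 2, "changed@example.com")
    # (3) Copy existing data in chunks.
    backfill(conn, chunk_size=2)
    # (4) Validate consistency before switching anything.
    consistent = tables_match(conn)
    # (5) Reads switch to users_new (here: every read below uses it).
    # (6) Stop writing to the old table.
    save_user(conn, 7, "u7@example.com", to_old=False)
    # (7) Clean up the old table.
    conn.execute("DROP TABLE users_old")
    rows = conn.execute("SELECT id, email FROM users_new ORDER BY id").fetchall()
    return consistent, rows
```

Calling `migrate_demo()` returns whether the two tables matched at step 4 and the final contents of the new table. In production each step would typically be a separate deploy, with time to observe between them.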