/Performance

Response Time Is The System Talking

- Christoffer Stjernlöf tl;dr: To determine an appropriate request rate for HTTP scraping, Christoffer introduces the concept of system utilization. Rather than aiming for 100% utilization, he recommends a target below 40%. "Response time is a function of utilisation," so it can be used to gauge system load: by comparing the baseline response time of an idle system with the response time under load, one can estimate utilization. Christoffer provides a mathematical breakdown of this relationship and emphasizes its applicability in real-world scenarios.

featured in #447
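
As a rough illustration of the response-time item above, the sketch below estimates utilization from a baseline and a loaded response time. It assumes a simple single-server queueing relation, R = S / (1 - ρ), which the summary does not spell out; the function names and the 0.4 default threshold (taken from the "below 40%" target) are illustrative choices, not the article's code.

```python
def estimate_utilization(baseline_s: float, loaded_s: float) -> float:
    """Estimate utilization from idle (baseline) and loaded response times.

    Assumption: response time follows R = S / (1 - rho), as in a simple
    single-server queue, so rho = 1 - S / R.
    """
    if baseline_s <= 0 or loaded_s <= 0:
        raise ValueError("response times must be positive")
    # A loaded response faster than the baseline is measurement noise;
    # clamp the estimate at zero rather than report negative utilization.
    return max(0.0, 1.0 - baseline_s / loaded_s)


def should_slow_down(baseline_s: float, loaded_s: float, target: float = 0.4) -> bool:
    """Return True when estimated utilization exceeds the target."""
    return estimate_utilization(baseline_s, loaded_s) > target
```

A scraper could call `should_slow_down` after each batch of requests and lengthen its delay whenever it returns True. Under this model, a response time of twice the baseline corresponds to roughly 50% utilization.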


This Is How Quora Shards MySQL To Handle 13+ Terabytes

tl;dr: With data storage requirements in the tens of terabytes and 100,000 queries per second, Quora chose MySQL for its improved read performance. To manage rapid data growth and a high volume of write queries, Quora implemented both vertical and horizontal sharding. Vertical sharding moves different tables to different servers, improving write scalability. Horizontal sharding, on the other hand, splits a large table into multiple smaller tables. Quora opted to build its own sharding solution instead of using a third-party service, for low latency and easy reuse of existing logic.

featured in #445
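
The difference between the two sharding styles in the item above can be sketched as a small routing function. Everything here is hypothetical: the table names, host names, and the hash-based key mapping are assumptions made for illustration, and the summary does not say how Quora actually maps rows to shards.

```python
import hashlib
from typing import Optional

# Hypothetical layout: table and host names are illustrative only.
# Vertical sharding: each whole table lives on its own server.
VERTICAL_SHARDS = {"users": "db-users", "answers": "db-answers"}
# Horizontal sharding: one large table is split across several servers.
HORIZONTAL_SHARDS = {
    "votes": ["db-votes-0", "db-votes-1", "db-votes-2", "db-votes-3"],
}


def route(table: str, shard_key: Optional[int] = None) -> str:
    """Return the host that should serve a query for this table and key."""
    if table in HORIZONTAL_SHARDS:
        if shard_key is None:
            raise ValueError(f"table {table!r} needs a shard key")
        hosts = HORIZONTAL_SHARDS[table]
        # Assumption: a stable hash of the key picks the shard.
        digest = hashlib.sha256(str(shard_key).encode()).digest()
        return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]
    # Vertically sharded tables need no key: the table name is enough.
    return VERTICAL_SHARDS[table]
```

Here `route("users")` always resolves to a single host, while `route("votes", 42)` picks one of four hosts and returns the same host every time for the same key.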


Keeping Figma Fast

- Slava Kim, Laurel Woods tl;dr: This article traces how Figma evolved its performance testing system as the company scaled. Initially, Figma used a single MacBook for all its in-house performance testing. However, as the codebase grew more complex and the team expanded, this approach became unsustainable. The article outlines the challenges Figma faced, such as the need for more granular performance tests and the limitations of running tests on a single piece of hardware. To address these issues, Figma adopted a two-system approach: a cloud-based system for mass testing and a hardware system for more targeted, precise tests. Both systems are connected by the same Continuous Integration system and aim to catch performance regressions early in the development cycle.

featured in #444


How We Reduced The Size Of Our JavaScript Bundles By 33%

- Umair Nadeem, Rich Hong tl;dr: Dropbox reduced its JavaScript bundles by 33% by replacing its outdated module bundler with Rollup. The existing system led to large bundle sizes and performance issues. Rollup's features like automatic code-splitting and tree shaking optimized the bundling process. Despite challenges in implementation, the transition to Rollup significantly improved performance.

featured in #441


Optimizing Speed On eBay.com

- Addy Osmani tl;dr: Optimizations include: (1) Search Results Optimization: By sending the first 10 item images along with the header, eBay ensures quicker downloads, reducing the download start time for search result images. (2) Edge Caching for autosuggestion data: suggestions in the search box are cached and served from a CDN, reducing network latency and server processing time. (3) Edge caching for unrecognized homepage users: Content for unrecognized users is cached on eBay's edge network, allowing first-time users to receive content from a nearby server, reducing network latency and server processing time.

featured in #439


How We Improved Our Serverless API 300x

- Daniel Bot tl;dr: ePilot's team improved their serverless API response time by 300x using AWS DynamoDB. They initially faced issues due to a simplistic table design without understanding search patterns and a 10x increase in record data. By redesigning the table and storing more data, they optimized queries, making them 10-20 times faster. The experience emphasized the importance of understanding search patterns and proper design in DynamoDB.

featured in #438


Improving Performance With HTTP Streaming

tl;dr: “You may have heard a joke that the internet is a series of tubes. In this blog post, we’re going to talk about how we get a cool, refreshing stream of Airbnb.com bytes into your browser as quickly as possible using HTTP Streaming.”

featured in #421


300ms Faster: Reducing Wikipedia's Total Blocking Time

- Nicholas Ray tl;dr: “Wikipedia’s mobile site suffered from a piece of JS that could take over 600ms to execute during page load on low-end devices, effectively blocking user interactions.” Nicholas walks through the steps taken to reduce the execution time of this task by about 50%.

featured in #419


How Much Memory Do You Need To Run 1 Million Concurrent Tasks?

- Piotr Kołaczkowski tl;dr: “In this blog post, I delve into the comparison of memory consumption between asynchronous and multi-threaded programming across popular languages like Rust, Go, Java, C#, Python, Node.js and Elixir.”

featured in #418


Uptime Guarantees — A Pragmatic Perspective

- Itzy Sabo tl;dr: “The cost of building and operating a system in a way that guarantees 99.99% uptime is several times as expensive as 99.5%. This is in terms of system complexity, the number of engineers required, their specialisations, experience levels, and corresponding salaries, as well as significantly increased operational costs and arrangements.”

featured in #412