/Performance

The World's Smallest PNG

- Evan Hahn tl;dr: Evan describes the world's smallest PNG file in detail and explains how PNGs work along the way. The file has four sections: (1) The PNG signature, the same for every PNG: 8 bytes. (2) The image’s metadata, which includes its dimensions: 25 bytes. (3) The image’s pixel data: 22 bytes. (4) An “end of image” marker: 12 bytes.

featured in #477
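
The four sections listed for the smallest PNG can be reproduced with a few lines of Python. This is a minimal sketch, not the post's own code: it assumes a 1x1 grayscale image with a bit depth of 1, and the helper names `chunk` and `tiny_png` are made up for illustration. The signature, metadata, and end-marker sizes follow directly from the PNG chunk layout; the pixel-data size depends on what the local zlib build emits for a two-byte input.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # 8 bytes, identical for every PNG


def chunk(kind: bytes, data: bytes) -> bytes:
    """Frame one PNG chunk: 4-byte length, 4-byte type, data, 4-byte CRC."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png() -> dict:
    """Return the four sections of a 1x1 PNG (assumed: 1-bit grayscale)."""
    # IHDR data: width, height, bit depth, color type, compression,
    # filter method, interlace method -> 13 bytes before framing.
    ihdr = chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 1, 0, 0, 0, 0))
    # One scanline: a filter-type byte (0), then one byte holding the pixel.
    raw_scanline = b"\x00\x00"
    idat = chunk(b"IDAT", zlib.compress(raw_scanline, 9))
    iend = chunk(b"IEND", b"")  # empty chunk, framing only
    return {"signature": PNG_SIGNATURE, "IHDR": ihdr, "IDAT": idat, "IEND": iend}


sections = tiny_png()
png_bytes = b"".join(sections.values())
for name, blob in sections.items():
    print(f"{name}: {len(blob)} bytes")
print(f"total: {len(png_bytes)} bytes")
```

Each chunk carries 12 bytes of framing, which is why the empty "end of image" chunk is 12 bytes and the 13 bytes of metadata become 25.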


4 Billion If Statements

- Andreas Karlsson tl;dr: "So I went to work to explore this idea of checking if a number is odd or even by only using comparisons to see how well it works in a real world scenario. Since I’m a great believer in performant code I decided to implement this in the C programming language as it’s by far the fastest language on the planet to this day..."

featured in #476


Getting Started With Web Performance

- Alistair Shepherd tl;dr: "We’ll be diving into the river of load times and exploring what web performance is, why it’s important, how to measure it and finally my click-baity “Ten Wild Web Performance Tips! You’ll be saving number 5 for later!”. If you already know your CLS’ from your FCPs, lab from field data, and are well familiar with Lighthouses (not the ones with big lights) then you can jump straight to the tips."

featured in #476


Building A Faster Hash Table For High Performance SQL Joins

- Andrei Pechkurov tl;dr: Andrei delves into QuestDB’s unique hash table, FastMap, designed to enhance SQL execution for JOIN and GROUP BY operations. FastMap employs open addressing and linear probing, optimized for high performance in database environments. It supports variable-size keys and fixed-size values, facilitating efficient data handling and updates. Notably, FastMap operates off-heap, reducing garbage collection pressure to improve performance.

featured in #475
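
Open addressing with linear probing, as mentioned in the FastMap summary, can be sketched in a few lines. This is a generic illustration of the technique only, not QuestDB's FastMap: it lives on the Python heap, the class name `LinearProbingMap` is hypothetical, and it assumes keys are hashable and never `None` (`None` marks an empty slot).

```python
class LinearProbingMap:
    """Toy hash map: one flat slot array, collisions resolved by probing forward."""

    def __init__(self, capacity: int = 8):
        self._keys = [None] * capacity
        self._values = [None] * capacity
        self._size = 0

    def _slot(self, key) -> int:
        """Return the index holding `key`, or the first empty slot on its probe path."""
        cap = len(self._keys)
        i = hash(key) % cap
        # Linear probing: step to the next slot until the key or a gap is found.
        while self._keys[i] is not None and self._keys[i] != key:
            i = (i + 1) % cap
        return i

    def put(self, key, value) -> None:
        # Keep the load factor at or below 0.5 so a probe always finds a gap.
        if (self._size + 1) * 2 > len(self._keys):
            self._grow()
        i = self._slot(key)
        if self._keys[i] is None:
            self._keys[i] = key
            self._size += 1
        self._values[i] = value

    def get(self, key, default=None):
        i = self._slot(key)
        return self._values[i] if self._keys[i] is not None else default

    def _grow(self) -> None:
        """Double the slot array and reinsert every live entry."""
        live = [(k, v) for k, v in zip(self._keys, self._values) if k is not None]
        self._keys = [None] * (len(self._keys) * 2)
        self._values = [None] * len(self._keys)
        self._size = 0
        for k, v in live:
            self.put(k, v)


# GROUP BY-style usage: count rows per key.
counts = LinearProbingMap()
for symbol in ["AAPL", "MSFT", "AAPL", "GOOG", "AAPL"]:
    counts.put(symbol, counts.get(symbol, 0) + 1)
print(counts.get("AAPL"), counts.get("MSFT"), counts.get("TSLA"))
```

Keeping keys and values in flat arrays means a lookup touches neighboring slots rather than chasing pointers, which is the usual argument for linear probing.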


Parsing 8-Bit Integers Quickly

- Daniel Lemire tl;dr: Daniel discusses efficient methods for parsing 8-bit integers from ASCII / UTF-8 strings. The post begins by describing a basic C function for parsing 8-bit integers. This function checks if each character in the string is a digit and then calculates the integer value. It works well for strings with predictable lengths but can be inefficient when the length varies due to branch misprediction in the processor. The post also discusses an alternative version, which is presented as potentially faster under some compilers.

featured in #470
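
The basic approach described for parsing 8-bit integers (check that each character is a digit, then accumulate the value) looks roughly like this. The post works in C; this Python sketch only mirrors the logic as summarized, says nothing about speed, and the function name `parse_uint8` is an assumption.

```python
from typing import Optional


def parse_uint8(text: bytes) -> Optional[int]:
    """Parse 1 to 3 ASCII digits into an integer in 0..255, or return None."""
    # An 8-bit value needs at most three decimal digits.
    if not 1 <= len(text) <= 3:
        return None
    value = 0
    for byte in text:
        digit = byte - ord("0")
        if digit < 0 or digit > 9:  # not an ASCII digit
            return None
        value = value * 10 + digit
    # Three digits can reach 999, so reject anything past the 8-bit range.
    return value if value <= 255 else None


for sample in [b"0", b"42", b"255", b"256", b"12a", b""]:
    print(sample, parse_uint8(sample))
```

The loop runs once per input character, so the number of iterations depends on the string length; that data-dependent branching is what the summary points to as the source of mispredictions.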


Analyzing Data 170,000x Faster With Python

- Sidney Radcliffe tl;dr: The article elaborates on the process of optimizing a function in Python. Sidney begins by explaining the initial challenges faced, such as slow performance and high memory usage. To address these issues, the article suggests using libraries like NumPy and Cython. Throughout the piece, the author provides code snippets to showcase the optimization steps. By the end, performance comparisons are presented, highlighting the significant improvements achieved through the optimization process.

featured in #461


Maxjourney: Pushing Discord's Limits With A Million+ Online Users In A Single Server

- Yuliy Pisetsky tl;dr: "With that growth, those servers started slowing down and creeping ever closer to their throughput limits. As that’s happened, we’ve continued to find many improvements to keep making them faster and pushing the limits out further. In this post we’ll talk about some of the ways we’ve scaled individual Discord servers from tens of thousands of concurrent users to approaching two million concurrent users in the past few years."

featured in #460


The Surprising Impact Of Medium-Size Texts On PostgreSQL Performance

- Haki Benita tl;dr: Haki’s article delves into the intricacies of text field sizes and their impact on PostgreSQL query performance. He classifies text fields into "small", "medium", and "large", highlighting the unexpected performance implications of medium-sized texts. Through the lens of PostgreSQL's TOAST mechanism, which compresses and/or breaks up large field values, Haki demonstrates that medium texts can sometimes lead to slower queries than even larger texts. "The main problem with medium-size texts is that they make the rows very wide," affecting performance due to increased IO.

featured in #453


How DoorDash Fosters Meaningful Engineering Career Development

tl;dr: “In Q2 2023, we revisited our performance expectations for all engineers at DoorDash. We started by gathering a group of engineers to discuss which existing expectations were still relevant, and which ones were no longer serving us. We defined what we see as the traits of our most successful engineers at each level based on our three pillars: (1) Business Outcome: how engineers deliver impact based on our direction and goals. (2) People: how well we collaborate as a team and invest in each other’s development and success. (3) Engineering Excellence: the quality of our products and systems, how fast we can move, and how efficiently our systems use resources.” The team shares these performance expectations publicly, in this post.

featured in #451


Selective Column Reduction For DataLake Storage Cost Efficiency

tl;dr: “As Uber continues to grow, the sheer volume of data we handle and the associated demands for data access have multiplied. This rapid expansion in data size has given rise to escalating costs in terms of storage and compute resources. Consequently, we have encountered various challenges such as increased hardware requirements, heightened resource consumption, and potential performance issues like out-of-memory errors and prolonged garbage collection pauses.”

featured in #450