Essential Reading For Engineering Leaders

How To Get Or Create In PostgreSQL

#PostgreSQL

tl;dr: "Get or create" is a very common operation for syncing data in the database, but implementing it correctly can be trickier than you might expect. If you have ever implemented it in a real system under real-life load, you may have overlooked potential race conditions, concurrency issues and even bloat.
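To make the operation concrete, here is a minimal sketch of one common "get or create" shape: insert, skip on conflict, then read the row back. It is not taken from the article. It uses Python's built-in `sqlite3` module as a stand-in so it runs without a server, and the `tags` table and `get_or_create_tag` helper are hypothetical names. PostgreSQL accepts the same `ON CONFLICT ... DO NOTHING` clause, but its behavior under concurrent load differs from this single-connection example, which is exactly where the pitfalls the article covers come in.

```python
import sqlite3


def get_or_create_tag(conn: sqlite3.Connection, name: str) -> int:
    """Return the id of the tag called `name`, inserting it first if missing.

    Hypothetical helper for illustration; relies on a UNIQUE constraint on
    tags.name so a duplicate insert is skipped instead of raising.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO tags (name) VALUES (?) ON CONFLICT (name) DO NOTHING",
            (name,),
        )
        row = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()
    return row[0]


# In-memory database standing in for a real PostgreSQL instance (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)

first = get_or_create_tag(conn, "postgresql")   # creates the row
second = get_or_create_tag(conn, "postgresql")  # finds the existing row
print(first == second)
```

Both calls return the same id and the table ends up with a single row, because the unique constraint, not an application-level "check then insert", decides whether a row is created.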

featured in #540

The Surprising Impact Of Medium-Size Texts On PostgreSQL Performance

#PostgreSQL
#Performance

tl;dr: Haki’s article delves into the intricacies of text field sizes and their impact on PostgreSQL query performance. He classifies text fields into "small", "medium", and "large", highlighting the unexpected performance implications of medium-sized texts. Through the lens of PostgreSQL's TOAST mechanism, which compresses and/or breaks up large field values, Haki demonstrates that medium texts can sometimes lead to slower queries than even larger texts. "The main problem with medium-size texts is that they make the rows very wide," affecting performance due to increased IO.

featured in #453

The Unexpected Find That Freed 20GB Of Unused Index Space

#PostgreSQL

tl;dr: Haki’s team managed to free up more than 70GB of database storage without dropping any indexes or deleting data. They initially used conventional techniques like rebuilding indexes and tables to clear up space. However, a surprising discovery allowed them to free an additional ~20GB. They realized that PostgreSQL indexes NULL values, which led them to create a partial index that excludes these NULL values, thereby significantly reducing the index size. The article also delves into the concept of "bloat" in PostgreSQL tables and indexes, offering solutions like using the REINDEX command and the pg_repack extension to manage it. Haki suggests that partial indexes are particularly useful for fields with a high percentage of NULL values.

featured in #443

Lesser Known PostgreSQL Features

#PostgreSQL

tl;dr: 20 features including: (1) Get the number of updated and inserted rows in an Upsert. (2) Grant permissions on specific columns. (3) Match against multiple patterns, and more.

featured in #268

Some SQL Tricks of an Application DBA

#SQL

tl;dr: Haki shares "non-trivial tips" around database development with explanations, such as (1) update only what needs updating, (2) disable constraints and indexes during bulk loads, and more.

featured in #196

Stop Using datetime.now!

#Python

tl;dr: Discusses dependency injection as a design pattern, which has the main benefit of decoupling modules, functions & objects.
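As a rough illustration of the pattern (a sketch, not code from the article), the function below receives its clock as a parameter instead of calling `datetime.now` internally. The names `utc_now` and `is_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    """Default clock: the current time in UTC."""
    return datetime.now(timezone.utc)


def is_expired(
    created_at: datetime,
    ttl: timedelta,
    now: Callable[[], datetime] = utc_now,
) -> bool:
    """Return True once at least `ttl` has passed since `created_at`.

    The clock is injected through `now`, so callers (and tests) can
    supply a fixed time instead of depending on the system clock.
    """
    return now() - created_at >= ttl


# A fixed "current time" makes the result deterministic.
fixed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
created = fixed - timedelta(hours=2)

print(is_expired(created, timedelta(hours=1), now=lambda: fixed))
print(is_expired(created, timedelta(hours=3), now=lambda: fixed))
```

Because the clock is an argument, the function is decoupled from the system time: production code can rely on the default, while a test passes a fixed value and needs no patching.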

featured in #184

/Haki Benita