How We Built A Streaming SQL Engine
- Maor Kern tl;dr: “So you probably wake up every morning asking yourself three of life’s most pertinent questions- how do I build a streaming SQL engine, what even is a streaming SQL engine, and can our Lord drop tables owned by another user. I too found myself asking these questions, sometimes even dreaming about them- often in the form of various SQL operators pointing and laughing at my incompetence as I beg them to answer me.” A streaming SQL engine keeps queries’ results up to date without ever having to recalculate them and Maor discusses topic like data ingestion, query optimization, and stream processing.featured in #458
featured in #450
Is 20M Of Rows Still A Valid Soft Limit Of MySQL Table In 2023?
- Yisheng Gong tl;dr: “There’s rumor around the internet that we should avoid having > 20M rows in a single MySQL table. Otherwise, the table’s performance will be downgraded, you will find SQL query much slower than usual when it’s above the soft limit. These judgements were made on HDD many years ago. I’m wondering if it’s still true for MySQL on SSD in 2023, and if true, why is that?”featured in #417
Online Gradient Descent Written In SQL
- Max Halford tl;dr: Max implements a ML algorithm within a relational database, using SQL. Some databases allow doing inference with an already trained model. Training the model in the database would remove altogether the need for a separate inference / training service. Max attempts to do this with the Online Gradient Descent algorithm.featured in #398
SQL Should Be Your Default Choice For Data Engineering Pipelines
- Robin Linacre tl;dr: "SQL should be the first option considered for new data engineering work. It’s robust, fast, future-proof and testable. With a bit of care, it’s clear and readable. A new SQL engine - DuckDB - makes SQL competitive with other high performance dataframe libraries, making SQL a good candidate for data of all sizes."featured in #387
Writing A Python SQL Engine From Scratch
- Toby Mao tl;dr: "This post will cover why I went through the effort of creating a Python SQL engine and how a simple query goes from a string to actually transforming data." Toby covers tokenizing, parsing, optimizing, planning and executing.featured in #381
Postgres Auditing In 150 lines Of SQL
- Oliver Rice tl;dr: "Data auditing is a system that tracks changes to tables' contents over time. PostgreSQL has a robust set of features which we can leverage to create a generic auditing solution in 150 lines of SQL."featured in #303
Working With SQL Server Identity Columns
- Greg Larsen tl;dr: "SQL Server identity columns are easy to add to a table, but you must understand how they work to use them effectively. In this article, Greg Larsen explains the nuances of SQL Server identity columns."featured in #260
featured in #245
featured in #236