SELECT 1 Touches 5,583 Lines of Postgres Source Code

I ran SELECT 1 in Postgres under gdb and traced where the execution path goes through the source code.

But before that, here’s why I wanted to know:

Background

A while back, I had run into an obscure bug in Postgres that I am sure has affected no one and will affect no one due to the absurdity of its necessary conditions. I had tried to fix it and failed at the time. Recently I revisited the bug and again failed to find the root cause, despite spending many many hours deep in Postgres source code.

Though I learned a little more about Postgres in the process (e.g., how it implements re-parenting heap-allocated memory when the memory needs to live longer than the context in which it was allocated), I came away empty-handed again.

I am by no means a Postgres internals expert, but I know its general structure, individual subsystems, and how various common patterns such as polymorphism and exceptions are implemented in C. I also know the C API exposed by Postgres very well, as I have used it extensively in my contributions to TimescaleDB, LanternDB, other Lantern projects, pgvector, and pg_cron.

So, I thought I should be well-positioned to chase down a 100% reproducible bug in Postgres source that requires ~10 lines of very simple code with almost no data to reproduce. I was trying to narrow down my search with gdb and memory tracing tools, but the number of Postgres subsystems I had to understand was growing rapidly with no end in sight.

I tabled the bug for now… again. Although it was easy to reproduce the issue, the amount of context involved and the number of Postgres source code lines touched by the reproducing code was too large to be comprehensible in the time I had allotted for the task.

Since my simple-looking bug reproducer took surprisingly complex paths through Postgres source, I decided to find out what is involved in running the simplest query ever: SELECT 1.

Approach

The task is to run SELECT 1 and record all the lines of Postgres source code that get executed in the process. I wrote a gdb extension to automate this (I’m so glad I didn’t do this by hand). The core idea is simple:

class PCTracer:
    def on_stop(self, event):
        # Get current location in source code
        pc = gdb.selected_frame().pc()
        filename, line = self.get_source_location()

        # Record it
        self.trace_data.append({
            "pc": f"0x{pc:x}",
            "file": filename,
            "line": line,
            "func": frame.name()
        })

        # Continue stepping
        gdb.execute("step")

# Usage:
# (gdb) source tracer.py
# (gdb) start-trace
# (gdb) stop-trace

The extension hooks into gdb’s stop events, records the source location at each step, and continues single-stepping through the entire execution. See the full code here.

Results

Over the course of just over 2 minutes of single-stepping through the code, the execution touched 112,800 individual steps across 5,583 unique source code locations.

The breadth of the codebase involved was also surprising. The query touched 143 different source files — from the query planner (planner.c with 364 unique lines executed) to transaction management (xact.c with 226 lines), memory allocation (aset.c with 217 lines), error logging (elog.c with 193 lines), and the main Postgres loop (postgres.c with 187 lines). Even seemingly peripheral systems like string formatting (snprintf.c with 145 lines) and snapshot management (snapmgr.c with 125 lines) were involved.

Looking at function-level statistics, the trace touched 640 unique functions.1

Comparison with SQLite

I ran the same experiment on SQLite.

SQLite executed SELECT 1 in just 100 traced steps — over 1,000 times fewer than Postgres. It touched only 56 unique source lines.

While SQLite is indeed simpler and lacks several Postgres subsystems (access control, full MVCC, buffer management), I would not think it is 1000x or even 100x simpler. I don’t know for sure what accounts for the difference, but having spent some quality time in the Postgres source code, I think it is designed with ultimate generality in mind. So even though the query at hand looks extremely simple, it likely goes through all the same code paths that a significantly more general query would traverse.

What Can I Conclude About My Bug Hunt?

Seeing these numbers helps explain why my attempts to debug the PL/pgSQL cache invalidation bug through gdb tracing were futile. If a trivial SELECT 1 touches 5,583 unique code locations across 143 files, my slightly more complex bug reproducer — involving polymorphic functions, table alterations, and cache invalidation — likely touches orders of magnitude more code. I think I have a few more tricks up my sleeve to try on that bug, however. Hopefully, when I revisit it next time, it will be the last time, and I will have my so sought-after inconsequential contribution to upstream Postgres.


I’m currently in the job market for database systems and infrastructure roles. If you’re working on interesting problems in this space, I’d love to chat narekg at berkeley dot edu !


  1. Full trace results available for Postgres and SQLite ↩︎