SELECT 1 Touches 5,583 Lines of Postgres Source Code

Mon, Nov 10, 2025

I ran SELECT 1 in Postgres under gdb and traced the execution path through the source code.

But before that, a few words on why I wanted to know.

Background

A while back, I had run into an obscure bug in Postgres that I am sure has affected no one and will affect no one, due to the absurdity of its necessary conditions. I had tried to fix it and failed at the time. Recently I revisited the bug and again failed to find the root cause, despite spending many many hours deep in Postgres source code. Though I learned a little more about Postgres in the process (e.g., how it implements re-parenting heap-allocated memory when the memory needs to live longer than the context in which it was allocated), I came away empty-handed.

I am by no means a Postgres internals expert, but I know its general structure, individual subsystems, and how various common patterns such as polymorphism and exceptions are implemented in C. I also know the C API exposed by Postgres very well, as I have used it extensively in my contributions to TimescaleDB, LanternDB, other Lantern projects, pgvector, and pg_cron.

So, I thought I should be well-positioned to chase down a 100% reproducible bug in Postgres source that requires ~10 lines of very simple code with almost no data to reproduce. I was trying to narrow down my search with gdb and memory tracing tools, but the number of Postgres subsystems I had to understand was growing rapidly with no end in sight.

I tabled the bug for now… again. Although it was easy to reproduce the issue, the amount of context involved and the number of Postgres source code lines touched by the reproducing code was too large to be comprehensible in the time I had allotted for the task.

Since my simple-looking bug reproducer took surprisingly complex paths through Postgres source, I decided to find out what is involved in running the simplest query ever: SELECT 1.

Approach

The task is to run SELECT 1 and record all the lines of Postgres source code that get executed in the process. I used used Postgres 18+ at commit e510378.

Since the goal is to find out how much context a human developer would need to understand everything involved when running the query, I am building Postgres in debug mode with optimizations disabled. In release mode a lot of the lines would be optimized away, but typical human developers do not run optimization passes in their heads, so compiling without optimizations will get us closer to the goal.

I wrote a gdb extension to automate this (I’m so glad I didn’t do this by hand). The core idea is simple:

class PCTracer:
    def on_stop(self, event):
        # Get current location in source code
        pc = gdb.selected_frame().pc()
        filename, line = self.get_source_location()

        # Record it
        self.trace_data.append({
            "pc": f"0x{pc:x}",
            "file": filename,
            "line": line,
            "func": frame.name()
        })

        # Continue stepping
        gdb.execute("step")

# Usage:
# (gdb) source tracer.py
# (gdb) start-trace
# (gdb) stop-trace

The extension hooks into gdb’s stop events, records the source location at each step, and continues single-stepping through the entire execution. See the full code here.

Results

Over the course of just over 2 minutes of single-stepping through the code, the execution touched 112,800 individual steps across 5,583 unique source code locations. This DOES NOT include any third-party library, system library calls or autogenerated code. All 5,583 lines are in the postgres source repository.

Note that this is not a result of unrelated library loads and background tasks. I ensured the DB was empty, auto-vacuum was off (just for good measure), and I had run SELECT 1 several times in the same process before recording. All these lines are involved in a typical execution of SELECT 1.

The breadth of the codebase involved was also surprising. The query touched 143 different source files — from the query planner (planner.c with 364 unique lines executed) to transaction management (xact.c with 226 lines), memory allocation (aset.c with 217 lines), error logging (elog.c with 193 lines), and the main Postgres loop (postgres.c with 187 lines). Even seemingly peripheral systems such as snapshot management (snapmgr.c with 125 lines) were involved.

Looking at function-level statistics, the trace touched 640 unique functions.¹

Comparison with SQLite

I ran the same experiment on SQLite.

SQLite executed SELECT 1 in just 100 traced steps — over 1,000 times fewer than Postgres. It touched only 56 unique source lines.

While SQLite is indeed simpler and lacks several Postgres subsystems (access control, full MVCC, buffer management), I would not think it is 1000x or even 100x simpler. I don’t know for sure what accounts for the difference, but having spent some quality time in the Postgres source code, I think it is designed with ultimate generality in mind. So even though the query at hand looks extremely simple, it likely goes through all the same code paths that a significantly more general query would traverse.

I suspect SQLite’s bytecode approach to query execution greatly reduces the number of lines touched in this SQLite experiment, making the Postgres <-> SQLite comparison quite unfair ². SQLite bytecode effectively hides a lot of the code as bytecode data, and my gdb stepping approach does not count these bytecode executions as additional lines. The bytecode does contribute to the complexity of the query, however, so I am sure understanding everything involved in SELECT 1 is not 100x easier than Postgres.

What Can I Conclude About My Bug Hunt?

Seeing these numbers helps explain why my attempts to debug the PL/pgSQL cache invalidation bug through gdb tracing were futile. If a trivial SELECT 1 touches 5,583 unique code locations across 143 files, my slightly more complex bug reproducer — involving polymorphic functions, table alterations, and cache invalidation — likely touches orders of magnitude more code.

I have a few more tricks up my sleeve to try on that bug. Hopefully, when I revisit it next time, it will be the last time, and I will have my so sought-after inconsequential contribution to upstream Postgres.

I detailed the steps to reproduce the bug in my previous post. Please reach out (narekg at berkeley dot edu) if you have any suggestions for finding the root cause of the bug.

I’m currently in the job market for database systems and infrastructure roles. If you’re working on interesting problems in this space, I’d love to chat narekg at berkeley dot edu !

Full trace results available for Postgres and SQLite ↩︎
If I was able to write a C interpreter in e.g., 4k lines of code, and ran Postgres in it, it would look like I touched <4k lines as actual postgres source woul dbe data. SQLite source code is in a way such an interpreter. ↩︎