Fixing the lack of a dbt dry-run option

2025, May 01    

Problem:

Lack of a dry-run option in dbt:

One of the frustrating limitations in dbt today is the lack of a proper dry-run mode — something that could check:

  • whether your models reference columns or tables that don’t actually exist in the database
  • whether the SQL syntax is correct
  • whether the result set contains duplicate column names

without actually materializing the models.

The existing commands that look like candidates don’t do that:

dbt compile - only checks that the referenced models exist and that the Jinja syntax is valid, then renders the models into SQL files in the target/compiled directory

dbt run --empty - runs the code against the database, creating empty views/tables, and it only works if you don’t use the ref() syntax; with refs, dbt breaks with a LIMIT 0 error

dbt build - actually runs the models and materializes them

dbt parse - only parses the project and reports parsing performance info

dbt debug - only tests the connection and the environment

This is the main limiting factor, as we can’t merge models into production without knowing whether they are going to break. There is no hope of dbt fixing it: according to the issues on their GitHub, the idea has been removed from the roadmap, so I had to figure out how to solve it myself.

The Solution:

The Hacky but Effective Solution

I built a dry-run-like pipeline using a combination of dbt compile, psql, and some Bash scripting. Here’s what it does:

Compiles the models: Runs dbt compile to render SQL for the modified models into the target/compiled directory.

Adds LIMIT 0: Appends LIMIT 0 to the end of each compiled SELECT statement, which forces Postgres to parse, plan, and validate the query (including column existence) without returning any rows.

Runs the queries: Executes the altered files against the real database.

Catches errors: Any missing columns, typos, or duplicates cause the queries to fail fast, just like a dry-run should; see the sketch below.
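Here is a minimal sketch of that pipeline. The connection variables (DB_HOST, DB_USER, DB_NAME, DB_PASSWORD) and project paths are hypothetical placeholders, and it assumes the compiled SQL has no trailing semicolon or existing LIMIT clause:

#!/usr/bin/env bash
# Minimal dry-run sketch: compile, append LIMIT 0, run against Postgres.
set -euo pipefail

# 1. Render the models into target/compiled
dbt compile

# 2. Append LIMIT 0 to each compiled file and run it against the database
find target/compiled -name '*.sql' | while read -r f; do
  echo "Checking ${f}"
  { cat "${f}"; printf '\nLIMIT 0;\n'; } \
    | PGPASSWORD="${DB_PASSWORD}" psql \
        --host="${DB_HOST}" \
        --username="${DB_USER}" \
        --dbname="${DB_NAME}" \
        --set=ON_ERROR_STOP=1 \
        --quiet \
    || { echo "FAILED: ${f}"; exit 1; }
done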

Bonus Feature

The script also:

Extracts column names from each result set.

Checks for duplicate column names (a common silent failure in joins and complicated models), as sketched below.
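One way to implement that check (a sketch, not the exact code from the script, reusing the hypothetical connection variables from the pipeline above): run the compiled query with LIMIT 0, read the header row psql prints in unaligned mode, and flag any repeated names.

# Extract the result columns of the compiled query in "${f}" and flag duplicates.
# In unaligned mode psql prints the column names as the first output line.
cols=$({ cat "${f}"; printf '\nLIMIT 0;\n'; } \
  | PGPASSWORD="${DB_PASSWORD}" psql \
      --host="${DB_HOST}" --username="${DB_USER}" --dbname="${DB_NAME}" \
      --no-align --set=ON_ERROR_STOP=1 \
  | head -n 1 | tr '|' '\n')

dupes=$(echo "${cols}" | sort | uniq -d)
if [ -n "${dupes}" ]; then
  echo "Duplicate columns in ${f}:"
  echo "${dupes}"
fi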

Caveats

You can’t check multiple interdependent models together unless their upstream relations already exist in the database: you need to run the base/bronze-level models first, then silver, then gold (see the example below).
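In practice that means materializing upstream layers before dry-running the downstream ones; assuming (hypothetically) that models are tagged by layer, the ordering could look like:

# Materialize upstream layers first so downstream refs resolve
dbt run --select tag:bronze
dbt run --select tag:silver
# now the gold-level models can be dry-run with the LIMIT 0 checks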

LIMIT 0 has to be added to the compiled files as a separate post-processing step.

The full code for adding LIMIT 0 and running the database check is here.

Alternative: Use a Dummy Database

One workaround we found was to spin up a dummy Postgres database seeded with just the schema (no data) from the actual production database. dbt then runs against that to validate the models structurally, which avoids most risks of polluting production.

The production schema from the primary database can be copied and loaded into the new database like this:

# Export the schema
PGPASSWORD=password pg_dump \
  --host=PRODUCTION_HOST \
  --username=PRODUCTION_USER \
  --dbname=PRODUCTION_DB \
  --schema=public \
  --schema-only \
  --no-owner \
  --no-privileges \
  > public_schema.sql

# Create the target database
PGPASSWORD=password createdb --host=EMPTY_DB_HOST --username=EMPTY_DB_USER EMPTY_DB

# Import the schema
PGPASSWORD=password psql \
  --host=EMPTY_DB_HOST \
  --username=EMPTY_DB_USER \
  --dbname=EMPTY_DB \
  < public_schema.sql

The caveat here: you need to keep this schema in sync with your primary instance, either by cloning the schema file from your app repo or by re-running the commands above after each migration (this needs to be automated).
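An automated refresh could look roughly like this (a sketch reusing the placeholder hosts and credentials above): drop and recreate the public schema on the dummy instance, then stream the current production schema straight into it.

# Reset the schema on the dummy instance
PGPASSWORD=password psql \
  --host=EMPTY_DB_HOST --username=EMPTY_DB_USER --dbname=EMPTY_DB \
  --command='DROP SCHEMA public CASCADE; CREATE SCHEMA public;'

# Stream the current production schema into the dummy database
PGPASSWORD=password pg_dump \
  --host=PRODUCTION_HOST --username=PRODUCTION_USER --dbname=PRODUCTION_DB \
  --schema=public --schema-only --no-owner --no-privileges \
| PGPASSWORD=password psql \
    --host=EMPTY_DB_HOST --username=EMPTY_DB_USER --dbname=EMPTY_DB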

If you’ve hit issues where dbt run fails halfway through because of a missing column, or you want to validate transformations during CI/CD without building the models, this method is for you.