Running dbt-core with PostgreSQL: Complete Setup Guide
Step-by-step guide to setting up dbt-core with PostgreSQL — user creation, schema permissions, profiles.yml configuration, and common issues.
PostgreSQL is one of the most popular databases to use with dbt-core. It's open-source, runs anywhere, and most teams already have one. Whether you're working with a local dev instance or a managed service like Amazon RDS or Google Cloud SQL, the setup is straightforward once you know the steps.
This guide walks through everything: installing the adapter, creating a dedicated dbt user with the right permissions, configuring profiles.yml, handling SSL for remote connections, and fixing the issues you'll almost certainly run into.
Prerequisites
You'll need two things before starting:
- A running PostgreSQL instance (local, Docker, RDS, Cloud SQL, or any other hosted option). Version 12 or higher is recommended.
- Python 3.9+ and pip installed on your machine.
If you don't have PostgreSQL yet, the quickest way to get one running locally is with Docker:
```shell
docker run -d \
  --name postgres-dbt \
  -e POSTGRES_USER=admin \
  -e POSTGRES_PASSWORD=admin123 \
  -e POSTGRES_DB=analytics \
  -p 5432:5432 \
  postgres:16
```
That gives you a PostgreSQL 16 instance on localhost:5432 with a database called analytics.
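The container can take a few seconds to start accepting connections after `docker run` returns. If you script your setup, a small retry helper lets you block until the database is ready (a sketch — `wait_for` is a helper name invented for this guide, not a standard tool):

```shell
# wait_for TRIES CMD...: retry CMD once per second until it succeeds
# or TRIES attempts are used up.
wait_for() {
  local tries=$1; shift
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    if [ "$tries" -gt 0 ]; then sleep 1; fi
  done
  return 1
}

# Example: block until the container started above is ready.
# wait_for 30 docker exec postgres-dbt pg_isready -U admin -d analytics
```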
Installing dbt-postgres
The PostgreSQL adapter for dbt-core is a separate package. Install it with pip:
pip install dbt-postgres
This installs both dbt-core and the PostgreSQL adapter. You don't need to install dbt-core separately.
Verify the installation:
dbt --version
You should see something like:
```
Core:
  - installed: 1.9.x
  - latest: 1.9.x

Plugins:
  - postgres: 1.9.x
```
If you prefer isolated environments (and you should), use a virtual environment:
```shell
python -m venv dbt-env
source dbt-env/bin/activate
pip install dbt-postgres
```
Database and User Setup
You can run dbt with your admin or superuser account, but you shouldn't. Create a dedicated user with only the permissions dbt needs. This is cleaner, safer, and closer to what you'll want in production.
Connect to your PostgreSQL instance as a superuser (or your admin account):
psql -h localhost -U admin -d analytics
Then run the following SQL to set up the dbt user, schema, and permissions:
```sql
-- Create a dedicated dbt user
CREATE USER dbt_user WITH PASSWORD 'a_strong_password_here';

-- Create the schema dbt will write to
CREATE SCHEMA IF NOT EXISTS dbt_prod;

-- Grant usage and creation rights on the schema
GRANT USAGE ON SCHEMA dbt_prod TO dbt_user;
GRANT CREATE ON SCHEMA dbt_prod TO dbt_user;

-- Grant access to read from source tables (adjust schema as needed)
GRANT USAGE ON SCHEMA public TO dbt_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO dbt_user;

-- Make sure future tables in public are also readable
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO dbt_user;
```
A few things to note here:
- **`dbt_prod`** is the schema where dbt will create models (tables and views). You can call it whatever you want -- `analytics`, `transform`, `dbt` -- just be consistent.
- The `public` schema grants assume your raw/source data lives there. Adjust if your sources are in a different schema.
- `ALTER DEFAULT PRIVILEGES` ensures that any new tables added to `public` in the future will automatically be readable by the dbt user.
If dbt also needs to create schemas on the fly (for example, for custom schema names), grant the CREATE privilege on the database:
GRANT CREATE ON DATABASE analytics TO dbt_user;
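Before wiring up dbt, it's worth confirming the grants took effect. PostgreSQL's built-in privilege-inspection functions make this a one-liner from any `psql` session (a quick sanity check, assuming the user, schema, and database names used above):

```sql
SELECT
  has_schema_privilege('dbt_user', 'dbt_prod', 'CREATE')    AS can_create_in_dbt_prod,
  has_schema_privilege('dbt_user', 'public', 'USAGE')       AS can_read_public,
  has_database_privilege('dbt_user', 'analytics', 'CREATE') AS can_create_schemas;
```

Each column should come back `t`; `can_create_schemas` will be `t` only if you also ran the optional `GRANT CREATE ON DATABASE` statement.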
profiles.yml Configuration
dbt uses a profiles.yml file to know how to connect to your database. By default, it lives at ~/.dbt/profiles.yml.
Local PostgreSQL
For a local instance (no SSL, standard port):
```yaml
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      password: "a_strong_password_here"
      dbname: analytics
      schema: dbt_dev
      threads: 4
```
A few notes:
- **target** controls which output is active. You can define multiple (dev, staging, prod) and switch between them.
- **schema** is the default schema dbt will write models to. For local development, use something like `dbt_dev` or `dbt_<your_name>` to keep things separate.
- **threads** controls parallelism. 4 is a good starting point. Increase it if you have many independent models and your database can handle the load.
Remote PostgreSQL
For a remote instance (e.g., RDS, Cloud SQL, or a VPS):
```yaml
my_project:
  target: prod
  outputs:
    prod:
      type: postgres
      host: my-db-instance.abc123.us-east-1.rds.amazonaws.com
      port: 5432
      user: dbt_user
      password: "a_strong_password_here"
      dbname: analytics
      schema: dbt_prod
      threads: 4
      keepalives_idle: 0
      connect_timeout: 10
      sslmode: require
```
The `keepalives_idle` setting controls how many seconds of inactivity pass before a TCP keepalive probe is sent; `0` means use the operating-system default. Keepalives help prevent idle connection drops, which are common with cloud-hosted databases behind load balancers (set a low value like `60` if you still see drops). `connect_timeout` gives you a faster failure if the host is unreachable.
Using Environment Variables
Hardcoding passwords in YAML files is not great. Use environment variables instead:
```yaml
my_project:
  target: prod
  outputs:
    prod:
      type: postgres
      host: "{{ env_var('DBT_PG_HOST') }}"
      port: 5432
      user: "{{ env_var('DBT_PG_USER') }}"
      password: "{{ env_var('DBT_PG_PASSWORD') }}"
      dbname: "{{ env_var('DBT_PG_DBNAME') }}"
      schema: dbt_prod
      threads: 4
```
Then set the variables in your shell or CI/CD environment:
```shell
export DBT_PG_HOST=localhost
export DBT_PG_USER=dbt_user
export DBT_PG_PASSWORD=a_strong_password_here
export DBT_PG_DBNAME=analytics
```
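If a variable is missing, dbt fails at template-render time with an `env_var` error. A small guard in your CI script gives a clearer failure up front (a sketch — `require_env` is a helper name invented for this guide; the variable names match the exports above):

```shell
# require_env VAR...: fail if any named variable is unset or empty.
require_env() {
  local var
  for var in "$@"; do
    if [ -z "${!var:-}" ]; then
      echo "error: $var is not set" >&2
      return 1
    fi
  done
}

# Example: guard a dbt invocation in CI.
export DBT_PG_HOST=localhost DBT_PG_USER=dbt_user
require_env DBT_PG_HOST DBT_PG_USER && echo "connection variables are set"
```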
SSL/TLS Configuration for Remote PostgreSQL
If you're connecting to a managed PostgreSQL service (RDS, Cloud SQL, Azure Database for PostgreSQL, or really any remote host), you'll likely need SSL.
The sslmode parameter controls this. The most common values:
| Mode | Behavior |
|---|---|
| disable | No SSL. Don't use this for remote connections. |
| allow | Try without SSL first, fall back to SSL. |
| prefer | Try SSL first, fall back to non-SSL. (default) |
| require | Require SSL, but don't verify the server certificate. |
| verify-ca | Require SSL and verify the server certificate against a CA. |
| verify-full | Like verify-ca, plus verify the hostname matches. |
For most cloud setups, require is the minimum. If you need stricter security (and you should for production), use verify-ca or verify-full and provide the CA certificate:
```yaml
prod:
  type: postgres
  host: my-db-instance.abc123.us-east-1.rds.amazonaws.com
  port: 5432
  user: dbt_user
  password: "{{ env_var('DBT_PG_PASSWORD') }}"
  dbname: analytics
  schema: dbt_prod
  threads: 4
  sslmode: verify-ca
  sslrootcert: /path/to/rds-combined-ca-bundle.pem
```
For Amazon RDS, download the CA bundle from [the AWS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html). For Google Cloud SQL, the Auth Proxy handles SSL for you, so you typically connect to localhost with sslmode: disable.
Creating Your First Model and Running It
If you don't have a dbt project yet, create one:
dbt init my_project
This scaffolds a project directory with a sample model. Navigate into it:
cd my_project
The sample model is at models/example/my_first_dbt_model.sql. Let's replace it with something that actually uses your database. Create a file at models/staging/stg_orders.sql:
```sql
with source as (
    select * from {{ source('raw', 'orders') }}
)

select
    id as order_id,
    customer_id,
    order_date,
    amount,
    status
from source
where status != 'cancelled'
```
Define the source in models/staging/sources.yml:
```yaml
version: 2

sources:
  - name: raw
    schema: public
    tables:
      - name: orders
```
Now run it:
dbt run
If everything is configured correctly, you'll see output like:
```
Running with dbt=1.9.x
Found 1 model, 1 source
...
Concurrency: 4 threads (target='dev')

1 of 1 START sql view model dbt_dev.stg_orders ............. [RUN]
1 of 1 OK created sql view model dbt_dev.stg_orders ........ [CREATE VIEW in 0.12s]

Finished running 1 view model in 0.34s.
Completed successfully.
```
You can also run dbt debug first to verify your connection before running models:
dbt debug
This checks that dbt can connect to your database and that your project configuration is valid.
Common Errors and Fixes
Here are the issues you're most likely to hit, and how to fix them.
Connection refused
```
could not connect to server: Connection refused
Is the server running on host "localhost" and accepting
TCP/IP connections on port 5432?
```
PostgreSQL isn't running, or it's not accepting connections on the expected host/port. Check that the service is up (pg_isready -h localhost -p 5432) and that postgresql.conf has listen_addresses = '*' if you need remote access.
SSL required
FATAL: no pg_hba.conf entry for host "x.x.x.x", user "dbt_user", database "analytics", SSL off
The server requires SSL but your profiles.yml doesn't have sslmode set (or it's set to disable). Add sslmode: require to your profile.
Permission denied
permission denied for schema dbt_prod
The dbt user doesn't have CREATE privileges on the target schema. Run GRANT CREATE ON SCHEMA dbt_prod TO dbt_user; as a superuser.
Schema doesn't exist
schema "dbt_dev" does not exist
dbt doesn't create schemas by default unless you grant CREATE on the database. Either create the schema manually (CREATE SCHEMA dbt_dev; and grant permissions) or grant CREATE ON DATABASE analytics TO dbt_user; so dbt can create it.
pg_hba.conf issues
FATAL: no pg_hba.conf entry for host "172.17.0.1", user "dbt_user"
PostgreSQL uses pg_hba.conf to control which hosts can connect and with what authentication method. If you're connecting from a different machine or container, you need to add an entry. Edit pg_hba.conf (usually at /etc/postgresql/16/main/pg_hba.conf or /var/lib/postgresql/data/pg_hba.conf):
```
# Allow dbt_user from any host with password auth
host    analytics    dbt_user    0.0.0.0/0    md5
```
Then reload PostgreSQL: SELECT pg_reload_conf(); or systemctl reload postgresql.
Password authentication failed
Double-check the password. Also check that the user exists (`\du` in psql) and that `pg_hba.conf` uses `md5` or `scram-sha-256` (not `peer` or `ident`) for the connection method.
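When debugging either of these auth failures, it helps to see exactly which rules PostgreSQL evaluates, ignoring comments and blank lines. A small filter helps (a sketch — `active_hba_rules` is a helper name invented for this guide):

```shell
# active_hba_rules FILE: print only the non-comment, non-blank lines
# of a pg_hba.conf, i.e. the rules PostgreSQL actually evaluates.
active_hba_rules() {
  grep -vE '^[[:space:]]*(#|$)' "$1"
}

# Example (adjust the path to your installation):
# active_hba_rules /etc/postgresql/16/main/pg_hba.conf
```

Rules are matched top to bottom, so the first matching line decides the auth method for your connection.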
Docker Gotcha: localhost vs Container Networking
This one catches a lot of people. If PostgreSQL runs in a Docker container and you're running dbt from another container (or from a Docker-based CI pipeline), localhost won't work.
Inside a container, localhost refers to that container itself, not the host machine and not other containers. You need to use Docker networking.
**If both dbt and Postgres are in the same docker-compose.yml:**
```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: admin123
      POSTGRES_DB: analytics
    ports:
      - "5432:5432"

  dbt:
    build: .
    depends_on:
      - postgres
    environment:
      DBT_PG_HOST: postgres   # Use the service name, not localhost
      DBT_PG_USER: dbt_user
      DBT_PG_PASSWORD: a_strong_password_here
      DBT_PG_DBNAME: analytics
```
The key is DBT_PG_HOST: postgres. Docker Compose creates a network where services can reach each other by name. So from the dbt container, the Postgres host is postgres, not localhost.
If dbt runs on the host and Postgres is in Docker, then localhost:5432 works fine (assuming you've mapped the port with -p 5432:5432).
If dbt runs in a container but Postgres is on the host, use host.docker.internal as the host (works on Docker Desktop for Mac/Windows). On Linux, use --add-host=host.docker.internal:host-gateway or connect via the host's IP on the Docker bridge network.
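Putting that together, a dbt container reaching a PostgreSQL on a Linux host might be started like this (a sketch — `my-dbt-image` is a hypothetical image name; the environment variable names match the profiles.yml examples above):

```shell
docker run --rm \
  --add-host=host.docker.internal:host-gateway \
  -e DBT_PG_HOST=host.docker.internal \
  -e DBT_PG_USER=dbt_user \
  -e DBT_PG_PASSWORD=a_strong_password_here \
  -e DBT_PG_DBNAME=analytics \
  my-dbt-image dbt run
```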
Production Considerations
Once you're past local development, there are a few things to tighten up:
- **Use a connection pooler.** PgBouncer sits between dbt and PostgreSQL, reducing connection overhead. This matters when you have many models running in parallel.
- **Set `threads` carefully.** More threads means more parallel queries. PostgreSQL handles this well up to a point, but if your instance is small, too many threads can cause memory pressure. Start with 4, increase gradually.
- **Monitor long-running queries.** dbt models that build large tables can hold locks and slow down other workloads. Use `statement_timeout` in your PostgreSQL config or per-session to prevent runaway queries.
- **Use separate schemas for dev/staging/prod.** dbt makes this easy with target-based schema names in `profiles.yml`.
- **Rotate credentials.** Don't let your dbt user password sit unchanged for months. Automate rotation if possible.
- **Back up before full refreshes.** A `dbt run --full-refresh` drops and recreates tables. If something goes wrong mid-run, you'll want a way to recover.
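For the statement_timeout suggestion above, one option is a role-level default, so every session the dbt user opens inherits it (the 30-minute value is illustrative; tune it for your workload):

```sql
-- Cap any single query issued by dbt_user at 30 minutes.
-- Applies to all new sessions opened by that role.
ALTER ROLE dbt_user SET statement_timeout = '30min';
```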
Skip the Setup Hassle
Getting dbt-core connected to PostgreSQL isn't hard, but keeping it running in production is where things add up: credential management, scheduling, SSL certificates, Docker networking, monitoring failures, and handling retries.
[ModelDock](https://modeldock.run) handles all of that for you. Connect your PostgreSQL credentials, point it at your dbt project, and it takes care of scheduling, execution, and deployment. No Airflow, no Docker configs, no profiles.yml to maintain.
Try ModelDock free and get your dbt models running against PostgreSQL in minutes.