dbt-core Scheduling Options Compared: Cron, Airflow, GitHub Actions, and Managed Services
A practical comparison of every way to schedule dbt-core runs in production — cron, Airflow, Prefect, GitHub Actions, Dagster, and managed platforms.
dbt-core doesn't come with a scheduler. That's by design — it's a command-line tool that transforms data, not an orchestration platform. But sooner or later, every team that adopts dbt-core needs to answer the same question: how do we schedule dbt runs automatically?
The answer depends on your team size, your existing infrastructure, how much you care about observability, and honestly, how much time you want to spend on plumbing instead of data modeling.
This article compares every major option for scheduling dbt-core in production. Real code, honest trade-offs, no fluff.
Why Scheduling Matters
Running dbt build manually is fine during development. In production, you need your models to refresh on a predictable cadence — after source data lands, before dashboards get checked, and without anyone remembering to press a button.
A good dbt scheduler should handle:
- Triggering runs on a time-based or event-based schedule
- Managing credentials securely (not hardcoded in scripts)
- Retrying on failure without human intervention
- Logging and alerting so you know when things break
- Environment isolation so production runs don't conflict with development
Some tools do all of this. Some do almost none of it. Let's walk through them.
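As a concrete example of the logging-and-alerting point: whichever tool ends up invoking dbt, richer alerts usually come from dbt's own `run_results.json` artifact rather than from the exit code alone. A minimal stdlib sketch; the artifact path is dbt's default output location, and the actual notification wiring is left out:

```python
"""Sketch: pull failed nodes out of dbt's run_results.json artifact.

The results/status fields follow dbt's documented artifact schema;
models report "error" on failure, tests report "fail".
"""
import json
from pathlib import Path


def failed_nodes(run_results: dict) -> list[str]:
    """Return the unique_ids of nodes whose status is 'error' or 'fail'."""
    return [
        r["unique_id"]
        for r in run_results.get("results", [])
        if r.get("status") in ("error", "fail")
    ]


if __name__ == "__main__":
    artifact = Path("target/run_results.json")  # written by `dbt build`
    if artifact.exists():
        failures = failed_nodes(json.loads(artifact.read_text()))
        if failures:
            print("dbt failures:", ", ".join(failures))
```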
Option 1: Cron
The oldest trick in the book. Install dbt on a Linux server, write a shell script, and add a crontab entry.
Setup
```bash
#!/bin/bash
# /opt/dbt/scheduled_run.sh
# Note: -e is deliberately omitted; with `set -e` plus pipefail, a failed
# dbt run would abort the script before the Slack notification below fires.
set -uo pipefail

cd /opt/dbt/my-project
source /opt/dbt/venv/bin/activate
export DBT_PROFILES_DIR=/opt/dbt/profiles

LOG_FILE="/var/log/dbt/run-$(date +%Y%m%d-%H%M%S).log"

dbt build --target prod 2>&1 | tee "$LOG_FILE"
EXIT_CODE=${PIPESTATUS[0]}

if [ "$EXIT_CODE" -ne 0 ]; then
  curl -X POST "https://hooks.slack.com/services/YOUR/WEBHOOK/URL" \
    -H 'Content-type: application/json' \
    -d "{\"text\": \"dbt run failed. Check $LOG_FILE\"}"
fi

exit "$EXIT_CODE"
```
```bash
# Run dbt every day at 5:30 AM UTC
30 5 * * * /opt/dbt/scheduled_run.sh
```
Pros
- Zero dependencies beyond a Linux server
- Easy to understand — anyone who's touched Unix can read a crontab
- No additional infrastructure to maintain
- Fast to set up (15 minutes if you already have a server)
Cons
- No built-in retry logic — if a run fails at 5:30 AM, you won't know until you check
- Credential management is on you (environment variables, files on disk)
- No UI for run history or logs
- Scaling to multiple dbt projects means managing multiple cron entries
- No dependency awareness — cron doesn't know if your source data has landed yet
- Server goes down, runs stop silently
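The missing retry logic is the easiest of these gaps to paper over: have cron invoke a small wrapper that retries dbt on a nonzero exit code. A minimal sketch, with hypothetical paths and retry settings:

```python
#!/usr/bin/env python3
"""Minimal retry wrapper for cron-scheduled dbt runs (illustrative sketch)."""
import subprocess
import time


def run_with_retries(cmd, retries=2, delay_seconds=300, runner=subprocess.call):
    """Run cmd, retrying up to `retries` extra times on a nonzero exit code.

    `runner` is injectable for testing; it defaults to subprocess.call.
    Returns the final exit code.
    """
    code = 1
    for attempt in range(retries + 1):
        code = runner(cmd)
        if code == 0:
            return 0
        if attempt < retries:
            time.sleep(delay_seconds)
    return code


def main() -> int:
    # Paths are hypothetical -- adjust for your environment.
    return run_with_retries(
        ["/opt/dbt/venv/bin/dbt", "build", "--target", "prod"],
        retries=2,
        delay_seconds=300,
    )

# Invoked from cron instead of calling dbt directly, e.g.:
#   30 5 * * * /opt/dbt/venv/bin/python /opt/dbt/scheduled_run.py
```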
When it works
Solo data engineers or very small teams with a single dbt project and low complexity. If you're just getting started and want something running today, cron is fine as a stopgap.
Option 2: GitHub Actions
GitHub Actions has become a surprisingly popular way to schedule dbt runs, especially for teams that already live in GitHub.
Setup
```yaml
# .github/workflows/dbt-scheduled.yml
name: Scheduled dbt Build

on:
  schedule:
    - cron: '30 5 * * *'   # 5:30 AM UTC daily
  workflow_dispatch:        # Allow manual triggers

jobs:
  dbt-build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dbt
        run: |
          pip install dbt-core dbt-snowflake

      - name: Create profiles.yml
        run: |
          mkdir -p ~/.dbt
          cat > ~/.dbt/profiles.yml << 'EOF'
          my_project:
            target: prod
            outputs:
              prod:
                type: snowflake
                account: ${{ secrets.SNOWFLAKE_ACCOUNT }}
                user: ${{ secrets.SNOWFLAKE_USER }}
                password: ${{ secrets.SNOWFLAKE_PASSWORD }}
                role: TRANSFORMER
                database: ANALYTICS
                warehouse: TRANSFORMING
                schema: prod
                threads: 4
          EOF

      - name: Run dbt build
        run: dbt build --target prod

      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: dbt-artifacts-${{ github.run_id }}
          path: target/
          retention-days: 30
```
You can also add Slack notifications on failure:
```yaml
      - name: Notify on failure
        if: failure()
        uses: slackapi/slack-github-action@v1.25.0
        with:
          payload: |
            {
              "text": "dbt build failed in GitHub Actions. Run: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
```
Pros
- No infrastructure to manage — GitHub runs the compute
- Secrets management is built in (repository secrets)
- Run history and logs are in the Actions tab
- Easy to trigger manually via workflow_dispatch
- Free for public repos, generous free tier for private repos
- Your dbt project and its scheduler live in the same repository
Cons
- GitHub Actions cron is not precise — runs can be delayed by 5-15 minutes during peak times
- No dependency awareness (can't wait for upstream data)
- Limited execution time (6 hours max per job on GitHub-hosted runners)
- Debugging failures means reading through Actions logs, which isn't the best experience
- No built-in retry for specific dbt models — it's all or nothing
- Costs can creep up if you run frequently or have long-running projects
- Not designed for orchestration — it's a CI/CD tool being used as a scheduler
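One partial mitigation for the all-or-nothing point: dbt-core 1.6+ ships a `dbt retry` command that re-runs only what failed, using the artifacts in `target/` from the previous invocation. A hedged sketch of a follow-up step for the workflow above (same job, so the workspace and profiles are still in place; the step name is illustrative):

```yaml
      # Re-run only the nodes that failed, using artifacts from the first attempt.
      # Assumes dbt-core >= 1.6, where `dbt retry` was introduced.
      - name: Retry failed models
        if: failure()
        run: dbt retry --target prod
```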
When it works
Small to mid-size teams with straightforward scheduling needs. If your dbt project runs in under 30 minutes and you don't need event-based triggers or complex dependency chains, GitHub Actions is a solid, low-maintenance choice.
Option 3: Apache Airflow
Airflow is the heavyweight option. It's the most widely used orchestrator in data engineering, and it has first-class support for dbt through Astronomer's Cosmos package.
Setup with Cosmos
```python
# dags/dbt_scheduled.py
from datetime import datetime, timedelta

from cosmos import DbtDag, ProjectConfig, ProfileConfig, ExecutionConfig
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

dbt_dag = DbtDag(
    project_config=ProjectConfig(
        dbt_project_path="/opt/airflow/dbt/my_project",
    ),
    profile_config=ProfileConfig(
        profile_name="my_project",
        target_name="prod",
        profile_mapping=SnowflakeUserPasswordProfileMapping(
            conn_id="snowflake_default",
            profile_args={
                "database": "ANALYTICS",
                "schema": "prod",
            },
        ),
    ),
    execution_config=ExecutionConfig(
        dbt_executable_path="/opt/airflow/dbt-venv/bin/dbt",
    ),
    schedule_interval="30 5 * * *",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    dag_id="dbt_scheduled_build",
    default_args={
        "retries": 2,
        "retry_delay": timedelta(seconds=300),
    },
)
```
Cosmos is nice because it breaks your dbt DAG into individual Airflow tasks — one per model. That means you get per-model retries, parallelism, and visibility in the Airflow UI.
Setup with BashOperator
If you want something simpler (or can't install Cosmos), the BashOperator works:
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="dbt_build",
    schedule="30 5 * * *",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args=default_args,
) as dag:
    dbt_build = BashOperator(
        task_id="dbt_build",
        bash_command=(
            "cd /opt/airflow/dbt/my_project && "
            "source /opt/airflow/dbt-venv/bin/activate && "
            "dbt build --target prod"
        ),
        env={
            "DBT_PROFILES_DIR": "/opt/airflow/dbt/profiles",
        },
    )
```
Pros
- Industry standard with a massive community
- Powerful dependency management — chain dbt after data ingestion tasks
- Built-in retries, alerting, SLAs, and run history
- Cosmos gives per-model visibility and retries
- Scales to complex, multi-step data pipelines
- Extensive UI for monitoring and debugging
Cons
- Significant setup and maintenance overhead (database, scheduler, workers, webserver)
- Learning curve is steep, especially for teams without platform engineers
- Managing Airflow itself becomes a job — upgrades, scaling, debugging the scheduler
- Getting dbt and Airflow to play nicely together (virtual environments, dependencies) takes work
- Overkill if dbt is the only thing you're orchestrating
When it works
Teams with existing Airflow infrastructure, or teams that need dbt to run as part of a larger pipeline (e.g., after Fivetran ingestion, before reverse ETL). If you already have Airflow, adding dbt to it makes sense. If you don't, think carefully before introducing it just for dbt.
Option 4: Prefect
Prefect takes a Python-native approach to orchestration. If your team thinks in Python rather than YAML, Prefect feels more natural than Airflow.
Setup
```python
# flows/dbt_flow.py
import subprocess

from prefect import flow, task


@task(retries=2, retry_delay_seconds=300)
def run_dbt_build():
    result = subprocess.run(
        ["dbt", "build", "--target", "prod"],
        cwd="/opt/dbt/my_project",
        capture_output=True,
        text=True,
        env={
            "DBT_PROFILES_DIR": "/opt/dbt/profiles",
            "PATH": "/opt/dbt/venv/bin:/usr/bin:/bin",
        },
    )
    if result.returncode != 0:
        raise Exception(f"dbt build failed:\n{result.stderr}")
    return result.stdout


@flow(name="dbt-scheduled-build", log_prints=True)
def dbt_scheduled_build():
    output = run_dbt_build()
    print(output)


if __name__ == "__main__":
    dbt_scheduled_build.serve(
        name="dbt-daily-build",
        cron="30 5 * * *",
    )
```
Prefect also has a prefect-dbt integration for tighter coupling:
```python
from prefect import flow
from prefect_dbt.cli.commands import DbtCoreOperation


@flow
def dbt_build_flow():
    result = DbtCoreOperation(
        commands=["dbt build --target prod"],
        project_dir="/opt/dbt/my_project",
        profiles_dir="/opt/dbt/profiles",
    )
    result.run()


if __name__ == "__main__":
    dbt_build_flow.serve(
        name="dbt-daily-build",
        cron="30 5 * * *",
    )
```
Pros
- Pure Python — no YAML configuration files or custom DSLs
- Much easier to set up than Airflow (especially with Prefect Cloud)
- Built-in retries, caching, and concurrency controls
- Prefect Cloud offers a hosted UI, alerting, and scheduling
- Lightweight — a single Python process can serve as your scheduler
Cons
- Smaller community than Airflow (though growing)
- Self-hosted Prefect server still requires maintenance
- Prefect Cloud pricing can add up for high-frequency runs
- The dbt integration is less mature than Cosmos for Airflow
- Less battle-tested at scale compared to Airflow
When it works
Python-heavy teams that want orchestration without the Airflow overhead. Especially appealing if you're already using Prefect for other data workflows or if your team doesn't have dedicated platform engineers.
Option 5: Dagster
Dagster is interesting because it's the most "dbt-aware" orchestrator. It understands dbt assets natively and can represent dbt models as first-class objects in its own asset graph.
Setup
```python
# definitions.py
from dagster import Definitions
from dagster_dbt import DbtCliResource, dbt_assets, DbtProject

my_project = DbtProject(
    project_dir="/opt/dbt/my_project",
)


@dbt_assets(manifest=my_project.manifest_path)
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()


defs = Definitions(
    assets=[my_dbt_assets],
    resources={
        "dbt": DbtCliResource(
            project_dir=my_project,
            profiles_dir="/opt/dbt/profiles",
        ),
    },
)
```
Scheduling in Dagster uses ScheduleDefinition or AutoMaterializePolicy:
```python
from dagster import ScheduleDefinition, define_asset_job

dbt_build_job = define_asset_job(
    name="dbt_build_job",
    selection=[my_dbt_assets],
)

dbt_schedule = ScheduleDefinition(
    job=dbt_build_job,
    cron_schedule="30 5 * * *",
)

defs = Definitions(
    assets=[my_dbt_assets],
    schedules=[dbt_schedule],
    resources={
        "dbt": DbtCliResource(
            project_dir=my_project,
            profiles_dir="/opt/dbt/profiles",
        ),
    },
)
```
Pros
- First-class dbt integration — models appear as assets in Dagster's UI
- Asset-based approach aligns naturally with how dbt thinks about data
- Auto-materialization can trigger dbt models based on upstream changes
- Solid local development experience with dagster dev
- Good testing framework for pipelines
Cons
- Steeper learning curve than Prefect (the asset/op/job model takes time to internalize)
- Smaller adoption than Airflow — fewer Stack Overflow answers, fewer tutorials
- Self-hosting requires a daemon, webserver, and database (similar to Airflow)
- Tighter coupling to Dagster's abstractions can feel constraining
- Migrating away from Dagster is harder than migrating away from a simple BashOperator
When it works
Teams that want a modern, asset-oriented approach and are willing to invest in learning Dagster's model. Particularly strong if you're building a new data stack from scratch and want dbt to be deeply integrated with your orchestration layer.
Option 6: Managed Services
Instead of running your own scheduler, you can use a platform that handles it for you.
dbt Cloud
The official managed service from dbt Labs. It's the most polished option if budget isn't a constraint.
```
# dbt Cloud handles scheduling via its UI
# No code to write — configure jobs through the web interface
# Supports cron schedules, CI on PR, and API-triggered runs
```
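The API-triggered runs mentioned above go through dbt Cloud's v2 REST API. A stdlib sketch of triggering a job; the account ID, job ID, and token are placeholders, and you should check the current dbt Cloud API reference before relying on the exact endpoint:

```python
"""Sketch: trigger a dbt Cloud job over its v2 REST API (illustrative)."""
import json
import urllib.request

# Multi-tenant US host; single-tenant and EU accounts use a different host.
DBT_CLOUD_HOST = "https://cloud.getdbt.com"


def build_trigger_request(account_id: int, job_id: int, token: str,
                          cause: str = "Triggered via API"):
    """Return (url, headers, body_bytes) for a job-trigger POST."""
    url = f"{DBT_CLOUD_HOST}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"cause": cause}).encode()
    return url, headers, body


def trigger_job(account_id: int, job_id: int, token: str) -> dict:
    """POST the trigger request and return the decoded JSON response."""
    url, headers, body = build_trigger_request(account_id, job_id, token)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```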
Pros: Seamless integration, IDE included, built-in docs hosting, managed infrastructure.
Cons: Can get expensive at scale ($100+/seat/month on Team plan), less flexibility for custom orchestration, vendor lock-in on proprietary features.
ModelDock
ModelDock takes a different approach — it runs Airflow under the hood but abstracts away the infrastructure. You connect your Git repo and warehouse credentials, and it generates and manages the Airflow DAGs for you.
Pros: Full Airflow power without managing Airflow, per-model visibility, bring your own dbt project.
Cons: Newer platform, smaller community than dbt Cloud.
Comparison Table
| Feature | Cron | GitHub Actions | Airflow | Prefect | Dagster | dbt Cloud | ModelDock |
|---|---|---|---|---|---|---|---|
| Setup time | Minutes | ~1 hour | Hours to days | ~1 hour | ~2 hours | Minutes | Minutes |
| Scheduling | Cron only | Cron only | Cron + event | Cron + event | Cron + asset | Cron + CI | Cron + event |
| Retries | Manual | Manual | Built-in | Built-in | Built-in | Built-in | Built-in |
| Per-model visibility | No | No | With Cosmos | No | Yes | Yes | Yes |
| Credential management | DIY | Repo secrets | Connections | Blocks/Secrets | Resources | Built-in | Built-in |
| Run history/UI | Log files | Actions tab | Airflow UI | Prefect UI | Dagster UI | dbt Cloud UI | Dashboard |
| Alerting | DIY | DIY | Built-in | Built-in | Built-in | Built-in | Built-in |
| Infra maintenance | Server | None | Significant | Moderate | Moderate | None | None |
| Cost | Server cost | Free tier + usage | Self-hosted | Free tier + Cloud | Free tier + Cloud | $100+/seat/month | Free tier available |
| Best for | Solo/prototype | Small teams | Complex pipelines | Python teams | Asset-oriented | Enterprise | Teams wanting Airflow without the ops |
Decision Framework
Choosing a dbt scheduler comes down to three factors: team size, pipeline complexity, and how much infrastructure you want to manage.
Choose cron if you're a solo data engineer, your project is simple, and you just need something running today. Plan to migrate off it as soon as your needs grow.
Choose GitHub Actions if your team is small (1-5 people), your dbt project runs in under 30 minutes, you don't need event-based triggers, and you want zero infrastructure to maintain. It's the best "good enough" option for many teams.
Choose Airflow if you already run Airflow for other workloads, or you need dbt to execute as part of a larger data pipeline with complex dependencies. Don't adopt Airflow just for dbt unless you're prepared for the operational commitment.
Choose Prefect if your team is Python-native, you want something lighter than Airflow, and you're comfortable with a smaller ecosystem. Prefect Cloud removes most of the operational burden.
Choose Dagster if you're building a new data platform from scratch and want the tightest possible integration between your orchestrator and dbt. The asset-based model is powerful, but it's a commitment.
Choose a managed service if you'd rather spend your time on data modeling than infrastructure. dbt Cloud is the safe enterprise choice. ModelDock gives you Airflow's power without Airflow's operational burden — it generates and manages the DAGs so you can focus on your dbt project, not your scheduler.
The Real Question
Most teams spend way too long agonizing over scheduler choice. Here's the honest truth: for a straightforward dbt project that runs once or twice a day, nearly any of these options will work. The differences only start to matter when you need event-based triggers, per-model retries, complex multi-step pipelines, or integration with other tools in your stack.
Pick the simplest option that meets your requirements today. You can always migrate later — your dbt project is just SQL and YAML, and it'll run the same way regardless of what triggers it.
If you don't want to manage any orchestration infrastructure but still want the power of Airflow under the hood, give ModelDock a try. Connect your repo, set your schedule, and let it handle the rest.