Using dbt Cloud? Check out the dbt Cloud with Dagster guide!
In this tutorial, we'll walk you through integrating dbt with Dagster using dbt's example jaffle shop project, the dagster-dbt library, and a DuckDB database.
By the end of this tutorial, you will:
Dagster's software-defined assets (SDAs) bear several similarities to dbt models. A software-defined asset contains an asset key, a set of upstream asset keys, and an operation that is responsible for computing the asset from its upstream dependencies. Models defined in a dbt project are similar to Dagster SDAs in that:
The asset key for a dbt model is, by default, the name of the model.
The upstream dependencies of a dbt model are defined with ref or source calls within the model's definition.
The computation required to produce the asset from its upstream dependencies is the SQL within the model's definition.
These similarities make it natural to interact with dbt models as SDAs. Let's take a look at a dbt model and an SDA, in code:
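As a minimal sketch of that comparison, assume a dbt model file named orders.sql that selects from raw_orders, alongside a Python function of the same name. The `AssetSketch` dataclass and the `sketch_from_dbt_model` and `sketch_from_function` helpers are hypothetical stand-ins written for illustration; they are not part of Dagster or dagster-dbt, and the SQL is an assumed, simplified model. With Dagster installed, the `orders` function would instead be decorated with `@asset` from the `dagster` package.

```python
import inspect
import re
from dataclasses import dataclass
from typing import FrozenSet

# Assumed contents of a dbt model file named orders.sql (simplified for illustration).
ORDERS_SQL = "select * from {{ ref('raw_orders') }}"

# Matches {{ ref('name') }} calls and captures the referenced model name.
# source() calls are left out to keep the sketch short.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


@dataclass(frozen=True)
class AssetSketch:
    """Hypothetical stand-in for the three parts of a software-defined asset."""
    key: str
    upstream_keys: FrozenSet[str]
    compute: object  # SQL text for a dbt model, a callable for a Python asset


def sketch_from_dbt_model(file_name: str, sql: str) -> AssetSketch:
    # dbt names a model after its file, so the key is the file name without ".sql".
    key = file_name[:-len(".sql")] if file_name.endswith(".sql") else file_name
    return AssetSketch(key, frozenset(REF_PATTERN.findall(sql)), sql)


def sketch_from_function(fn) -> AssetSketch:
    # The function name becomes the key; argument names become the upstream keys.
    params = inspect.signature(fn).parameters
    return AssetSketch(fn.__name__, frozenset(params), fn)


def orders(raw_orders):
    # Placeholder computation: pass the upstream rows through unchanged.
    return list(raw_orders)


dbt_side = sketch_from_dbt_model("orders.sql", ORDERS_SQL)
python_side = sketch_from_function(orders)
print(dbt_side.key, sorted(dbt_side.upstream_keys))
print(python_side.key, sorted(python_side.upstream_keys))
```

Both sides reduce to the same key and the same upstream key, which is the correspondence the rest of this tutorial relies on.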
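Here's what's happening in this example:
The dbt model is named orders, and its data comes from a dependency named raw_orders.
The Dagster asset uses the same name, orders, as its asset key. raw_orders is provided as an argument to the asset, defining it as a dependency.
To complete this tutorial, you'll need: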
To install dbt, Dagster, and Dagit. Run the following to install everything using pip:
pip install dbt-core dagster dagit
Refer to the dbt and Dagster installation docs for more info.
To download the tutorial_dbt_dagster Dagster example. We'll walk you through this in the first step. This example includes:
dbt's example jaffle shop project. You can follow along using a different dbt project, but you won't be able to use the code examples in this tutorial as-is.
A blank template version of the tutorial project, which you can use to follow along with the tutorial.
A finished version of the tutorial project, which you can use to check out the final version of the work you'll do in the tutorial.
Dependencies for the following libraries:
dagster-dbt. This library allows you to integrate dbt with Dagster.
dbt-duckdb. This tutorial uses DuckDB as the database backing dbt.
dagster-duckdb. This library provides a resource that enables the materialization of Dagster assets as DuckDB tables.
dagster-duckdb-pandas. This library allows you to store pandas DataFrames in DuckDB.
pandas. This tutorial uses pandas to fetch raw data.
plotly. This tutorial uses plotly to create a histogram asset.
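One optional way to confirm that these prerequisites are in place is to ask Python which of the listed distributions are installed. This is a minimal sketch using the standard library's `importlib.metadata`; the `installed_versions` and `missing_packages` helpers are hypothetical names introduced here, and the distribution names are taken from the list above.

```python
from importlib import metadata

# Distribution names taken from the prerequisites listed above.
REQUIRED = [
    "dbt-core", "dagster", "dagit", "dagster-dbt", "dbt-duckdb",
    "dagster-duckdb", "dagster-duckdb-pandas", "pandas", "plotly",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    return [name for name, version in installed_versions(names).items() if version is None]


# Report the state of each prerequisite in the current environment.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it in the same virtual environment you plan to use for the tutorial, since the result reflects only the interpreter that executes it.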
When you've fulfilled all the prerequisites for the tutorial, you can get started by setting up the dbt project.