Migrating an Airflow BashOperator to Dagster
This page explains how to migrate an Airflow BashOperator to Dagster.
If you're using the BashOperator to execute dbt commands, see "Migrating an Airflow BashOperator (dbt) to Dagster".
About the Airflow BashOperator
The Airflow BashOperator is a common operator used to execute bash commands as part of a data pipeline.
from airflow.operators.bash import BashOperator

execute_script = BashOperator(
    task_id="execute_script",
    bash_command="python /path/to/script.py",
)
The BashOperator's functionality is very general, since it can run any bash command, and Dagster offers richer integrations for many common BashOperator use cases. We'll explain how to do a 1-1 migration of the BashOperator so that the same bash command executes in Dagster, and how to use the dagster-airlift library to proxy execution of the original task to Dagster. We'll also provide a reference for richer integrations in Dagster for common BashOperator use cases.
Dagster equivalent
The direct Dagster equivalent to the BashOperator is to use the PipesSubprocessClient to execute a bash command in a subprocess.
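To make "execute a command in a subprocess" concrete, here is a minimal standard-library sketch. It is not how the PipesSubprocessClient is implemented; the helper name `run_command` is hypothetical, and `sys.executable` stands in for whatever command your task runs.

```python
import subprocess
import sys


def run_command(argv):
    """Run a command in a child process and return (exit code, captured stdout)."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout


# Stand-in for "python /path/to/script.py": run a one-line script in a child interpreter.
code, out = run_command([sys.executable, "-c", "print('hello from the child process')"])
print(code, out.strip())
```

The orchestrator only starts the process and observes its exit code and output; the script itself stays unchanged, which is what makes a 1-1 migration possible.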
Migrating the operator
Migrating the operator breaks down into a few steps:
- Ensure that the resources necessary for your bash command are available to both your Airflow and Dagster deployments.
- Write an asset that executes the bash command using the PipesSubprocessClient.
- Use dagster-airlift to proxy execution of the original task to Dagster.
- (Optional) Implement a richer integration for common BashOperator use cases.
Step 1: Ensure shared bash command access
First, you'll need to ensure that the bash command you're running is available for use in both your Airflow and Dagster deployments. What this entails will vary depending on the command you're running. For example, if you're running a Python script, it's as simple as ensuring the Python script exists in a shared location accessible to both Airflow and Dagster, and all necessary env vars are set in both environments.
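One way to verify this is a small preflight check that you run in each deployment. The sketch below is only an illustration: the function name `missing_prerequisites` and the variable name `MY_API_TOKEN` are hypothetical, and the checks your command actually needs may differ.

```python
import os
from pathlib import Path


def missing_prerequisites(script_path, required_env_vars, environ=None):
    """Return a list of problems; an empty list means every check passed."""
    environ = os.environ if environ is None else environ
    problems = []
    # The script must exist at the same path in both deployments.
    if not Path(script_path).is_file():
        problems.append(f"script not found: {script_path}")
    # Every required variable must be set to a non-empty value.
    for name in required_env_vars:
        if not environ.get(name):
            problems.append(f"environment variable not set: {name}")
    return problems


# MY_API_TOKEN is a placeholder for whatever your script reads from the environment.
print(missing_prerequisites("/path/to/script.py", ["MY_API_TOKEN"]))
```

Running the same check on an Airflow worker and in the Dagster deployment should produce an empty list in both places before you move on to Step 2.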
Step 2: Writing an @asset-decorated function
You can write a Dagster asset-decorated function that runs your bash command. This is quite straightforward using the PipesSubprocessClient.
from dagster import AssetExecutionContext, PipesSubprocessClient, asset


@asset
def script_result(context: AssetExecutionContext):
    return (
        PipesSubprocessClient()
        .run(context=context, command="python /path/to/script.py")
        .get_results()
    )
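The command above is passed as a single string. Python's subprocess module treats a bare string as the name of one program on POSIX unless a shell is involved, so you may prefer to pass an argument list instead. The standard library's `shlex.split` performs that conversion. This is a minimal sketch: the `--date` flag is a hypothetical argument added for illustration, and whether your client needs a list rather than a string is an assumption to check against the PipesSubprocessClient documentation.

```python
import shlex

# The original BashOperator command, with a hypothetical quoted argument added.
bash_command = "python /path/to/script.py --date '2024-01-01 00:00'"

# Split the way a POSIX shell would: quoted text stays together as one argument.
argv = shlex.split(bash_command)
print(argv)  # ['python', '/path/to/script.py', '--date', '2024-01-01 00:00']
```

`shlex.split` only tokenizes. Pipes, redirects, and variable expansion are shell features, so a command that relies on them still has to run through a shell.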