Migrating an Airflow BashOperator to Dagster

In this page, we'll explain how to migrate an Airflow BashOperator to Dagster.

note

If you're using the BashOperator to execute dbt commands, see "Migrating an Airflow BashOperator (dbt) to Dagster".

About the Airflow BashOperator

The Airflow BashOperator is a common operator used to execute bash commands as part of a data pipeline.

from airflow.operators.bash import BashOperator

execute_script = BashOperator(
    task_id="execute_script",
    bash_command="python /path/to/script.py",
)

The BashOperator's functionality is very general since it can be used to run any bash command, and Dagster offers richer integrations for many common BashOperator use cases. We'll explain how to perform a 1-1 migration of a BashOperator that executes a bash command to Dagster, and how to use the dagster-airlift library to proxy the execution of the original task to Dagster. We'll also provide a reference for richer integrations in Dagster for common BashOperator use cases.

Dagster equivalent

The direct Dagster equivalent to the BashOperator is to use the PipesSubprocessClient to execute a bash command in a subprocess.

Migrating the operator

Migrating the operator breaks down into a few steps:

  1. Ensure that the resources necessary for your bash command are available to both your Airflow and Dagster deployments.
  2. Write an asset that executes the bash command using the PipesSubprocessClient.
  3. Use dagster-airlift to proxy execution of the original task to Dagster.
  4. (Optional) Implement a richer integration for common BashOperator use cases.

Step 1: Ensure shared bash command access

First, you'll need to ensure that the bash command you're running is available for use in both your Airflow and Dagster deployments. What this entails will vary depending on the command you're running. For example, if you're running a Python script, it's as simple as ensuring the Python script exists in a shared location accessible to both Airflow and Dagster, and that all necessary environment variables are set in both environments.
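As a quick sanity check, you can run a short script like the following in both environments to confirm the command's dependencies are in place. This is a minimal sketch; the script path and environment variable names are hypothetical placeholders for your own.

import os
import shutil

# Hypothetical placeholders; substitute the script and variables your command needs.
SCRIPT_PATH = "/path/to/script.py"
REQUIRED_ENV_VARS = ["DATABASE_URL"]

# Confirm the interpreter, the script, and the environment variables are all available.
assert shutil.which("python"), "python executable not found on PATH"
assert os.path.exists(SCRIPT_PATH), f"{SCRIPT_PATH} does not exist"
for var in REQUIRED_ENV_VARS:
    assert os.getenv(var), f"environment variable {var} is not set"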

Step 2: Writing an @asset-decorated function

You can write an @asset-decorated function that runs your bash command. This is straightforward using the PipesSubprocessClient.

from dagster import AssetExecutionContext, PipesSubprocessClient, asset


@asset
def script_result(context: AssetExecutionContext):
    return (
        PipesSubprocessClient()
        .run(context=context, command="python /path/to/script.py")
        .get_results()
    )
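To make the asset available to Dagster tooling, include it in a Definitions object. A minimal sketch, assuming script_result is defined in (or imported into) the same module:

from dagster import Definitions

defs = Definitions(assets=[script_result])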

Step 3: Using dagster-airlift to proxy execution

Finally, you can use dagster-airlift to proxy the execution of the original task to Dagster. For more information, see "Migrating from Airflow to Dagster".
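At a high level, proxying is enabled from within the Airflow DAG file. The following is a minimal sketch, assuming a proxied_state directory of YAML files next to the DAG file that marks execute_script as proxied; see the guide above for the full setup.

from pathlib import Path

from dagster_airlift.in_airflow import proxying_to_dagster
from dagster_airlift.in_airflow.proxied_state import load_proxied_state_from_yaml

# ... original DAG and task definitions above ...

# Tasks marked as proxied in the proxied_state YAML hand off execution to
# Dagster instead of running their original Airflow logic.
proxying_to_dagster(
    global_vars=globals(),
    proxied_state=load_proxied_state_from_yaml(Path(__file__).parent / "proxied_state"),
)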

Step 4: Implementing richer integrations

Dagster offers richer alternatives for many of the use cases the BashOperator commonly covers. We'll detail some of those here.

Running a Python script

As mentioned above, you can use the PipesSubprocessClient to run a Python script in a subprocess. You can also modify the script itself to send additional metadata and logs back to Dagster. See the Dagster Pipes tutorial for more information.
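For example, here's a minimal sketch of the script side using dagster-pipes (the log message and metadata key are illustrative):

from dagster_pipes import open_dagster_pipes

with open_dagster_pipes() as pipes:
    # Log messages stream back to the Dagster run in real time.
    pipes.log.info("processing data...")
    # Attach metadata to the asset materialization; "row_count" is an
    # illustrative example key.
    pipes.report_asset_materialization(metadata={"row_count": 42})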

Running a dbt command

We have a dedicated guide for migrating from the BashOperator to the dbt integration in Dagster. For more information, see "Migrating an Airflow BashOperator (dbt) to Dagster".

Running S3 Sync or other AWS CLI commands

Dagster has a rich set of integrations for AWS services. For example, you can use the s3.S3Resource to interact with S3 directly.
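Here's a minimal sketch that uploads a file using the resource's underlying boto3 client (the bucket, paths, and region are hypothetical placeholders):

from dagster import Definitions, asset
from dagster_aws.s3 import S3Resource


@asset
def synced_file(s3: S3Resource):
    # Upload a local file via the boto3 client provided by the resource.
    s3.get_client().upload_file("/path/to/local/file.csv", "my-bucket", "data/file.csv")


defs = Definitions(
    assets=[synced_file],
    resources={"s3": S3Resource(region_name="us-east-1")},
)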