Asset selection syntax#

Using a simple query syntax, you can specify an asset selection as a string. In this guide, we'll cover where the syntax is supported, how it works, and some examples.


Supported locations#

The asset selection syntax can be used in:

  • The selection parameter of define_asset_job. This parameter also accepts an AssetSelection object, which supports more complex selections built from compositions of Python objects. For example, using a string:

    taxi_zones_job = define_asset_job(name="taxi_zones_job", selection="taxi_zones_file")
    
  • The asset command-line interface with the list and materialize commands. For example:

    dagster asset list --select taxi_zones_file
    
  • The Dagster UI, in the search box on the Global asset lineage page. The Examples section demonstrates how asset selection queries work in the UI.


Usage#

A query includes a list of clauses. Clauses are separated by commas, except in the case of the selection parameter of define_asset_job, materialize, and materialize_to_memory, where each clause is a separate element in a list.
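As a rough illustration of the two forms, the sketch below turns a comma-separated query (the form used on the command line and in the UI) into the list form that the selection parameter expects. The helper name split_clauses is hypothetical and is not part of Dagster; trimming whitespace around each clause is a convenience of this sketch, not documented behavior.

```python
from typing import List


def split_clauses(query: str) -> List[str]:
    """Split a comma-separated selection query into trimmed, non-empty clauses.

    Hypothetical helper for illustration only; not a Dagster API.
    """
    return [clause.strip() for clause in query.split(",") if clause.strip()]


# The same two clauses, first as one comma-separated string,
# then as the list you would pass to a `selection` parameter.
cli_query = "taxi_zones_file,*manhattan_map"
selection_list = split_clauses(cli_query)
print(selection_list)  # ['taxi_zones_file', '*manhattan_map']
```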

Clause syntax | Description
ASSET_KEY | Selects a single asset by asset key.
COMPONENT/COMPONENT | Selects an asset key with multiple components, such as a prefix, where slashes (/) are inserted between components. For example, to select an asset with an AssetKey in Python of AssetKey(["manhattan", "manhattan_stats"]), the query would be manhattan/manhattan_stats.
*ASSET_KEY | An asterisk (*) preceding an asset key selects an asset and all of its upstream dependencies.
ASSET_KEY* | An asterisk (*) following an asset key selects an asset and all of its downstream dependencies.
+ASSET_KEY | A plus sign (+) preceding an asset key selects an asset and one layer upstream of the asset. Including multiple +s will select that number of upstream layers from the asset. For example, ++ASSET_KEY will select the asset and two upstream layers of dependencies. Any number of +s is supported.
ASSET_KEY+ | A plus sign (+) following an asset key selects an asset and one layer downstream of the asset. Including multiple +s will select that number of downstream layers from the asset. For example, ASSET_KEY++ will select the asset and two downstream layers of dependencies. Any number of +s is supported.
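To make the clause grammar concrete, here is a minimal sketch of how a single clause could be broken into an upstream depth, the asset key components, and a downstream depth. This is not Dagster's parser: the Clause type, the parse_clause function, and the regular expression are assumptions made for illustration, and the sketch represents an asterisk as None, meaning "no limit".

```python
import re
from typing import List, NamedTuple, Optional


class Clause(NamedTuple):
    """Hypothetical parsed form of one clause (not a Dagster type)."""

    upstream_depth: Optional[int]  # None means unlimited (*), 0 means none
    key_components: List[str]      # "a/b" becomes ["a", "b"]
    downstream_depth: Optional[int]


# Optional leading "*" or run of "+", then the key, then optional trailing "*" or "+".
_CLAUSE_PATTERN = re.compile(r"^(\*|\+*)([^*+\s]+)(\*|\+*)$")


def _depth(marker: str) -> Optional[int]:
    """Map "*" to None (unlimited) and a run of "+" to its length."""
    return None if marker == "*" else len(marker)


def parse_clause(clause: str) -> Clause:
    match = _CLAUSE_PATTERN.match(clause.strip())
    if match is None:
        raise ValueError(f"Unrecognized clause: {clause!r}")
    upstream, key, downstream = match.groups()
    return Clause(_depth(upstream), key.split("/"), _depth(downstream))


print(parse_clause("++manhattan_map"))
print(parse_clause("manhattan/manhattan_stats"))
```

Under these assumptions, "++manhattan_map" yields an upstream depth of 2 and a downstream depth of 0, while "*taxi_zones*" yields None (unlimited) in both directions.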

Examples#

To demonstrate how to use the asset selection syntax, we'll use the following asset graph from the Dagster University Essentials project:

Global asset lineage for the Dagster University Essentials project

Selecting a single asset#

To select a single asset, use the asset's asset key. In this example, we want to select the taxi_zones_file asset:

raw_data_job = define_asset_job(name="raw_data_job", selection="taxi_zones_file")

Selecting assets with multiple key components#

To select an asset with a key containing multiple components, such as a prefix, insert slashes (/) between the components.

In this example, we want to select the manhattan/manhattan_stats asset. The asset is defined as follows - note the key_prefix:

from dagster import AssetKey, asset
from dagster_duckdb import DuckDBResource

@asset(
    deps=[AssetKey(["taxi_trips"]), AssetKey(["taxi_zones"])], key_prefix="manhattan"
)
def manhattan_stats(database: DuckDBResource):
    ...

manhattan_job = define_asset_job(name="manhattan_job", selection="manhattan/manhattan_stats")

Selecting multiple assets#

To select multiple assets, use a list of the assets' asset keys. The assets don't have to be dependent on each other.

In this example, we want to select the taxi_zones_file and taxi_trips_file assets:

raw_data_job = define_asset_job(
    name="raw_data_job", selection=["taxi_zones_file", "taxi_trips_file"]
)

Selecting an asset's entire lineage#

To select an asset's entire lineage, add an asterisk (*) before and after the asset key in the query.

In this example, we want to select the entire lineage for the taxi_zones asset:

taxi_zones_job = define_asset_job(name="taxi_zones_job", selection="*taxi_zones*")
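One way to picture this selection is as the union of the asset, everything upstream of it, and everything downstream of it. The sketch below works on a small hand-written edge list rather than a real Dagster graph: apart from the manhattan_stats dependencies shown in this guide, the edges are assumptions about the example project, and reachable and lineage are hypothetical helper names.

```python
# Hypothetical (upstream, downstream) edges; only the manhattan_stats
# dependencies are stated in this guide, the rest are assumed.
EDGES = [
    ("taxi_zones_file", "taxi_zones"),
    ("taxi_trips_file", "taxi_trips"),
    ("taxi_zones", "manhattan_stats"),
    ("taxi_trips", "manhattan_stats"),
    ("manhattan_stats", "manhattan_map"),
]


def reachable(start, edges):
    """Return every node reachable from `start` by following (src, dst) pairs."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def lineage(asset):
    """Asset plus all of its ancestors and descendants."""
    downstream = reachable(asset, EDGES)
    # Flipping each edge turns a downstream walk into an upstream walk.
    upstream = reachable(asset, [(dst, src) for src, dst in EDGES])
    return {asset} | upstream | downstream


print(sorted(lineage("taxi_zones")))
# ['manhattan_map', 'manhattan_stats', 'taxi_zones', 'taxi_zones_file']
```

In this toy graph, taxi_trips is not part of the result: it feeds manhattan_stats, but it is neither upstream nor downstream of taxi_zones.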

Selecting upstream dependencies#

Selecting all upstream dependencies#

To select an asset and all its upstream dependencies, add an asterisk (*) before the asset key in the query.

In this example, we want to select the manhattan_map asset and all its upstream dependencies:

manhattan_job = define_asset_job(name="manhattan_job", selection="*manhattan_map")

Selecting a specific number of upstream layers#

To select an asset and multiple upstream layers, add a plus sign (+) for each layer you want to select before the asset key in the query.

In this example, we want to select the manhattan_map asset and two upstream layers:

manhattan_job = define_asset_job(name="manhattan_job", selection="++manhattan_map")
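The difference between + and * before an asset key can be sketched as a breadth-first walk that stops after a fixed number of layers, or never stops when no limit is given. This is an illustration over a hypothetical dependency map, not Dagster's implementation: UPSTREAM and select_upstream are made-up names, and most of the edges are assumptions about the example project.

```python
# Hypothetical map of each asset to its direct upstream dependencies.
UPSTREAM = {
    "manhattan_map": ["manhattan_stats"],
    "manhattan_stats": ["taxi_trips", "taxi_zones"],
    "taxi_trips": ["taxi_trips_file"],
    "taxi_zones": ["taxi_zones_file"],
    "taxi_trips_file": [],
    "taxi_zones_file": [],
}


def select_upstream(asset, layers=None):
    """Select `asset` plus `layers` upstream layers (None means all layers)."""
    selected = {asset}
    frontier = {asset}
    depth = 0
    while frontier and (layers is None or depth < layers):
        # Move one layer up: the direct dependencies of the current frontier.
        frontier = {dep for node in frontier for dep in UPSTREAM.get(node, [])} - selected
        selected |= frontier
        depth += 1
    return selected


# Mirrors "++manhattan_map": the asset and two upstream layers.
print(sorted(select_upstream("manhattan_map", layers=2)))
# ['manhattan_map', 'manhattan_stats', 'taxi_trips', 'taxi_zones']

# Mirrors "*manhattan_map": the asset and every upstream layer.
print(len(select_upstream("manhattan_map")))  # 6
```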

Selecting downstream dependencies#

Selecting all downstream dependencies#

To select an asset and all its downstream dependencies, add an asterisk (*) after the asset key in the query.

In this example, we want to select the taxi_zones_file asset and all its downstream dependencies:

taxi_zones_job = define_asset_job(name="taxi_zones_job", selection="taxi_zones_file*")

Selecting a specific number of downstream layers#

To select an asset and multiple downstream layers, add a plus sign (+) for each layer you want to select after the asset key in the query.

In this example, we want to select the taxi_zones_file asset and two downstream layers:

taxi_zones_job = define_asset_job(name="taxi_zones_job", selection="taxi_zones_file++")
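Downstream layers can be pictured the same way as upstream ones, except that the dependency edges are followed in the opposite direction. The sketch below inverts a hypothetical map of direct dependencies and then walks a fixed number of layers downstream. DEPS, invert, and select_downstream are illustrative names, and most of the edges are assumptions about the example project rather than facts from this guide.

```python
from collections import defaultdict

# Hypothetical map of each asset to its direct upstream dependencies.
DEPS = {
    "taxi_zones": ["taxi_zones_file"],
    "taxi_trips": ["taxi_trips_file"],
    "manhattan_stats": ["taxi_trips", "taxi_zones"],
    "manhattan_map": ["manhattan_stats"],
}


def invert(deps):
    """Turn an asset -> parents map into a parent -> children map."""
    children = defaultdict(set)
    for asset, parents in deps.items():
        for parent in parents:
            children[parent].add(asset)
    return children


def select_downstream(asset, layers, deps=DEPS):
    """Select `asset` plus exactly `layers` layers of downstream assets."""
    children = invert(deps)
    selected = {asset}
    frontier = {asset}
    for _ in range(layers):
        # Move one layer down: the direct children of the current frontier.
        frontier = {child for node in frontier for child in children[node]} - selected
        selected |= frontier
    return selected


# Mirrors "taxi_zones_file++": the asset and two downstream layers.
print(sorted(select_downstream("taxi_zones_file", layers=2)))
# ['manhattan_stats', 'taxi_zones', 'taxi_zones_file']
```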