Using asset checks, you can define and execute different types of checks on your data assets directly in Dagster. Each asset check tests some property of a data asset, such as whether a particular column, like ID, doesn't contain null values.
Assets, their checks, and the results of those checks can be viewed in the Dagster UI, providing you with a unified view of your pipeline's health.
Using asset checks helps you:
Before continuing, you should be familiar with:
Defined in code, asset checks are used to test some property of one or more Dagster assets. Asset checks can be defined by:
Asset checks and their results are visible in the UI, allowing you to communicate useful information about data quality, data freshness, and other issues to stakeholders. Asset check results can also be used to create conditional steps in your pipelines. For example, if a quality check fails, execution can be halted to prevent issues from spreading downstream.
Using schedules and sensors, you can automate the execution of jobs that include asset checks and the assets that they target. Checks can also be executed on a one-off basis using the Dagster UI. Refer to the Executing checks section of the Defining and executing asset checks guide for more info.
Check out these guides to get started with asset checks:
From here, you can:
The following table lists Dagster's built-in utility methods for creating asset checks.
API | Description |
---|---|
build_metadata_bounds_checks | Builds asset checks that pass if a numeric metadata value falls within a specified range |
build_column_schema_change_checks | Builds asset checks that pass if an asset's columns are the same, compared with its prior materialization |
build_last_update_freshness_checks | Builds asset checks that pass if not too much time has elapsed since the latest time an asset was updated |
build_time_partition_freshness_checks | Builds asset checks that pass if an asset's most recent partition has been materialized before a deadline |
Dagster's UI is tested with a maximum of 1,000 checks per asset. It's designed with the expectation that most assets will have fewer than 50 checks. If you have a use case that doesn't fit these limits, reach out to Dagster support to discuss.
Checks are currently only supported per-asset, not per-partition. See this issue for updates.