SageMakerProcessingOperatorAsync

Astronomer Providers

SageMakerProcessingOperatorAsync is used to analyze data and evaluate machine learning models on Amazon SageMaker. With SageMakerProcessingOperatorAsync, you can use a simplified, managed experience on SageMaker to run your data processing workloads, such as feature engineering, data validation, model evaluation, and model interpretation.

Access Instructions

Install the Astronomer Providers package into your Airflow environment.

Import the operator into your DAG file and instantiate it with your desired parameters.
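The two steps above can be sketched as follows. This is a minimal illustration, not the provider's official example: it assumes the package was installed with `pip install astronomer-providers` (an extra for Amazon support may also be required), and that the operator lives at `astronomer.providers.amazon.aws.operators.sagemaker`. The DAG ID, task ID, and the placeholder config are hypothetical names chosen for this sketch. The import is guarded so the file still loads where Airflow is not installed.

```python
from datetime import datetime

try:
    from airflow import DAG
    # Assumed import path for the async operator in astronomer-providers.
    from astronomer.providers.amazon.aws.operators.sagemaker import (
        SageMakerProcessingOperatorAsync,
    )
except ImportError:
    # Airflow or the provider package is not installed in this environment.
    DAG = None
    SageMakerProcessingOperatorAsync = None

# Hypothetical placeholder. A real config must follow the request shape of
# SageMaker.Client.create_processing_job (resources, image, role, and so on).
PROCESSING_CONFIG = {"ProcessingJobName": "example-processing-job"}


def build_dag():
    """Return a DAG with one processing task, or None if Airflow is missing."""
    if DAG is None or SageMakerProcessingOperatorAsync is None:
        return None
    with DAG(
        dag_id="example_sagemaker_processing_async",  # hypothetical DAG ID
        start_date=datetime(2023, 1, 1),
    ) as dag:
        SageMakerProcessingOperatorAsync(
            task_id="process_data",  # hypothetical task ID
            config=PROCESSING_CONFIG,
            aws_conn_id="aws_default",
        )
    return dag


dag = build_dag()
```

In a real DAG file the guard is usually unnecessary; it is included here only so the sketch can be loaded outside an Airflow environment.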

Parameters

config: The configuration necessary to start a processing job (templated). For details of the configuration parameter, see SageMaker.Client.create_processing_job in the boto3 documentation.
aws_conn_id: The AWS connection ID to use.
wait_for_completion: Even if this is set to False, the async operator still defers and waits to check the status of the processing job.
print_log: Whether the operator should print the CloudWatch log during processing.
check_interval: If wait_for_completion is True, the interval in seconds at which the operator checks the status of the processing job.
max_ingestion_time: The operation fails if the processing job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
action_if_job_exists: Behaviour if the job name already exists. Possible options are "increment" (default) and "fail".
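As a rough illustration of how these parameters fit together, the sketch below builds a config dictionary with the fields that create_processing_job requires (ProcessingJobName, ProcessingResources, AppSpecification, RoleArn) and collects the operator arguments in a plain dictionary. The helper `build_processing_config`, the role ARN, the image URI, the instance settings, and the argument values are all hypothetical and chosen only for this example; they are not defaults documented on this page.

```python
def build_processing_config(
    job_name,
    role_arn,
    image_uri,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    volume_size_gb=30,
):
    """Hypothetical helper: assemble the required create_processing_job fields."""
    return {
        "ProcessingJobName": job_name,
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": instance_count,
                "InstanceType": instance_type,
                "VolumeSizeInGB": volume_size_gb,
            }
        },
        "AppSpecification": {"ImageUri": image_uri},
        "RoleArn": role_arn,
    }


# Placeholder account, role, and image values; replace with your own.
config = build_processing_config(
    job_name="example-processing-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
)

# Keyword arguments mirroring the parameters listed above (illustrative values).
operator_kwargs = {
    "task_id": "process_data",           # hypothetical task ID
    "config": config,
    "aws_conn_id": "aws_default",
    "wait_for_completion": True,
    "print_log": True,
    "check_interval": 30,                # seconds between status checks
    "max_ingestion_time": None,          # None means no timeout
    "action_if_job_exists": "increment", # or "fail"
}
```

Inside a DAG, these arguments would be passed as `SageMakerProcessingOperatorAsync(**operator_kwargs)`. Optional fields such as ProcessingInputs and ProcessingOutputConfig can be added to the config in the same way.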

Documentation

See also

For more information on how to use this operator, take a look at the guide: howto/operator:SageMakerProcessingOperator
