SageMakerTransformOperator
Amazon
Starts a transform job. A transform job uses a trained model to get inferences on a dataset and saves these results to an Amazon S3 location that you specify.
Access Instructions
Install the Amazon provider package into your Airflow environment.
Import the operator into your DAG file and instantiate it with the parameters you need.
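The two steps above can be sketched as follows. This is a minimal illustration, not a definitive setup: it assumes the provider is installed from the apache-airflow-providers-amazon package and that the operator is importable from airflow.providers.amazon.aws.operators.sagemaker (the module path may differ in older provider versions). The job name, model name, S3 URIs, instance type, interval values, and the helper make_transform_task are all hypothetical placeholders.

```python
# Install the provider first (shell): pip install apache-airflow-providers-amazon

# Placeholder transform job configuration; every name and S3 URI is hypothetical.
transform_config = {
    "TransformJobName": "example-transform-job",
    "ModelName": "example-model",
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/input/",
            }
        }
    },
    "TransformOutput": {"S3OutputPath": "s3://example-bucket/output/"},
    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
}

# Keyword arguments matching the parameters documented on this page.
operator_kwargs = {
    "task_id": "sagemaker_transform",
    "config": transform_config,
    "aws_conn_id": "aws_default",
    "wait_for_completion": True,
    "check_interval": 60,          # seconds between status checks (illustrative)
    "max_ingestion_time": 3600,    # fail after one hour (illustrative)
    "check_if_job_exists": True,
    "action_if_job_exists": "timestamp",
}


def make_transform_task(**overrides):
    """Instantiate the operator; overrides replace the defaults above.

    The import is deferred to call time so this sketch loads without Airflow
    installed. In a real DAG file it would normally sit at the top.
    """
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerTransformOperator,
    )

    return SageMakerTransformOperator(**{**operator_kwargs, **overrides})
```

Inside a DAG definition, calling make_transform_task() (optionally with overrides such as wait_for_completion=False) would return the task to wire into the rest of the pipeline.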
Parameters
config (required): The configuration necessary to start a transform job (templated).
  To create a SageMaker transform job based on an existing SageMaker model:
    config = transform_config
  To create both a SageMaker model and a SageMaker transform job:
    config = { 'Model': model_config, 'Transform': transform_config }
  For details of the transform_config parameters, see SageMaker.Client.create_transform_job(). For details of the model_config parameters, see SageMaker.Client.create_model().
aws_conn_id: The AWS connection ID to use.
wait_for_completion: Set to True to wait until the transform job finishes.
check_interval: If wait_for_completion is set to True, the interval, in seconds, at which the operator checks the status of the transform job.
max_ingestion_time: If wait_for_completion is set to True, the operator fails if the transform job doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, the operator does not time out.
check_if_job_exists: If set to True, the operator checks whether a transform job already exists for the name in the config.
action_if_job_exists: Behaviour if the job name already exists. Possible options are “timestamp” (default), “increment” (deprecated) and “fail”. This is only relevant if check_if_job_exists is True.
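The two accepted shapes of the config parameter can be illustrated with a small helper. This is a sketch under stated assumptions: build_config is a hypothetical function, not part of the provider, and the role ARN, image URI, and S3 paths are placeholders. The dictionary keys follow the boto3 create_model and create_transform_job request fields.

```python
def build_config(transform_config, model_config=None):
    """Return a value suitable for the operator's `config` parameter.

    Hypothetical helper: with no model_config the transform configuration is
    passed through unchanged (existing model); otherwise both are wrapped
    under the 'Model' and 'Transform' keys (create model, then transform).
    """
    if model_config is None:
        return transform_config
    return {"Model": model_config, "Transform": transform_config}


# Placeholder model definition; all identifiers below are illustrative only.
model_config = {
    "ModelName": "example-model",
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/example-sagemaker-role",
    "PrimaryContainer": {
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
        "ModelDataUrl": "s3://example-bucket/model/model.tar.gz",
    },
}

# Placeholder transform job definition that refers to the model by name.
transform_config = {
    "TransformJobName": "example-transform-job",
    "ModelName": model_config["ModelName"],
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/input/",
            }
        }
    },
    "TransformOutput": {"S3OutputPath": "s3://example-bucket/output/"},
    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
}

existing_model_config = build_config(transform_config)
model_and_transform_config = build_config(transform_config, model_config)
```

In the combined form, the ModelName in the transform configuration should match the ModelName in the model configuration, since the transform job refers to the model by name.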
Documentation
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker transform job