SageMakerAutoMLOperator

Provider: Amazon

Creates a SageMaker AutoML job that learns to predict the given target column from data provided through S3. The resulting model artifacts are written to the specified S3 location.


Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters, as sketched below.
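For example (a minimal sketch; the import path below assumes a recent version of the apache-airflow-providers-amazon package):

# Install the provider into your Airflow environment first:
#   pip install apache-airflow-providers-amazon

from airflow.providers.amazon.aws.operators.sagemaker import SageMakerAutoMLOperator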

Parameters

job_name (required): Name of the job to create; must be unique within the account.
s3_input (required): The S3 location (folder or file) from which to fetch the data. By default, a CSV file with headers is expected.
target_attribute (required): The name of the column containing the values to predict.
s3_output (required): The S3 folder to which the model artifacts are written. Must be 128 characters or fewer.
role_arn (required): The ARN of the IAM role to use when interacting with S3. The role must have read access to the input and write access to the output folder.
compressed_input: Set to True if the input is gzipped.
time_limit: The maximum amount of time, in seconds, to spend training the model(s).
autodeploy_endpoint_name: If specified, the best model is deployed to an endpoint with that name; otherwise no deployment is made.
extras: Use this dictionary to set any additional input parameter for job creation that is not offered through this operator's parameters. The format is described in the boto3 documentation for create_auto_ml_job: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_auto_ml_job
wait_for_completion: Whether to wait for the job to finish before returning. Defaults to True.
check_interval: Interval in seconds between two status checks while waiting for completion.
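
As an illustration, here is a minimal sketch of a DAG using the operator. The DAG id, bucket paths, target column, and role ARN are hypothetical placeholders:

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.sagemaker import SageMakerAutoMLOperator

with DAG(
    dag_id="example_sagemaker_automl",  # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
):
    create_automl_job = SageMakerAutoMLOperator(
        task_id="create_automl_job",
        job_name="my-automl-job",  # must be unique within the account
        s3_input="s3://my-bucket/input/data.csv",  # hypothetical input (CSV with headers)
        target_attribute="label",  # hypothetical name of the column to predict
        s3_output="s3://my-bucket/output/",  # hypothetical folder for model artifacts
        role_arn="arn:aws:iam::123456789012:role/MySageMakerRole",  # hypothetical role with S3 access
        time_limit=3600,  # cap training at one hour
        wait_for_completion=True,  # block until the job finishes
        check_interval=60,  # poll job status every 60 seconds
    )

Setting wait_for_completion=False instead would let the task return as soon as the job is created, at the cost of not surfacing training failures in the task itself.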


See also

For more information on how to use this operator, take a look at the guide: Launch an AutoML experiment
