MLEngineStartTrainingJobOperator

Google

Operator for launching an MLEngine training job.


Access Instructions

Install the Google provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
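A minimal sketch of how the operator might be wired into a DAG (assuming a recent Airflow 2.x environment with the Google provider installed); the project, bucket, package, and module names below are placeholders:

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.mlengine import (
    MLEngineStartTrainingJobOperator,
)

with DAG(
    dag_id="example_mlengine_training",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # All project, bucket, and package values below are placeholders.
    train_model = MLEngineStartTrainingJobOperator(
        task_id="train_model",
        project_id="my-gcp-project",
        region="us-central1",
        job_id="train_job_{{ ds_nodash }}",
        package_uris=["gs://my-bucket/trainer/trainer-0.1.tar.gz"],
        training_python_module="trainer.task",
        training_args=["--epochs=10"],
        runtime_version="2.1",
        python_version="3.7",
        scale_tier="BASIC",
        job_dir="gs://my-bucket/training-output",
    )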

Parameters

job_id (Required): A unique templated ID for the submitted Google MLEngine training job. (templated)
region (Required): The Google Compute Engine region in which to run the MLEngine training job. (templated)
package_uris: A list of Python package locations for the training job, which should include the main training program and any additional dependencies. Mutually exclusive with a custom image specified via master_config. (templated)
training_python_module: The name of the Python module to run within the training job after installing the packages. Mutually exclusive with a custom image specified via master_config. (templated)
training_args: A list of command-line arguments to pass to the training program. (templated)
scale_tier: The resource tier for the MLEngine training job. (templated)
master_type: The type of virtual machine to use for the master worker. Must be set whenever scale_tier is CUSTOM. (templated)
master_config: The configuration for the master worker. If this is provided, master_type must also be set. If a custom image is specified, this is mutually exclusive with package_uris and training_python_module. (templated)
runtime_version: The Google Cloud ML runtime version to use for training. (templated)
python_version: The version of Python used in training. (templated)
job_dir: A Google Cloud Storage path in which to store training outputs and other data needed for training. (templated)
service_account: Optional service account to use when running the training application. (templated) The specified service account must have the iam.serviceAccounts.actAs role. The Google-managed Cloud ML Engine service account must have the iam.serviceAccountAdmin role for the specified service account. If set to None or missing, the Google-managed Cloud ML Engine service account is used.
project_id (Required): The Google Cloud project name within which the MLEngine training job should run.
gcp_conn_id: The connection ID to use when fetching connection info.
delegate_to: The account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
mode: One of 'DRY_RUN' or 'CLOUD'. In 'DRY_RUN' mode, no real training job is launched; the MLEngine training job request is only printed out. In 'CLOUD' mode, a real MLEngine training job creation request is issued.
labels: A dictionary containing labels for the job.
hyperparameters: Optional HyperparameterSpec dictionary for hyperparameter tuning (see the sketch after this list). For further reference, check: https://cloud.google.com/ai-platform/training/docs/reference/rest/v1/projects.jobs#HyperparameterSpec
impersonation_chain: Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account. (templated)
cancel_on_kill: Flag indicating whether to cancel the hook's job when on_kill is called.
deferrable: Run the operator in deferrable mode.
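
For hyperparameter tuning, the hyperparameters argument accepts a dictionary following the HyperparameterSpec schema linked above. A hedged sketch, reusing the placeholder project and package names from the previous example:

from airflow.providers.google.cloud.operators.mlengine import (
    MLEngineStartTrainingJobOperator,
)

# Illustrative HyperparameterSpec dictionary; field names follow the
# REST reference linked above, and all values are placeholders.
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "accuracy",
    "maxTrials": 10,
    "maxParallelTrials": 2,
    "params": [
        {
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",
        }
    ],
}

# Assumes the same DAG context and placeholder project/bucket as the sketch above.
tune_model = MLEngineStartTrainingJobOperator(
    task_id="tune_model",
    project_id="my-gcp-project",
    region="us-central1",
    job_id="tune_job_{{ ds_nodash }}",
    package_uris=["gs://my-bucket/trainer/trainer-0.1.tar.gz"],
    training_python_module="trainer.task",
    scale_tier="BASIC",
    hyperparameters=hyperparameters,
)

Setting mode="DRY_RUN" on the same operator would only print the assembled training job request instead of issuing it.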

Documentation

Operator for launching an MLEngine training job.

See also

For more information on how to use this operator, take a look at the guide: Launching a Job