DataprocCreateClusterOperatorAsync

Astronomer Providers

Create a new cluster on Google Cloud Dataproc asynchronously.

Access Instructions

Install the Astronomer Providers package (astronomer-providers) into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
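The two steps above might look like the following minimal sketch. It assumes the package was installed with something like `pip install "astronomer-providers[google]"` and that the operator lives at `astronomer.providers.google.cloud.operators.dataproc`; check both against your installed version. The project, cluster name, region, and machine sizes are placeholders, and `make_create_cluster_task` is a hypothetical helper name used only for this illustration.

```python
# Dict form of the ClusterConfig protobuf message (sizes are illustrative).
CLUSTER_CONFIG = {
    "master_config": {
        "num_instances": 1,
        "machine_type_uri": "n1-standard-4",
        "disk_config": {"boot_disk_type": "pd-standard", "boot_disk_size_gb": 500},
    },
    "worker_config": {
        "num_instances": 2,
        "machine_type_uri": "n1-standard-4",
        "disk_config": {"boot_disk_type": "pd-standard", "boot_disk_size_gb": 500},
    },
}

# Keyword arguments for the operator; every value here is a placeholder.
CREATE_CLUSTER_KWARGS = {
    "task_id": "create_cluster",
    "project_id": "my-gcp-project",
    "cluster_name": "example-cluster",
    "region": "us-central1",
    "cluster_config": CLUSTER_CONFIG,
    "gcp_conn_id": "google_cloud_default",
    "polling_interval": 10,
}


def make_create_cluster_task(dag=None):
    """Instantiate the operator (hypothetical helper).

    The import is done lazily so this module can be loaded and inspected
    even where Airflow and the provider package are not installed.
    """
    # Assumed import path; verify against your astronomer-providers version.
    from astronomer.providers.google.cloud.operators.dataproc import (
        DataprocCreateClusterOperatorAsync,
    )

    return DataprocCreateClusterOperatorAsync(dag=dag, **CREATE_CLUSTER_KWARGS)
```

In a DAG file you would typically call `make_create_cluster_task(dag)` inside the DAG definition, or instantiate the operator directly with the same keyword arguments.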

Parameters

project_id: The ID of the Google Cloud project in which to create the cluster. (templated)
cluster_name: Name of the cluster to create.
labels: Labels that will be assigned to the created cluster.
cluster_config: Required. The cluster config to create. If a dict is provided, it must be of the same form as the protobuf message ClusterConfig.
virtual_cluster_config: Optional. The virtual cluster config, used when creating a Dataproc cluster that does not directly control the underlying compute resources, for example, when creating a Dataproc-on-GKE cluster.
region: The region in which the Dataproc cluster is created.
delete_on_error: If true, the cluster will be deleted if it is created in the ERROR state. Default value is true.
use_if_exists: If true, use the existing cluster.
request_id: Optional. A unique ID used to identify the request. If the server receives two CreateClusterRequest requests with the same ID, the second request is ignored and the first google.longrunning.Operation created and stored in the backend is returned.
retry: A retry object used to retry requests. If None is specified, requests will not be retried.
timeout: The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata: Additional metadata that is provided to the method.
gcp_conn_id: The connection ID to use when connecting to Google Cloud.
impersonation_chain: Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account. (templated)
polling_interval: Time in seconds to sleep between checks of the cluster status.
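To make the role of polling_interval concrete, here is a conceptual sketch of a status-polling loop. It is not the provider's implementation: `wait_for_cluster` and `fake_status_source` are hypothetical names, and the canned state sequence stands in for real Dataproc status calls. Only the state names CREATING, RUNNING, and ERROR correspond to actual Dataproc cluster states.

```python
import asyncio


async def wait_for_cluster(get_state, polling_interval=5.0):
    """Poll get_state() until it reports RUNNING or ERROR; return that state."""
    while True:
        state = await get_state()
        if state in ("RUNNING", "ERROR"):
            return state
        # Yield control to the event loop between checks instead of blocking.
        await asyncio.sleep(polling_interval)


def fake_status_source(states):
    """Hypothetical stand-in for a Dataproc status call, returning canned states."""
    remaining = list(states)

    async def get_state():
        # Hand out the next canned state; repeat the last one once exhausted.
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return get_state


# A short interval keeps the demonstration fast; real values are in seconds.
final_state = asyncio.run(
    wait_for_cluster(
        fake_status_source(["CREATING", "CREATING", "RUNNING"]),
        polling_interval=0.01,
    )
)
print(final_state)
```

A larger interval means fewer status requests at the cost of noticing completion later; a smaller one does the opposite.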

