Starts a Dataflow SQL query.


Last Updated: Feb. 25, 2023

Access Instructions

Install the Google provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters.


job_name: The unique name to assign to the Cloud Dataflow job.
query: The SQL query to execute.
options: Job parameters passed to the job. It can be a dictionary with the following keys. For more information, see the command reference.
location: The location of the Dataflow job (for example, europe-west1).
project_id: The ID of the GCP project that owns the job. If set to None or missing, the default project_id from the GCP connection is used.
gcp_conn_id: The connection ID to use when connecting to Google Cloud Platform.
delegate_to: The account to impersonate, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
drain_pipeline: Optional. Set to True to stop a streaming job by draining it instead of canceling it when the task instance is killed.
impersonation_chain: Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).



See also

For more information on how to use this operator, take a look at the guide: Dataflow SQL


This operator requires that the gcloud command (Google Cloud SDK) is installed on the Airflow worker.
