CloudDataTransferServiceGCSToGCSOperator

Google

Copies objects from one bucket to another using the Google Cloud Storage Transfer Service.

Last Updated: Feb. 25, 2023

Access Instructions

Install the Google provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.

Parameters

source_bucket (Required): The source Google Cloud Storage bucket where the object is. (templated)
destination_bucket (Required): The destination Google Cloud Storage bucket where the object should be. (templated)
source_path: Optional root path where the source objects are. (templated)
destination_path: Optional root path for transferred objects. (templated)
project_id: The ID of the Google Cloud Console project that owns the job.
gcp_conn_id: Optional connection ID to use when connecting to Google Cloud Storage.
delegate_to: Google account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
description: Optional transfer service job description.
schedule: Optional transfer service schedule. If not set, the transfer job runs once as soon as the operator runs. See https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs. Two additions to that format are supported: dates can be passed as datetime.date, and times can be passed as datetime.time.
object_conditions: Optional transfer service object conditions; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#ObjectConditions
transfer_options: Optional transfer service transfer options; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#TransferOptions
wait: Wait for the transfer to finish. Must be set to True if 'delete_job_after_completion' is set to True.
timeout: Time to wait for the operation to end, in seconds. Defaults to 60 seconds if not specified.
google_impersonation_chain: Optional Google service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account. (templated)
delete_job_after_completion: If True, delete the job after it completes. If set to True, 'wait' must also be set to True.
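
The schedule and wait/delete_job_after_completion parameters interact in ways that are easy to get wrong, so a small sketch may help. It builds a schedule dict with datetime.date and datetime.time values and checks the documented rule that deleting the job requires waiting for it. The key names (scheduleStartDate, scheduleEndDate, startTimeOfDay) are assumed from the REST Schedule resource linked above; the helper names build_schedule and check_wait_flags and the dates are hypothetical and not part of the provider.

```python
import datetime


def build_schedule(start_date, start_time, end_date=None):
    """Return a schedule dict with REST-style keys and Python date/time values."""
    schedule = {
        "scheduleStartDate": start_date,  # datetime.date instead of a nested dict
        "startTimeOfDay": start_time,     # datetime.time instead of a nested dict
    }
    if end_date is not None:
        schedule["scheduleEndDate"] = end_date
    return schedule


def check_wait_flags(wait, delete_job_after_completion):
    """Local check mirroring the documented rule for the two flags."""
    if delete_job_after_completion and not wait:
        raise ValueError(
            "'wait' must be True when 'delete_job_after_completion' is True"
        )


schedule = build_schedule(
    start_date=datetime.date(2023, 3, 1),
    start_time=datetime.time(hour=2, minute=30),
    end_date=datetime.date(2023, 3, 31),
)
check_wait_flags(wait=True, delete_job_after_completion=True)

# Keyword arguments that could be passed to the operator alongside the
# required task_id, source_bucket and destination_bucket.
operator_kwargs = {
    "schedule": schedule,
    "wait": True,
    "delete_job_after_completion": True,
    "timeout": 600,  # seconds; the documented default is 60
}
```

Validating the flags before the DAG is parsed keeps the mistake visible at authoring time rather than at task run time.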

Documentation

Copies objects from one bucket to another using the Google Cloud Storage Transfer Service.

Warning

This operator is NOT idempotent. Each run creates a new transfer job in Google Cloud, so running it many times creates many jobs.

See also

For more information on how to use this operator, take a look at the guide: GCSToGCSOperator

Example:

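from airflow.providers.google.cloud.operators.cloud_storage_transfer_service import (
    CloudDataTransferServiceGCSToGCSOperator,
)

# Create a one-off transfer job that copies objects between the two buckets.
gcs_to_gcs_transfer_op = CloudDataTransferServiceGCSToGCSOperator(
    task_id="gcs_to_gcs_transfer_example",
    source_bucket="my-source-bucket",
    destination_bucket="my-destination-bucket",
    project_id="my-gcp-project",
    dag=my_dag,  # my_dag is assumed to be a DAG object defined elsewhere
)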
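
The example above uses only the required arguments. As a rough sketch of how the optional object_conditions and transfer_options parameters might be combined with it, the snippet below collects the arguments in a plain dict first. The field names includePrefixes and overwriteObjectsAlreadyExistingInSink are assumed from the REST TransferSpec reference linked in the parameter list; the helper names, task_id and prefix value are hypothetical.

```python
def build_transfer_kwargs(source_bucket, destination_bucket, project_id):
    """Collect operator arguments, including optional filters and options."""
    return {
        "task_id": "gcs_to_gcs_filtered_transfer",  # hypothetical task name
        "source_bucket": source_bucket,
        "destination_bucket": destination_bucket,
        "project_id": project_id,
        # ObjectConditions: only transfer objects whose names start with this prefix.
        "object_conditions": {"includePrefixes": ["reports/2023/"]},
        # TransferOptions: leave objects that already exist in the destination untouched.
        "transfer_options": {"overwriteObjectsAlreadyExistingInSink": False},
        "wait": True,
        "timeout": 300,  # seconds to wait for the transfer operation
    }


def make_operator(kwargs):
    """Instantiate the operator from the collected arguments.

    The import is deferred so that build_transfer_kwargs can be used and
    tested in an environment without the Google provider installed.
    """
    from airflow.providers.google.cloud.operators.cloud_storage_transfer_service import (
        CloudDataTransferServiceGCSToGCSOperator,
    )

    return CloudDataTransferServiceGCSToGCSOperator(**kwargs)


kwargs = build_transfer_kwargs(
    "my-source-bucket", "my-destination-bucket", "my-gcp-project"
)
```

Inside a DAG file, make_operator(kwargs) would then create the task. Because the operator is not idempotent, each run of that task is expected to create a new transfer job.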