GoogleApiToS3Operator

Amazon

Basic class for transferring data from a Google API endpoint into an S3 bucket.

Last Updated: Apr. 10, 2023

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
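
As a minimal sketch of these two steps: the provider is typically installed with `pip install apache-airflow-providers-amazon`, and the operator is imported from `airflow.providers.amazon.aws.transfers.google_api_to_s3`. The spreadsheet ID, bucket name, and `task_id` below are hypothetical placeholders, and the Sheets endpoint is only one possible choice of API. The import is deferred into a helper function so the snippet can be loaded even where Airflow is not installed.

```python
# Hypothetical placeholder values; replace them with your own.
SPREADSHEET_ID = "your-spreadsheet-id"
S3_DESTINATION_KEY = "s3://your-bucket/google_sheet_values.json"

# Keyword arguments for the operator. Only the required parameters are set;
# the task_id is an arbitrary illustrative name.
operator_kwargs = {
    "task_id": "google_sheet_data_to_s3",
    "google_api_service_name": "sheets",
    "google_api_service_version": "v4",
    "google_api_endpoint_path": "sheets.spreadsheets.values.get",
    "google_api_endpoint_params": {
        "spreadsheetId": SPREADSHEET_ID,
        "range": "Sheet1",
    },
    "s3_destination_key": S3_DESTINATION_KEY,
}


def make_transfer_task(dag=None):
    """Instantiate the operator; Airflow is imported only when this is called."""
    from airflow.providers.amazon.aws.transfers.google_api_to_s3 import (
        GoogleApiToS3Operator,
    )

    return GoogleApiToS3Operator(dag=dag, **operator_kwargs)
```

Inside a DAG file, `make_transfer_task(dag)` would return a task ready to be wired into the DAG. Connections are not set here, so the operator falls back to its default `gcp_conn_id` and `aws_conn_id`.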

Parameters

google_api_service_name (required): The specific API service that is being requested.
google_api_service_version (required): The version of the API that is being requested.
google_api_endpoint_path (required): The client library's path to the API call's executing method, for example 'analyticsreporting.reports.batchGet'. Note: see https://developers.google.com/apis-explorer for more information on which methods are available.
google_api_endpoint_params (required): The parameters that control the corresponding endpoint result.
s3_destination_key (required): The S3 URL where the data retrieved from the endpoint is written.
google_api_response_via_xcom: Can be set to expose the Google API response to XCom.
google_api_endpoint_params_via_xcom: If set, this value is used as the key for pulling from XCom and updating the Google API endpoint parameters.
google_api_endpoint_params_via_xcom_task_ids: Task IDs to filter XCom by.
google_api_pagination: If set to True, pagination is enabled for this request so that all data is retrieved. Note: this means the response will be a list of responses.
google_api_num_retries: The number of times a Google API request is retried if it fails.
s3_overwrite: Specifies whether the S3 file is overwritten if it already exists.
gcp_conn_id: The connection ID to use when fetching connection info.
delegate_to: Google account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
aws_conn_id: The connection ID specifying the authentication information for the S3 bucket.
google_impersonation_chain: Optional Google service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).
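
To make the google_api_endpoint_path parameter more concrete, the sketch below shows one way to picture how a dotted path such as 'sheets.spreadsheets.values.get' maps onto a chain of calls on a client object. It is an illustration under stated assumptions, not the hook's actual code: the function `resolve_endpoint` and the `Fake*` classes are hypothetical stand-ins, and the sketch assumes that the first path segment names the service and is skipped, that intermediate segments are zero-argument resource accessors, and that the last segment receives the endpoint parameters.

```python
from typing import Any


def resolve_endpoint(service: Any, endpoint_path: str, endpoint_params: dict) -> Any:
    """Walk a dotted endpoint path on a client object and call the final method.

    Assumptions (illustrative only): the first segment is the service name and
    is skipped, each intermediate segment is called without arguments, and the
    last segment is called with the endpoint params as keyword arguments.
    """
    *resources, method = endpoint_path.split(".")[1:]
    target = service
    for name in resources:
        target = getattr(target, name)()
    return getattr(target, method)(**endpoint_params)


# Stand-in objects so the sketch runs without any Google client library.
class FakeValues:
    def get(self, spreadsheetId: str, range: str) -> dict:
        # A real client would return a request object; this echoes its inputs.
        return {"spreadsheetId": spreadsheetId, "range": range}


class FakeSpreadsheets:
    def values(self) -> FakeValues:
        return FakeValues()


class FakeSheetsService:
    def spreadsheets(self) -> FakeSpreadsheets:
        return FakeSpreadsheets()


request = resolve_endpoint(
    FakeSheetsService(),
    "sheets.spreadsheets.values.get",
    {"spreadsheetId": "your-spreadsheet-id", "range": "Sheet1"},
)
```

With the stand-in classes, the path resolves to `FakeSheetsService().spreadsheets().values().get(...)`, which mirrors the shape of a call made through the Google API Python Client.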

Documentation

Basic class for transferring data from a Google API endpoint into an S3 bucket.

This discovery-based operator uses GoogleDiscoveryApiHook to communicate with Google services via the Google API Python Client. Note that this library is in maintenance mode, so it will not fully support Google Cloud in the future. For working with Google Cloud Platform, the custom Google Cloud service operators are recommended instead.

See also

For more information on how to use this operator, take a look at the guide: Google Sheets to Amazon S3 transfer operator
