GCSTimeSpanFileTransformOperator
Google
Determines a list of objects that were added or modified at a GCS source location during a specific time-span, copies them to a temporary location on the local file system, runs a transform on this file as specified by the transformation script and uploads the output to the destination bucket.
Access Instructions
Install the Google provider package into your Airflow environment.
Import the operator into your DAG file and instantiate it with the parameters you need.
Parameters
Documentation
Determines a list of objects that were added or modified at a GCS source location during a specific time-span, copies them to a temporary location on the local file system, runs a transform on this file as specified by the transformation script and uploads the output to the destination bucket.
See also
For more information on how to use this operator, take a look at the guide: GCSTimeSpanFileTransformOperator
The locations of the source and destination files in the local filesystem are passed to the transformation script as its first and second arguments. The start and end of the time-span are passed as the third and fourth arguments, each as a UTC ISO 8601 string.
The transformation script is expected to read the data from the source, transform it, and write the output to the local destination file.
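The four-argument contract described above can be illustrated with a minimal transformation script. This is a sketch under stated assumptions, not the provider's own code: the script name `transform.py`, the helper names `parse_utc`, `transform_file`, and `run`, and the upper-casing transform are all hypothetical. Because the text speaks of "locations", the sketch accepts either a single file or a directory of files as the source; which form the operator actually passes may depend on the provider version.

```python
import sys
from datetime import datetime
from pathlib import Path


def parse_utc(value):
    """Parse an ISO 8601 string; a trailing 'Z' is treated as '+00:00'."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def transform_file(src, dst):
    """Hypothetical transform: write an upper-cased copy of src to dst."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(src.read_text(encoding="utf-8").upper(), encoding="utf-8")


def run(source, destination, start_iso, end_iso):
    """Apply the transform and return the number of files written."""
    start, end = parse_utc(start_iso), parse_utc(end_iso)
    if end < start:
        raise ValueError("time-span end precedes its start")
    source, destination = Path(source), Path(destination)
    if source.is_dir():
        # Assumption: a directory source is mirrored into the destination.
        files = sorted(p for p in source.rglob("*") if p.is_file())
        for src in files:
            transform_file(src, destination / src.relative_to(source))
        return len(files)
    transform_file(source, destination)
    return 1


if __name__ == "__main__":
    # argv[1]: source, argv[2]: destination, argv[3]/argv[4]: time-span bounds
    if len(sys.argv) != 5:
        sys.exit("usage: transform.py SOURCE DESTINATION START_ISO END_ISO")
    run(*sys.argv[1:5])
```

The time-span bounds are parsed even though this toy transform does not use them for filtering; a real script might use them to tag output records or to select rows within the window.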