LocalFilesystemToADLSOperator

Microsoft Azure

Upload file(s) to Azure Data Lake


Last Updated: Oct. 23, 2022

Access Instructions

Install the Microsoft Azure provider package into your Airflow environment.

Import the operator into your DAG file and instantiate it with your desired parameters, as sketched below.
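A minimal sketch of both steps. The package name and import path follow the standard Apache Airflow Microsoft Azure provider layout:

# Install the provider first, e.g.:
#   pip install apache-airflow-providers-microsoft-azure
from airflow.providers.microsoft.azure.transfers.local_to_adls import (
    LocalFilesystemToADLSOperator,
)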

Parameters

local_path (Required): Local path. Can be a single file, a directory (in which case it is uploaded recursively), or a glob pattern. Recursive glob patterns using ** are not supported.
remote_path (Required): Remote path to upload to; if uploading multiple files, this is the directory root to write within.
nthreads: Number of threads to use. If None, uses the number of cores.
overwrite: Whether to forcibly overwrite existing files/directories. If False and the remote path is a directory, the upload quits regardless of whether any files would be overwritten. If True, only matching filenames are actually overwritten.
buffersize: int [2**22]. Number of bytes for the internal buffer. The buffer cannot be bigger than a chunk and cannot be smaller than a block.
blocksize: int [2**22]. Number of bytes for a block. Within each chunk, a smaller block is written for each API call. This block cannot be bigger than a chunk.
extra_upload_options: Extra upload options to add to the hook's upload method.
azure_data_lake_conn_id: Reference to the Azure Data Lake connection.
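Here is a minimal sketch of the operator in a DAG. The DAG ID, task ID, file paths, and schedule are illustrative assumptions; azure_data_lake_default is the operator's default connection ID:

from datetime import datetime

from airflow import DAG
from airflow.providers.microsoft.azure.transfers.local_to_adls import (
    LocalFilesystemToADLSOperator,
)

with DAG(
    dag_id="upload_to_adls_example",  # hypothetical DAG id
    start_date=datetime(2022, 10, 1),
    schedule=None,  # Airflow 2.4+ syntax; older versions use schedule_interval
    catchup=False,
) as dag:
    upload_file = LocalFilesystemToADLSOperator(
        task_id="upload_file",
        local_path="data/report.csv",  # assumed local file; a directory or glob also works
        remote_path="landing/report.csv",  # destination path in the data lake
        overwrite=True,  # overwrite matching filenames if they already exist
        azure_data_lake_conn_id="azure_data_lake_default",
    )

When local_path is a directory or glob matching multiple files, remote_path acts as the directory root the files are written under, per the parameter descriptions above.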

Documentation

Upload file(s) to Azure Data Lake

See also

For more information on how to use this operator, take a look at the guide: LocalFilesystemToADLSOperator
