DynamoDBToS3Operator

Amazon

Replicates records from a DynamoDB table to S3. It scans the DynamoDB table, writes the received records to a file on the local filesystem, and flushes that file to S3 once its size exceeds the user-specified limit.

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
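As a quick orientation, the sketch below shows one way the operator might be wired into a DAG. It assumes Airflow 2.4+ with the Amazon provider installed; the DAG id, table name, bucket name, key prefix, and file size are illustrative placeholders, not values from this page.

```python
# A minimal sketch, assuming Airflow 2.4+ with the Amazon provider installed.
# The table name, bucket name, key prefix, and file size are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.dynamodb_to_s3 import DynamoDBToS3Operator

with DAG(
    dag_id="dynamodb_to_s3_example",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    backup_table = DynamoDBToS3Operator(
        task_id="backup_dynamodb_to_s3",
        dynamodb_table_name="my_dynamodb_table",    # placeholder table name
        s3_bucket_name="my-backup-bucket",          # placeholder bucket name
        file_size=20 * 1024 * 1024,                 # flush to S3 roughly every 20 MB
        s3_key_prefix="backups/my_dynamodb_table/",
    )
```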

Parameters

dynamodb_table_name (Required): DynamoDB table to replicate data from.
s3_bucket_name (Required): S3 bucket to replicate data to.
file_size (Required): Flush the file to S3 once its size is >= file_size.
dynamodb_scan_kwargs: Keyword arguments passed to the underlying DynamoDB scan call.
s3_key_prefix: Prefix of the S3 object key.
process_func: How a DynamoDB item is transformed to bytes. By default, the item is dumped as JSON.
aws_conn_id: The Airflow connection used for AWS credentials. If this is None or empty, the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, the default boto3 configuration is used (and must be maintained on each worker node).
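Because process_func controls how each item becomes bytes, a custom serializer can be supplied. The sketch below is illustrative rather than the operator's built-in behaviour: it writes each item as one JSON line and assumes numeric attributes arrive as decimal.Decimal, as boto3's DynamoDB resource returns them; the function name is hypothetical.

```python
# Hypothetical process_func: serialize each DynamoDB item as one JSON line (bytes).
import json
from decimal import Decimal
from typing import Any, Dict


def to_json_line(item: Dict[str, Any]) -> bytes:
    """Convert a single DynamoDB item to a newline-terminated JSON byte string."""

    def _default(value: Any) -> Any:
        # boto3's DynamoDB resource returns numbers as Decimal; make them JSON-friendly.
        if isinstance(value, Decimal):
            return float(value)
        return str(value)

    return (json.dumps(item, default=_default) + "\n").encode("utf-8")


# Then pass it to the operator: DynamoDBToS3Operator(..., process_func=to_json_line)
```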

Documentation

Users can also specify a filtering criteria using dynamodb_scan_kwargs to only replicate records that satisfy the criteria.
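For example, a scan filter can be built with boto3's condition helpers and forwarded through dynamodb_scan_kwargs. The attribute name and value below are purely illustrative, and the task would live inside a DAG as in the earlier sketch.

```python
# Illustrative only: replicate just the items whose "status" attribute equals "active".
from boto3.dynamodb.conditions import Attr

from airflow.providers.amazon.aws.transfers.dynamodb_to_s3 import DynamoDBToS3Operator

replicate_active_items = DynamoDBToS3Operator(
    task_id="replicate_active_items",
    dynamodb_table_name="my_dynamodb_table",    # placeholder
    s3_bucket_name="my-backup-bucket",          # placeholder
    file_size=20 * 1024 * 1024,
    dynamodb_scan_kwargs={
        # Forwarded to the underlying DynamoDB scan; any valid scan kwargs can go here.
        "FilterExpression": Attr("status").eq("active"),
    },
)
```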

See also

For more information on how to use this operator, take a look at the guide: Amazon DynamoDB To Amazon S3 transfer operator
