HiveToDynamoDBOperator

Amazon

Moves data from Apache Hive to Amazon DynamoDB. Note that, for now, the data is loaded into memory before being pushed to DynamoDB, so this operator should only be used for small amounts of data.


Last Updated: Oct. 23, 2022

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
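Below is a minimal sketch of instantiating the operator inside a DAG, assuming Airflow 2.4+ with the Amazon provider installed; the SQL query, table name, keys, and connection IDs are placeholders, not values from this page.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.hive_to_dynamodb import HiveToDynamoDBOperator

with DAG(
    dag_id="hive_to_dynamodb_example",
    start_date=datetime(2022, 10, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Copy the result of a Hive query into a DynamoDB table.
    # The table name, keys, and connection IDs below are illustrative.
    hive_to_dynamodb = HiveToDynamoDBOperator(
        task_id="hive_to_dynamodb",
        sql="SELECT feature_id, feature_value FROM hive_features",
        table_name="feature_store",
        table_keys=["feature_id"],
        region_name="us-east-1",
        hiveserver2_conn_id="hiveserver2_default",
        aws_conn_id="aws_default",
    )
```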

Parameters

sql (Required): SQL query to execute against the Hive database. (templated)
table_name (Required): target DynamoDB table
table_keys (Required): partition key and sort key
pre_process: implement pre-processing of source data
pre_process_args: list of pre_process function arguments
pre_process_kwargs: dict of pre_process function arguments
region_name: AWS region name (example: us-east-1)
schema: Hive database schema
hiveserver2_conn_id: reference to the Hive Server2 thrift service connection ID
aws_conn_id: AWS connection ID
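If the source rows need reshaping before the write, pre_process accepts a callable. The sketch below is an assumption-laden illustration: it presumes the operator fetches the Hive result as a pandas DataFrame and invokes the callable as pre_process(data=..., args=pre_process_args, kwargs=pre_process_kwargs), writing whatever list of items it returns to DynamoDB; the column mapping is purely hypothetical.

```python
import json

def rename_columns(data, args=None, kwargs=None):
    """Illustrative pre_process callable: rename columns before the batch write."""
    # `data` is assumed to be the pandas DataFrame produced by the Hive query.
    mapping = (kwargs or {}).get("mapping", {})
    renamed = data.rename(columns=mapping)
    # DynamoDB items are plain dicts; a JSON round-trip drops pandas-specific types.
    return json.loads(renamed.to_json(orient="records"))
```

It could then be wired into the operator above with pre_process=rename_columns and pre_process_kwargs={"mapping": {"feature_value": "value"}}.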

Documentation

Moves data from Apache Hive to Amazon DynamoDB. Note that, for now, the data is loaded into memory before being pushed to DynamoDB, so this operator should only be used for small amounts of data.

See also

For more information on how to use this operator, take a look at the guide: Apache Hive to Amazon DynamoDB transfer operator
