HiveToMySqlOperator

Apache Hive

Moves data from Hive to MySQL. For now, the data is loaded into memory before being pushed to MySQL, so this operator should only be used for small amounts of data.

Last Updated: Jan. 11, 2023

Access Instructions

Install the Apache Hive provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
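
As a rough illustration of the two steps above, the sketch below assumes the provider was installed from the `apache-airflow-providers-apache-hive` package and that a MySQL provider is available for the target connection. The DAG ID, task ID, query, and table names are hypothetical, and the connection IDs are assumed to be the conventional default names. Airflow is imported inside the function only so that the parameter dictionary can be inspected without an Airflow installation.

```python
from datetime import datetime

# Keyword arguments for the operator. The task_id, query, and table
# names are illustrative placeholders, not values from this page.
HIVE_TO_MYSQL_KWARGS = {
    "task_id": "hive_to_mysql_daily_totals",
    # Templated: {{ ds }} is rendered by Airflow at run time.
    "sql": "SELECT ds, total FROM daily_totals WHERE ds = '{{ ds }}'",
    # Dot notation selects the target database and table.
    "mysql_table": "reporting.daily_totals",
    # Assumed connection IDs; replace with the ones defined in your environment.
    "hiveserver2_conn_id": "hiveserver2_default",
    "mysql_conn_id": "mysql_default",
}


def build_dag():
    """Create a DAG containing a single HiveToMySqlOperator task."""
    # Imported lazily so this module loads even where Airflow is absent.
    from airflow import DAG
    from airflow.providers.apache.hive.transfers.hive_to_mysql import (
        HiveToMySqlOperator,
    )

    with DAG(
        dag_id="example_hive_to_mysql",
        start_date=datetime(2023, 1, 1),
        catchup=False,
    ) as dag:
        HiveToMySqlOperator(**HIVE_TO_MYSQL_KWARGS)
    return dag
```

In an actual DAG file, the imports would normally sit at the top of the module and the result of `build_dag()` would be assigned to a module-level variable so the scheduler can discover it.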

Parameters

sql (required): SQL query to execute against the Hive server. (templated)
mysql_table (required): Target MySQL table; use dot notation to target a specific database. (templated)
mysql_conn_id: ID of the MySQL connection that the data is loaded into.
hiveserver2_conn_id: Reference to the HiveServer2 Thrift service connection ID.
mysql_preoperator: SQL statement to run against MySQL before the import, typically used to truncate or delete the rows that the incoming data replaces, which makes the task idempotent (running the task twice won't double-load data). (templated)
mysql_postoperator: SQL statement to run against MySQL after the import, typically used to move data from staging to production and issue cleanup commands. (templated)
bulk_load: Flag to use the bulk_load option. This loads MySQL directly from a tab-delimited text file using the LOAD DATA LOCAL INFILE command. The MySQL server must support loading local files via this command (it is disabled by default).
hive_conf
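
The staging pattern described for mysql_preoperator and mysql_postoperator could be set up along the following lines. This is only a sketch: the helper name `idempotent_load_kwargs` and the table names are hypothetical, and the `REPLACE INTO ... SELECT` statement assumes the final table has a primary or unique key so that reruns overwrite rows instead of duplicating them.

```python
def idempotent_load_kwargs(staging_table, final_table, bulk_load=False):
    """Build HiveToMySqlOperator keyword arguments for a staging-table load.

    Hypothetical helper: it only assembles parameters documented above.
    """
    return {
        # Hive results land in the staging table first.
        "mysql_table": staging_table,
        # Runs before the import: empty staging so a rerun starts clean.
        "mysql_preoperator": f"TRUNCATE TABLE {staging_table}",
        # Runs after the import: move staged rows into the final table.
        "mysql_postoperator": (
            f"REPLACE INTO {final_table} SELECT * FROM {staging_table}"
        ),
        # True switches to LOAD DATA LOCAL INFILE, which the MySQL
        # server must allow (it is disabled by default).
        "bulk_load": bulk_load,
    }
```

The returned dictionary can be unpacked into the operator together with `task_id` and `sql`, for example `HiveToMySqlOperator(task_id=..., sql=..., **idempotent_load_kwargs("staging.daily_totals", "reporting.daily_totals"))`.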

