HiveCliHook
Apache Hive
Simple wrapper around the hive CLI.
Access Instructions
Install the Apache Hive provider package into your Airflow environment.
Import the module into your DAG file and instantiate it with your desired params.
Parameters
hive_cli_conn_id: Reference to the Hive CLI connection id.
mapred_queue: Queue used by the Hadoop Scheduler (Capacity or Fair).
mapred_queue_priority: Priority within the job queue. Possible settings include: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
mapred_job_name: This name will appear in the jobtracker, which can make monitoring easier.
hive_cli_params: Space-separated list of hive command parameters to add to the hive command.
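A minimal sketch of how these parameters might be passed when instantiating the hook. It assumes the hook is importable from airflow.providers.apache.hive.hooks.hive and that its constructor accepts the parameters listed above as keyword arguments. The helper functions, the queue name "etl", the job name, and the connection id "hive_cli_default" are illustrative assumptions, not taken from this page.

```python
# Priorities listed in the parameter description above.
VALID_PRIORITIES = ("VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW")


def build_hook_kwargs(queue, priority, job_name, cli_params=""):
    """Hypothetical helper: check the priority and collect hook keyword arguments."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"Invalid mapred_queue_priority: {priority!r}")
    return {
        "hive_cli_conn_id": "hive_cli_default",  # assumed connection id
        "mapred_queue": queue,
        "mapred_queue_priority": priority,
        "mapred_job_name": job_name,
        "hive_cli_params": cli_params,
    }


def make_hook(**kwargs):
    """Hypothetical helper: instantiate the hook.

    The import is deferred so that Airflow and the Apache Hive provider
    are only required when this function is actually called.
    """
    from airflow.providers.apache.hive.hooks.hive import HiveCliHook

    return HiveCliHook(**kwargs)


kwargs = build_hook_kwargs("etl", "HIGH", "daily_sales_load")
# hook = make_hook(**kwargs)  # requires the Apache Hive provider package
```

Checking the priority before building the hook is a design choice of this sketch: it surfaces a typo in the DAG file early instead of at job submission time.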
Documentation
Simple wrapper around the hive CLI.
It also supports beeline, a lighter CLI that uses JDBC and is replacing the heavier traditional CLI. To enable beeline, set the use_beeline param in the extra field of your connection, as in { "use_beeline": true }.
Note that you can also set default hive CLI parameters by passing hive_cli_params, a space-separated list of parameters to add to the hive command.
The extra connection parameter auth is passed as is in the JDBC connection string.
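As a small illustration of the connection extra field described above, the following sketch builds the JSON by hand. The auth value "noSasl" is only an assumed example and is not taken from this page; use whatever value your Hive deployment expects.

```python
import json

# Extra field for the Hive CLI connection: enable beeline and set an auth value.
# "noSasl" is an illustrative assumption, not a value documented on this page.
extra = {"use_beeline": True, "auth": "noSasl"}

# Serialize to the JSON string that goes into the connection's extra field.
extra_json = json.dumps(extra)
print(extra_json)
```

The resulting string can be pasted into the Extra field of the connection in the Airflow UI.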