Apache Hive

Simple wrapper around the hive CLI.

Last Updated: Mar. 21, 2023

Access Instructions

Install the Apache Hive provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.


hive_cli_conn_id: Reference to the Hive CLI connection id.
mapred_queue: Queue used by the Hadoop Scheduler (Capacity or Fair).
mapred_queue_priority: Priority within the job queue. Possible settings include: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
mapred_job_name: This name will appear in the jobtracker. This can make monitoring easier.
hive_cli_params: Space separated list of hive command parameters to add to the hive command.
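
The parameters above can be collected and passed to the hook as keyword arguments. The following is a minimal sketch, assuming the module is HiveCliHook from airflow.providers.apache.hive.hooks.hive and that it accepts these names as constructor arguments. The helpers build_hook_kwargs and make_hook, the connection id, the queue name, the job name, and the hiveconf setting are illustrative assumptions, not part of the provider.

```python
# Priority values as listed in the parameter description above.
QUEUE_PRIORITIES = ("VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW")


def build_hook_kwargs(conn_id, queue=None, priority=None, job_name=None, cli_params=""):
    """Hypothetical helper: collect and sanity-check the hook parameters."""
    if priority is not None:
        priority = priority.upper()
        if priority not in QUEUE_PRIORITIES:
            raise ValueError(
                f"Invalid priority {priority!r}; expected one of {', '.join(QUEUE_PRIORITIES)}"
            )
    return {
        "hive_cli_conn_id": conn_id,
        "mapred_queue": queue,
        "mapred_queue_priority": priority,
        "mapred_job_name": job_name,
        "hive_cli_params": cli_params,
    }


def make_hook(**kwargs):
    """Hypothetical helper: instantiate the hook (assumed class and import path)."""
    # Imported lazily so build_hook_kwargs can be used without Airflow installed.
    from airflow.providers.apache.hive.hooks.hive import HiveCliHook

    return HiveCliHook(**kwargs)


# All values below are assumptions chosen for illustration.
hook_kwargs = build_hook_kwargs(
    "hive_cli_default",
    queue="default",
    priority="high",
    job_name="daily_load",
    cli_params="--hiveconf hive.exec.dynamic.partition=true",
)
```

Inside a DAG file with the provider installed, make_hook(**hook_kwargs) would then create the hook. Checking the priority up front is only a convenience in this sketch, so that a typo surfaces when the DAG file is written rather than when the task runs.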


It also supports beeline, a lighter CLI that runs over JDBC and is replacing the heavier traditional CLI. To enable beeline, set the use_beeline param in the extra field of your connection, as in { "use_beeline": true }.

Note that you can also set default hive CLI parameters by passing hive_cli_params, a space-separated list of parameters to add to the hive command.

The extra connection parameter auth is passed as-is in the JDBC connection string.
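
The extra field of the connection is a JSON document, so the use_beeline and auth settings described above can be generated rather than typed by hand. The following is a small sketch using only the standard library. The helper name build_connection_extra and the auth value "NOSASL" are assumptions for illustration; use whatever auth value your HiveServer2 setup expects.

```python
import json


def build_connection_extra(use_beeline=True, auth=None):
    """Hypothetical helper: build the JSON string for the connection's extra field."""
    extra = {"use_beeline": use_beeline}
    if auth is not None:
        # Stored unchanged, since auth is passed to the JDBC connection string as-is.
        extra["auth"] = auth
    return json.dumps(extra)


# "NOSASL" is an illustrative value, not a recommendation.
extra_json = build_connection_extra(use_beeline=True, auth="NOSASL")
```

The resulting string can be pasted into the Extra field of the Hive CLI connection in the Airflow UI, or supplied through whichever mechanism you use to define connections.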

