HiveCliHook

Apache Hive

Simple wrapper around the hive CLI.


Access Instructions

Install the Apache Hive provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters.
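For example, a minimal sketch, assuming a Hive CLI connection with the default id hive_cli_default is already configured in your Airflow environment:

```python
# Install the provider first, for example:
#   pip install apache-airflow-providers-apache-hive

from airflow.providers.apache.hive.hooks.hive import HiveCliHook

# Instantiate the hook against an existing Hive CLI connection.
hook = HiveCliHook(hive_cli_conn_id="hive_cli_default")

# Run an HQL statement through the hive CLI and capture its stdout.
output = hook.run_cli(hql="SHOW DATABASES;")
print(output)
```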

Parameters

hive_cli_conn_id: Reference to the Hive CLI connection id.
mapred_queue: Queue used by the Hadoop scheduler (Capacity or Fair).
mapred_queue_priority: Priority within the job queue. Possible settings include: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.
mapred_job_name: This name will appear in the jobtracker. This can make monitoring easier.
hive_cli_params: Space-separated list of hive command parameters to add to the hive command.
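As a hedged sketch, these parameters map onto the hook's constructor roughly as follows (the connection id, queue, job name, and extra CLI flag are placeholders):

```python
from airflow.providers.apache.hive.hooks.hive import HiveCliHook

hook = HiveCliHook(
    hive_cli_conn_id="my_hive_cli",          # placeholder connection id
    mapred_queue="default",                  # Hadoop scheduler queue (placeholder)
    mapred_queue_priority="HIGH",            # VERY_HIGH, HIGH, NORMAL, LOW, or VERY_LOW
    mapred_job_name="airflow_hive_example",  # appears in the jobtracker
    hive_cli_params="-v",                    # example: run the hive CLI in verbose mode
)
```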

Documentation

Simple wrapper around the hive CLI.

It also supports beeline, a lighter CLI that uses JDBC and is replacing the heavier traditional CLI. To enable beeline, set the use_beeline parameter in the extra field of your connection, as in { "use_beeline": true }.
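A sketch of creating such a connection programmatically, assuming placeholder host, port, and schema values:

```python
import json

from airflow.models.connection import Connection

conn = Connection(
    conn_id="my_hive_cli",       # placeholder connection id
    conn_type="hive_cli",
    host="hive.example.com",     # placeholder host
    port=10000,                  # placeholder port
    schema="default",
    extra=json.dumps({"use_beeline": True}),  # use beeline instead of the hive CLI
)
```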

Note that you can also set default hive CLI parameters by passing hive_cli_params, a space-separated list of parameters to add to the hive command.

The extra connection parameter auth is passed into the JDBC connection string as-is.
