SqoopHook

Apache Sqoop

This hook wraps the Sqoop 1 binary. To use it, the “sqoop” executable must be on the PATH.

Last Updated: Feb. 16, 2023

Access Instructions

Install the Apache Sqoop provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
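The two steps above can be sketched as follows. This is a minimal illustration, not a definitive setup: it assumes the provider is installed from the apache-airflow-providers-apache-sqoop package (for example with pip) and that a Sqoop connection already exists in Airflow. The connection id, property name, and JAR paths are hypothetical placeholders.

```python
# Assumed install step (run in your Airflow environment):
#   pip install apache-airflow-providers-apache-sqoop

# Keyword arguments matching the parameters documented on this page.
# All values are hypothetical placeholders.
HOOK_KWARGS = {
    "conn_id": "sqoop_default",                          # assumed connection id
    "verbose": True,                                     # set sqoop to verbose
    "num_mappers": 4,                                    # map tasks to run in parallel
    "properties": {"mapred.job.name": "example_import"}, # passed via -D
    "libjars": "/opt/jars/a.jar,/opt/jars/b.jar",        # comma-separated JAR files
}


def make_hook():
    """Build a SqoopHook from HOOK_KWARGS.

    The import is done lazily so that this file can still be loaded
    on a machine where Airflow is not installed.
    """
    from airflow.providers.apache.sqoop.hooks.sqoop import SqoopHook

    return SqoopHook(**HOOK_KWARGS)


# Inside a task you would then call:
#   hook = make_hook()
```

Creating the hook inside the task, rather than at module import time, keeps DAG parsing cheap and avoids touching the connection until the task actually runs.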

Parameters

  • conn_id: Reference to the sqoop connection.
  • verbose: Set sqoop to verbose.
  • num_mappers: Number of map tasks to import in parallel.
  • properties: Properties to set via the -D argument.
  • libjars: Optional comma-separated JAR files to include in the classpath.

Documentation

Additional arguments that can be passed via the ‘extra’ JSON field of the sqoop connection:

  • job_tracker: Job tracker address, given as local or jobtracker:port.

  • namenode: Namenode.

  • files: Comma-separated files to be copied to the MapReduce cluster.

  • password_file: Path to file containing the password.
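As an illustration of the list above, the snippet below builds a value that could be pasted into the ‘extra’ field of the sqoop connection. It is only a sketch: the four keys come from this page, while every host name and path is a hypothetical placeholder.

```python
import json

# Keys are the ones documented above; all values are made-up examples.
extra = {
    "job_tracker": "local",                           # or "jobtracker:port"
    "namenode": "hdfs://namenode.example.com:8020",   # hypothetical namenode
    "files": "/tmp/lookup_a.txt,/tmp/lookup_b.txt",   # comma-separated files
    "password_file": "/user/airflow/sqoop.password",  # hypothetical path
}

# Serialize to the JSON text stored in the connection's 'extra' field.
extra_json = json.dumps(extra, indent=2)
print(extra_json)
```

Using password_file keeps the database password out of the connection record and out of the command line that the hook runs.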
