DataprocSubmitSparkJobOperator
Provider: Google
Start a Spark job on a Cloud Dataproc cluster.
Access Instructions
Install the Google provider package into your Airflow environment.
Import the operator into your DAG file and instantiate it with your desired parameters.
Parameters
main_jar: The HCFS URI of the JAR file that contains the main class (use either this or main_class, not both).
main_class: The name of the job's main class (use either this or main_jar, not both).
arguments: Arguments for the job. (templated)
archives: List of archived files that will be unpacked in the working directory. Should be stored in Cloud Storage.
files: List of files to be copied to the working directory.
Documentation
Start a Spark job on a Cloud Dataproc cluster.
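The access instructions above might look like the following in a DAG file. This is a minimal sketch, not a definitive example: the project region, cluster name, main class, and jar path are placeholder assumptions, and it presumes the apache-airflow-providers-google package is installed.

```python
# Minimal sketch of submitting a Spark job with DataprocSubmitSparkJobOperator.
# Region, cluster name, main class, and jar path are placeholders — substitute
# values for your own Dataproc cluster and application.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocSubmitSparkJobOperator,
)

with DAG(
    dag_id="example_dataproc_spark",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    submit_spark = DataprocSubmitSparkJobOperator(
        task_id="submit_spark_job",
        region="us-central1",            # placeholder region
        cluster_name="example-cluster",  # placeholder cluster name
        # Use main_class OR main_jar, not both:
        main_class="org.apache.spark.examples.SparkPi",
        dataproc_jars=[
            "file:///usr/lib/spark/examples/jars/spark-examples.jar"
        ],
        arguments=["1000"],              # job arguments (templated)
    )
```

Note that main_class and main_jar are mutually exclusive, matching the parameter descriptions above: supply the HCFS URI of the jar via main_jar, or name the class via main_class and make the jar available through the job's jar list.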