BeamRunJavaPipelineOperator

Apache Beam

Launching Apache Beam pipelines written in Java.

View on GitHub

Last Updated: Jan. 23, 2023

Access Instructions

Install the Apache Beam provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.

Parameters

jarRequiredThe reference to a self executing Apache Beam jar (templated).
job_classThe name of the Apache Beam pipeline class to be executed, it is often not the main class configured in the pipeline jar file.

Documentation

Launching Apache Beam pipelines written in Java.

Note that both default_pipeline_options and pipeline_options will be merged to specify pipeline execution parameter, and default_pipeline_options is expected to save high-level pipeline_options, for instances, project and zone information, which apply to all Apache Beam operators in the DAG.

See also

For more information on how to use this operator, take a look at the guide: Run Java Pipelines in Apache Beam

See also

For more detail on Apache Beam have a look at the reference: https://beam.apache.org/documentation/

You need to pass the path to your jar file as a file reference with the jar parameter, the jar needs to be a self executing jar (see documentation here: https://beam.apache.org/documentation/runners/dataflow/#self-executing-jar). Use pipeline_options to pass on pipeline_options to your job.

Was this page helpful?