Orchestrate Databricks Jobs with Airflow
Example DAG from the Astronomer Databricks tutorial.
airflow-databricks-tutorial
This repo contains an Astronomer project with example DAGs showing how to use Airflow to orchestrate Databricks jobs. A guide discussing the DAGs and concepts in depth can be found in the Astronomer Databricks tutorial.
Tutorial Overview
This tutorial has one DAG showing how to use the following Databricks operators (a minimal sketch follows the list):
- DatabricksRunNowOperator
- DatabricksSubmitRunOperator
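As a rough illustration of how these two operators differ, here is a minimal DAG sketch, assuming Airflow 2.4+ with the Databricks provider installed. The connection ID, job ID, cluster spec, and notebook path are placeholders, not values from this repo's DAG.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksRunNowOperator,
    DatabricksSubmitRunOperator,
)

with DAG(
    dag_id="databricks_operators_example",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Trigger a job that already exists in the Databricks workspace by its job ID.
    run_existing_job = DatabricksRunNowOperator(
        task_id="run_existing_job",
        databricks_conn_id="databricks_default",  # placeholder connection ID
        job_id=12345,  # placeholder Databricks job ID
    )

    # Submit a one-time run whose cluster and task are defined entirely in the DAG.
    submit_one_time_run = DatabricksSubmitRunOperator(
        task_id="submit_one_time_run",
        databricks_conn_id="databricks_default",
        new_cluster={
            "spark_version": "11.3.x-scala2.12",  # placeholder cluster spec
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        },
        notebook_task={"notebook_path": "/Shared/example_notebook"},  # placeholder path
    )

    run_existing_job >> submit_one_time_run
```

In short, DatabricksRunNowOperator triggers a job already defined in Databricks, while DatabricksSubmitRunOperator defines the run configuration from within the DAG itself.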
Getting Started
The easiest way to run these example DAGs is to use the Astronomer CLI to get an Airflow instance up and running locally:
- Install the Astronomer CLI
- Clone this repo somewhere locally and navigate to it in your terminal
- Initialize an Astronomer project by running `astro dev init`
- Start Airflow locally by running `astro dev start`
- Navigate to localhost:8080 in your browser and you should see the tutorial DAGs there