DatabricksHook
Interact with Databricks.
Access Instructions
Install the Databricks provider package (apache-airflow-providers-databricks) into your Airflow environment.
Import the hook into your DAG file and instantiate it with your desired parameters, as shown in the sketch below.
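A minimal sketch of importing and instantiating the hook inside a DAG file. The connection id "databricks_default" and the timeout/retry values are illustrative assumptions; point them at your own Airflow connection and tune them for your environment.

```python
# Sketch: instantiating DatabricksHook with explicit connection and retry settings.
# Values below are placeholders, not recommendations.
from airflow.providers.databricks.hooks.databricks import DatabricksHook

hook = DatabricksHook(
    databricks_conn_id="databricks_default",  # reference to an existing Databricks connection
    timeout_seconds=180,                      # how long the requests library waits before timing out
    retry_limit=3,                            # number of retries on service outages
    retry_delay=1.0,                          # seconds between retries (may be a float)
)
```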
Parameters
databricks_conn_id: Reference to the Databricks connection.
timeout_seconds: The amount of time in seconds the requests library will wait before timing out.
retry_limit: The number of times to retry the connection in case of service outages.
retry_delay: The number of seconds to wait between retries (may be a floating-point number).
retry_args: An optional dictionary with arguments passed to the tenacity.Retrying class.
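To illustrate retry_args, a hedged sketch of passing tenacity keyword arguments through the hook; the specific wait and stop settings are assumptions, not recommendations.

```python
# Sketch: customizing retry behaviour via retry_args, which the hook forwards
# to tenacity.Retrying. The wait/stop values below are illustrative only.
from tenacity import stop_after_attempt, wait_exponential

from airflow.providers.databricks.hooks.databricks import DatabricksHook

retry_args = {
    "wait": wait_exponential(multiplier=1, max=10),  # exponential backoff capped at 10 seconds
    "stop": stop_after_attempt(5),                   # give up after 5 attempts
}

hook = DatabricksHook(
    databricks_conn_id="databricks_default",
    retry_args=retry_args,
)
```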
Documentation
Interact with Databricks.
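To make the one-line description concrete, here is a sketch of calling the hook from a Python callable to submit a one-time run and poll until it finishes. The cluster spec, notebook path, and polling interval are placeholder assumptions; the payload follows the Databricks Runs Submit API for your workspace.

```python
# Sketch: using the hook directly to submit a one-time run and wait for a
# terminal state. Cluster and notebook values are placeholders.
import time

from airflow.providers.databricks.hooks.databricks import DatabricksHook


def run_notebook_on_databricks():
    hook = DatabricksHook(databricks_conn_id="databricks_default")

    # Payload for the Runs Submit API; the fields below are assumptions.
    run_id = hook.submit_run({
        "run_name": "example-run",
        "new_cluster": {
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        "notebook_task": {"notebook_path": "/Shared/example_notebook"},
    })

    # Poll the run state until it reaches a terminal state.
    while not hook.get_run_state(run_id).is_terminal:
        time.sleep(30)
```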
Example DAGs
Execute ML Pipelines in Databricks
Run an end-to-end pipeline, from BigQuery data ingestion through feature engineering and model training to publishing, all within a Databricks cluster.
AI + Machine Learning
Retrain ML Model in Databricks
Demonstrates orchestrating ML retraining pipelines executed on Databricks with Airflow.
AI + Machine Learning