Using Hightouch for Reverse ETL in Airflow

An ELT pipeline that extracts Salesforce customer data to Snowflake for transformation and enrichment, then syncs the results back to Salesforce via Hightouch.

Databases, ETL/ELT, Storage, Big Data & Analytics


Run this DAG

1. Install the Astronomer CLI (skip this step if you already have it):

2. Download the repository:

3. Navigate to where the repository was cloned and start the DAG:

Modern ELT: Salesforce to Snowflake

An ELT data pipeline that extracts Salesforce data, then loads and transforms it in Snowflake.

The DAG in this repo demonstrates a use case in which an analyst needs to blend Salesforce CRM and webpage analytics data within their Snowflake data warehouse for reporting. The data extracted from Salesforce is landed in an AWS S3 bucket for ingestion, copied directly from S3 to Snowflake, and finally transformed for analytics. In this example, S3 also serves as a data lake, so the landed Salesforce data is then persisted in a "raw data" S3 bucket as well.
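The flow described above can be sketched as a small dependency graph. This is only an illustration of the task ordering, not the DAG from the repo: the task names are hypothetical, and the sketch assumes that persisting to the "raw data" bucket depends only on the landing step. It uses Python's standard-library graphlib (Python 3.9+) rather than Airflow itself so it can run anywhere.

```python
from graphlib import TopologicalSorter

# Hypothetical task names (not taken from the repo).
# Each task maps to the set of tasks that must finish before it starts.
PIPELINE = {
    "extract_salesforce_to_s3_landing": set(),
    "copy_s3_landing_to_snowflake": {"extract_salesforce_to_s3_landing"},
    "transform_in_snowflake": {"copy_s3_landing_to_snowflake"},
    # Assumption: the raw-data copy only needs the landed file to exist.
    "persist_landing_to_s3_raw": {"extract_salesforce_to_s3_landing"},
}


def run_order(graph):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in run_order(PIPELINE):
        print(step)
```

In an Airflow DAG the same edges would be declared between operator instances, and the scheduler would be free to run the Snowflake copy and the raw-data persistence in parallel, since neither depends on the other in this sketch.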

Airflow Version

2.1.0

Providers Used

apache-airflow-providers-amazon==2.1.0
apache-airflow-providers-http==2.0.1
apache-airflow-providers-salesforce==3.1.0
apache-airflow-providers-snowflake==2.0.0
airflow-provider-hightouch==2.0.3

Connections

  • Salesforce
  • AWS S3
  • Snowflake
  • Hightouch (when using the reverse ETL DAG)