S3ToSqlOperator

Amazon

Loads data from S3 into a SQL database. You need to provide a parser function that takes a filename as input and returns an iterable of rows.


Last Updated: Jan. 23, 2023

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters.
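
For example, a minimal sketch of those two steps (the pip command assumes you install packages directly into the environment rather than through a requirements or constraints file):

# Install the Amazon provider package, e.g.:
#   pip install apache-airflow-providers-amazon

# Import the operator in your DAG file:
from airflow.providers.amazon.aws.transfers.s3_to_sql import S3ToSqlOperator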

Parameters

schema: reference to a specific schema in SQL database
table (Required): reference to a specific table in SQL database
s3_bucket (Required): reference to a specific S3 bucket
s3_key (Required): reference to a specific S3 key
sql_conn_id: reference to a specific SQL database. Must be of type DBApiHook
aws_conn_id: reference to a specific S3 / AWS connection
column_list: list of column names to use in the insert SQL.
commit_every: the maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.
parser (Required): parser function that takes a filepath as input and returns an iterable, e.g. to use a CSV parser that yields rows line-by-line, pass the following function:

def parse_csv(filepath):
    import csv

    with open(filepath, newline="") as file:
        yield from csv.reader(file)
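
Below is a minimal sketch of a DAG that wires these parameters together to load a CSV file from S3 into a SQL table. The bucket, key, schema, table, column names, and connection IDs (postgres_default, aws_default) are placeholders for illustration, and the DAG settings assume Airflow 2.4 or later:

import csv

import pendulum
from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_sql import S3ToSqlOperator


def parse_csv(filepath):
    # Yield rows from the downloaded file one at a time so large files
    # are not read into memory all at once.
    with open(filepath, newline="") as file:
        yield from csv.reader(file)


with DAG(
    dag_id="example_s3_to_sql",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
):
    S3ToSqlOperator(
        task_id="s3_to_sql",
        s3_bucket="my-bucket",            # placeholder bucket
        s3_key="data/orders.csv",         # placeholder key
        schema="public",                  # placeholder schema
        table="orders",                   # placeholder table
        column_list=["order_id", "customer_id", "amount"],
        commit_every=1000,                # insert in batches of 1000 rows
        parser=parse_csv,
        sql_conn_id="postgres_default",   # placeholder SQL connection
        aws_conn_id="aws_default",        # AWS connection
    )

The operator passes the path of the downloaded file to parser and inserts the rows it yields via sql_conn_id, committing every commit_every rows.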

Documentation

Loads data from S3 into a SQL database. You need to provide a parser function that takes a filename as input and returns an iterable of rows.

See also

For more information on how to use this operator, take a look at the guide: Amazon S3 To SQL Transfer Operator
