S3ToSqlOperator
Amazon provider
Loads data from S3 into a SQL database. You need to provide a parser function that takes a filename as input and returns an iterable of rows.
Access Instructions
Install the Amazon provider package into your Airflow environment.
Import the module into your DAG file and instantiate it with your desired params.
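For example, assuming a standard Airflow environment (the module path shown is the one used by recent releases of the Amazon provider):

    pip install apache-airflow-providers-amazon

Then, in the DAG file:

    from airflow.providers.amazon.aws.transfers.s3_to_sql import S3ToSqlOperator

A full instantiation sketch appears after the parameter list below.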
Parameters
schema: reference to a specific schema in the SQL database
table (Required): reference to a specific table in the SQL database
s3_bucket (Required): reference to a specific S3 bucket
s3_key (Required): reference to a specific S3 key
sql_conn_id: reference to a specific SQL database connection; the hook it resolves to must be of type DBApiHook
aws_conn_id: reference to a specific S3 / AWS connection
column_list: list of column names to use in the insert SQL
commit_every: the maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.
parser (Required): parser function that takes a filepath as input and returns an iterable of rows. For example, to use a CSV parser that yields rows line by line, pass the following function:

    def parse_csv(filepath):
        import csv
        with open(filepath, newline="") as file:
            yield from csv.reader(file)
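Below is a minimal sketch of how the operator might be instantiated with the parameters above, assuming Airflow 2.4+; the connection IDs, bucket, key, table, and column names are illustrative placeholders:

    import csv
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.amazon.aws.transfers.s3_to_sql import S3ToSqlOperator

    def parse_csv(filepath):
        # Yield parsed rows from the file the operator downloads from S3.
        with open(filepath, newline="") as file:
            yield from csv.reader(file)

    with DAG(
        dag_id="example_s3_to_sql",          # illustrative DAG id
        start_date=datetime(2023, 1, 1),
        schedule=None,
        catchup=False,
    ):
        S3ToSqlOperator(
            task_id="transfer_s3_to_sql",
            s3_bucket="my-example-bucket",   # placeholder bucket
            s3_key="data/input.csv",         # placeholder key
            table="target_table",            # placeholder table name
            schema="public",                 # optional schema
            column_list=["col_a", "col_b"],  # placeholder column names
            commit_every=1000,
            parser=parse_csv,
            sql_conn_id="my_sql_conn_id",    # placeholder connection IDs
            aws_conn_id="aws_default",
        )

The parser is called with the path of the temporary file downloaded from S3, so any callable that yields rows as iterables of column values can be used in place of csv.reader.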
Documentation
Loads data from S3 into a SQL database. You need to provide a parser function that takes a filename as input and returns an iterable of rows.
See also
For more information on how to use this operator, take a look at the guide: Amazon S3 To SQL Transfer Operator