FromXLSXOperator
Convert an XLSX/XLS file into a Parquet, CSV, JSON, or JSON Lines file
Access Instructions
Install the XLSX provider package into your Airflow environment.
Import the operator into your DAG file and instantiate it with the desired parameters.
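The two steps above can be sketched as follows. This is a minimal example, not the provider's official snippet: the package name (`pip install airflow-provider-xlsx`), the import path `xlsx_provider.operators.from_xlsx_operator`, the file paths, and the column name `notes` are all assumptions for illustration and should be checked against the installed package. The `schedule=None` argument assumes Airflow 2.4 or later.

```python
from datetime import datetime

# Hypothetical paths and column names, chosen only for illustration.
XLSX_TO_PARQUET_KWARGS = {
    "task_id": "xlsx_to_parquet",
    "source": "/data/input/report.xlsx",
    "target": "/data/output/report.parquet",
    "worksheet": 0,              # first worksheet (numbering is zero-based)
    "skip_rows": 1,              # skip one input row
    "drop_columns": ["notes"],   # hypothetical column to discard
    "file_format": "parquet",
}


def build_dag():
    """Create the DAG. Imports are deferred so the file parses without Airflow."""
    # Assumed import paths: verify them against the installed provider package.
    from airflow import DAG
    from xlsx_provider.operators.from_xlsx_operator import FromXLSXOperator

    with DAG(
        dag_id="xlsx_to_parquet_example",
        start_date=datetime(2024, 1, 1),
        schedule=None,           # Airflow 2.4+; older releases use schedule_interval
        catchup=False,
    ) as dag:
        FromXLSXOperator(**XLSX_TO_PARQUET_KWARGS)
    return dag
```

In a real DAG file, assign the result at module level (`dag = build_dag()`) so that Airflow can discover it.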
Parameters
source (str): Source filename (XLSX or XLS, templated)
target (str): Target filename (templated)
worksheet (str or int): Worksheet title or number (zero-based, templated)
skip_rows (int): Number of input rows to skip (default: 0, templated)
limit (int): Row limit (default: None, templated)
drop_columns (list of str): List of columns to be dropped
add_columns (list of str or dict of str key/value pairs): Columns to be added (dict or list of column=value entries)
types (str or dict of str key/value pairs): Force Parquet column types (dict or list of column=type entries, e.g. 'str', 'int64', 'double', 'datetime64[ns]')
column_names (list of str): Force column names (list)
file_format (str): Output file format (parquet, csv, json, jsonl)
csv_delimiter (str): CSV delimiter (default: ',')
csv_header (str): Convert CSV output header case ('lower', 'upper', 'skip')
float_format (str): Format string for floating point numbers (default: '%g')
nullable_int (bool): Nullable integer data type support
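Two of the parameter conventions above may be easier to read with a small illustration. The sketch below is not the operator's code: `pairs_to_dict` is a hypothetical helper showing how a list of `column=value` strings could correspond to the dict form accepted by `add_columns` and `types`, and the second part shows what printf-style format strings such as the default `'%g'` produce in plain Python. How the operator applies these values internally is not stated here.

```python
def pairs_to_dict(pairs):
    """Turn ["column=value", ...] into {"column": "value", ...} (hypothetical helper)."""
    result = {}
    for item in pairs:
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected 'column=value', got {item!r}")
        result[name.strip()] = value.strip()
    return result


# The list form and the dict form describe the same columns.
add_columns_list = ["country=IT", "year=2024"]
add_columns_dict = pairs_to_dict(add_columns_list)

types_list = ["id=int64", "price=double", "created=datetime64[ns]"]
types_dict = pairs_to_dict(types_list)

# printf-style float formatting, as used by format strings like '%g'.
formatted = {
    "short": "%g" % 1.5,          # trailing zeros are dropped
    "whole": "%g" % 2.0,          # no decimal point for whole numbers
    "large": "%g" % 1234567.0,    # six significant digits, exponent form
    "fixed": "%.2f" % 3.14159,    # an alternative format with two decimals
}
```

With these inputs, `add_columns_dict` is `{"country": "IT", "year": "2024"}` and `formatted["large"]` is `'1.23457e+06'`, which is why a format such as `'%.2f'` may be preferable when fixed decimals are needed.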
Documentation
Read an XLSX or XLS file and convert it into a Parquet, CSV, JSON, or JSON Lines (one line per record) file.
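To make the "one line per record" distinction concrete, the following sketch uses only the Python standard library to serialize the same records as a single JSON document and as JSON Lines. The records are invented sample data, and the exact layout the operator writes for the `json` format is not specified here, so treat the JSON form as one plausible shape.

```python
import json

# Invented sample rows standing in for worksheet records.
records = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
]

# JSON: the whole data set is one document (here, a single array).
as_json = json.dumps(records)

# JSON Lines: each record is its own JSON object on its own line.
as_jsonl = "\n".join(json.dumps(record) for record in records)

# Reading JSON Lines back means parsing line by line.
parsed_back = [json.loads(line) for line in as_jsonl.splitlines()]
```

Because each line is independently parseable, JSON Lines output can be streamed or split without loading the whole file.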