
ElasticsearchTaskHandler
ElasticsearchElasticsearchTaskHandler is a python log handler that reads logs from Elasticsearch. Note that Airflow does not handle the indexing of logs into Elasticsearch. Instead, Airflow flushes logs into local files. Additional software setup is required to index the logs into Elasticsearch, such as using Filebeat and Logstash. To efficiently query and sort Elasticsearch results, this handler assumes each log message has a field log_id consists of ti primary keys: log_id = {dag_id}-{task_id}-{execution_date}-{try_number} Log messages with specific log_id are sorted based on offset, which is a unique integer indicates log message’s order. Timestamps here are unreliable because multiple log messages might have the same timestamp.
Access Instructions
Install the Elasticsearch provider package into your Airflow environment.
Import the module into your DAG file and instantiate it with your desired params.
Parameters
Documentation
ElasticsearchTaskHandler is a python log handler that reads logs from Elasticsearch. Note that Airflow does not handle the indexing of logs into Elasticsearch. Instead, Airflow flushes logs into local files. Additional software setup is required to index the logs into Elasticsearch, such as using Filebeat and Logstash. To efficiently query and sort Elasticsearch results, this handler assumes each log message has a field log_id consists of ti primary keys: log_id = {dag_id}-{task_id}-{execution_date}-{try_number} Log messages with specific log_id are sorted based on offset, which is a unique integer indicates log message’s order. Timestamps here are unreliable because multiple log messages might have the same timestamp.