Owen-CH-Leung commented on code in PR #53821:
URL: https://github.com/apache/airflow/pull/53821#discussion_r2244246066
##########
providers/elasticsearch/src/airflow/providers/elasticsearch/log/es_task_handler.py:
##########
@@ -677,26 +716,17 @@ def _write_to_es(self, log_lines: list[dict[str, Any]]) -> bool:
:param log_lines: the log lines to write to Elasticsearch.
"""
+ es_kwargs = get_es_kwargs_from_config()
+
+ client = elasticsearch.Elasticsearch(self.host, **es_kwargs)
# Prepare the bulk request for Elasticsearch
bulk_actions = [{"_index": self.target_index, "_source": log} for log in log_lines]
try:
- _ = helpers.bulk(self.client, bulk_actions)
+ _ = helpers.bulk(client, bulk_actions)
return True
except Exception as e:
self.log.exception("Unable to insert logs into Elasticsearch. Reason: %s", str(e))
return False
-
-def getattr_nested(obj, item, default):
- """
- Get item from obj but return default if not found.
-
- E.g. calling ``getattr_nested(a, 'b.c', "NA")`` will return
- ``a.b.c`` if such a value exists, and "NA" otherwise.
-
- :meta private:
- """
- try:
- return attrgetter(item)(obj)
- except AttributeError:
- return default
+ def read(self, relative_path: str, ti: RuntimeTI) -> tuple[LogSourceInfo, LogMessages]:  # type: ignore[empty-body]
+ pass
Review Comment:
I'm still able to render logs from ES even though the `read()` in `ElasticsearchRemoteLogIO` is not defined.
I think the reason is that I'm still setting `ElasticsearchTaskHandler` as the logging handler:
```python
DEFAULT_LOGGING_CONFIG["handlers"].update(ELASTIC_REMOTE_HANDLERS)
```
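For context, here is a minimal, self-contained sketch of why that update is enough to keep logs rendering: the handler names and config shapes below are deliberately simplified stand-ins (not Airflow's real `DEFAULT_LOGGING_CONFIG`), but the dict-update pattern mirrors the line above, and the logging framework instantiates whatever handler ends up registered, regardless of what the remote log IO class defines:

```python
import logging
import logging.config

# Hypothetical, simplified stand-ins for Airflow's config dicts;
# only the update pattern mirrors the real code.
DEFAULT_LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"task": {"class": "logging.NullHandler"}},
    "loggers": {"airflow.task": {"handlers": ["task"], "level": "INFO"}},
}
ELASTIC_REMOTE_HANDLERS = {
    "task": {"class": "logging.StreamHandler", "stream": "ext://sys.stdout"},
}

# The update replaces the "task" handler entry, so dictConfig wires the
# replacement handler into the logger whether or not read() is implemented.
DEFAULT_LOGGING_CONFIG["handlers"].update(ELASTIC_REMOTE_HANDLERS)
logging.config.dictConfig(DEFAULT_LOGGING_CONFIG)

handler = logging.getLogger("airflow.task").handlers[0]
print(type(handler).__name__)  # StreamHandler
```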
Are you suggesting that both remote read & write should be handled in `ElasticsearchRemoteLogIO`, and that we should get rid of `ElasticsearchTaskHandler`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]