featzhang created FLINK-39367:
---------------------------------

             Summary: [connector-http] Add configurable error logging with 
detailed context
                 Key: FLINK-39367
                 URL: https://issues.apache.org/jira/browse/FLINK-39367
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / HTTP
            Reporter: featzhang


  h2. Background

  When the HTTP connector encounters an error (a failed request, an
  unexpected status code, etc.), it currently logs very little context,
  which makes issues hard to diagnose in production. The existing
  {{http.logging.level}} option only controls how much request/response
  content is logged (MIN/REQ_RESP/MAX) and always logs at DEBUG level, so
  it does not address the following needs:
  - Logging errors at WARN or INFO level instead of only DEBUG
  - Capturing richer context at error time (HTTP method, URL, response status 
code, retry attempt number)
  - Optionally including request/response bodies in error logs, with a 
configurable size limit to prevent log flooding
  - Automatically masking sensitive headers (e.g., Authorization, Cookie, 
api-key) to avoid credential exposure

  h2. Requirements

  - Introduce a new {{HttpErrorLogger}} class that provides a configurable
  error log level and verbosity.
  - Add three new configuration options:
  ** {{http.error.log.level}}: log level for error events; one of
  {{ERROR}}/{{WARN}}/{{INFO}}, default {{ERROR}}
  ** {{http.error.log.include.body}}: whether to include the request/response
  body in error logs, default {{false}}
  ** {{http.error.log.body.max.size}}: maximum number of body characters to
  log, default {{1024}}
  - Automatically mask sensitive headers (authorization, cookie, api-key,
  x-api-key) in all error log output.
  - Refactor {{GenericJsonAndUrlQueryCreator}} to replace the regex-based
  {{http.request.body-template}} with a more explicit and type-safe approach
  based on {{http.request.body-fields}} and
  {{http.request.additional-body-json}}.
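
  To make the masking and body-size requirements concrete, here is a minimal
  sketch of how an error message could be assembled. It is not the code from
  the PR: the class name {{ErrorLogSketch}}, the method names, the {{***}}
  mask, the truncation marker, and the message layout are all assumptions
  made for illustration. Only the configuration keys, their defaults, and the
  list of sensitive headers come from this issue.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

/** Hypothetical sketch; names and message layout are illustrative only. */
final class ErrorLogSketch {

    // Header names listed in the issue; matched case-insensitively (assumption).
    static final Set<String> SENSITIVE_HEADERS =
            Set.of("authorization", "cookie", "api-key", "x-api-key");

    // Replacement value for masked headers (assumption).
    static final String MASK = "***";

    /** Returns a copy of the headers with sensitive values replaced by MASK. */
    static Map<String, String> maskHeaders(Map<String, String> headers) {
        Map<String, String> masked = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            boolean sensitive =
                    SENSITIVE_HEADERS.contains(e.getKey().toLowerCase(Locale.ROOT));
            masked.put(e.getKey(), sensitive ? MASK : e.getValue());
        }
        return masked;
    }

    /** Keeps at most maxChars characters of the body and marks any truncation. */
    static String truncateBody(String body, int maxChars) {
        if (body == null) {
            return "";
        }
        if (body.length() <= maxChars) {
            return body;
        }
        return body.substring(0, maxChars) + "...[truncated]";
    }

    /** Builds one error line from the request context and the configured options. */
    static String format(Properties props, String method, String url, int status,
                         int attempt, Map<String, String> headers, String body) {
        boolean includeBody = Boolean.parseBoolean(
                props.getProperty("http.error.log.include.body", "false"));
        int maxSize = Integer.parseInt(
                props.getProperty("http.error.log.body.max.size", "1024"));
        StringBuilder sb = new StringBuilder()
                .append("HTTP error: method=").append(method)
                .append(", url=").append(url)
                .append(", status=").append(status)
                .append(", attempt=").append(attempt)
                .append(", headers=").append(maskHeaders(headers));
        if (includeBody) {
            sb.append(", body=").append(truncateBody(body, maxSize));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("http.error.log.include.body", "true");
        props.setProperty("http.error.log.body.max.size", "16");

        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Authorization", "Bearer secret-token");
        headers.put("Content-Type", "application/json");

        System.out.println(format(props, "POST", "http://example.invalid/orders",
                503, 2, headers, "{\"orderId\": 42, \"note\": \"long payload\"}"));
    }
}
```

  Masking on a copy rather than in place keeps the original headers intact for
  any retry of the same request.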

  h2. Proposed Solution

  - Add {{HttpErrorLogLevel}} enum (ERROR / WARN / INFO)
  - Add {{HttpErrorLogger}} class implementing {{Serializable}} for use in 
distributed Flink environments; reads configuration from {{Properties}} at
  construction time
  - Integrate {{HttpErrorLogger}} into both {{JavaNetHttpPollingClient}} 
(lookup source) and {{JavaNetSinkHttpClient}} (sink)
  - Refactor {{GenericJsonAndUrlQueryCreatorFactory}}:
  ** Replace {{http.request.body-template}} (regex placeholder substitution) 
with:
  *** {{http.request.body-fields}}: explicit list of column names to include in 
the request body for PUT/POST operations
  *** {{http.request.additional-body-json}}: a static JSON object string whose 
fields are merged into the generated request body; parsed once at factory
  initialization for runtime efficiency
  ** Validate at factory creation time that {{additional-body-json}} fields do 
not conflict with join key fields defined in {{body-fields}}
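
  The {{body-fields}} / {{additional-body-json}} behaviour described above
  could look roughly like the following sketch. To stay self-contained it
  represents the already-parsed static JSON object as a plain {{Map}} rather
  than using a JSON library, and the class and method names
  ({{BodyFieldsSketch}}, {{validateNoConflicts}}, {{buildBody}}) are
  hypothetical, not taken from the PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the body-fields / additional-body-json handling. */
final class BodyFieldsSketch {

    /**
     * Fails fast (as at factory creation time) if a static field name
     * collides with a column listed in body-fields.
     */
    static void validateNoConflicts(List<String> bodyFields,
                                    Map<String, Object> additionalBody) {
        Set<String> conflicts = new TreeSet<>(additionalBody.keySet());
        conflicts.retainAll(bodyFields);
        if (!conflicts.isEmpty()) {
            throw new IllegalArgumentException(
                    "additional-body-json fields conflict with body-fields: " + conflicts);
        }
    }

    /**
     * Builds the request body for one row: the selected columns first,
     * then the static fields. Columns not listed in bodyFields are ignored.
     */
    static Map<String, Object> buildBody(List<String> bodyFields,
                                         Map<String, Object> row,
                                         Map<String, Object> additionalBody) {
        Map<String, Object> body = new LinkedHashMap<>();
        for (String field : bodyFields) {
            body.put(field, row.get(field));
        }
        body.putAll(additionalBody);
        return body;
    }

    public static void main(String[] args) {
        List<String> bodyFields = List.of("id", "name");
        Map<String, Object> additionalBody = Map.of("source", "flink");

        // Done once, up front, so per-row work is only a map merge.
        validateNoConflicts(bodyFields, additionalBody);

        Map<String, Object> row = Map.of("id", 1, "name", "a", "ignored", true);
        System.out.println(buildBody(bodyFields, row, additionalBody));
    }
}
```

  Because the conflict check runs before any row is processed, the per-row
  merge never has to decide which value wins for a duplicated key.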

  h2. Notes

  {{HttpErrorLogger}} and the existing {{HttpLogger}} serve complementary 
purposes:
  - {{HttpLogger}}: controls the verbosity of HTTP content logging (MIN / 
REQ_RESP / MAX), always at DEBUG level — intended for development and
  troubleshooting
  - {{HttpErrorLogger}}: controls the severity level at which errors are logged 
(ERROR / WARN / INFO) — intended for production operational monitoring

  These two mechanisms are independent and do not conflict. Users may configure 
both as needed.
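
  As an illustration of the severity selection, the sketch below reads
  {{http.error.log.level}} and dispatches a message at the matching level.
  It uses the JDK's {{System.Logger}} purely to stay dependency-free; the
  connector itself may well use a different logging API. The class name
  {{ErrorLevelSketch}}, the logger name, and the case-insensitive parsing of
  the option value are assumptions, not part of this issue.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical sketch of mapping the configured level to a logging call. */
final class ErrorLevelSketch {

    // Enum name and values as proposed in the issue.
    enum HttpErrorLogLevel { ERROR, WARN, INFO }

    /** Reads the level from properties; defaults to ERROR as stated in the issue. */
    static HttpErrorLogLevel readLevel(Properties props) {
        String raw = props.getProperty("http.error.log.level", "ERROR");
        // Case-insensitive parsing is an assumption of this sketch.
        return HttpErrorLogLevel.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    /** Maps the connector-level enum onto the JDK logger's levels. */
    static System.Logger.Level toSystemLevel(HttpErrorLogLevel level) {
        switch (level) {
            case WARN:
                return System.Logger.Level.WARNING;
            case INFO:
                return System.Logger.Level.INFO;
            default:
                return System.Logger.Level.ERROR;
        }
    }

    /** Logs the message at whatever severity the configuration selects. */
    static void logError(Properties props, String message) {
        System.getLogger("http-connector-sketch")
                .log(toSystemLevel(readLevel(props)), message);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("http.error.log.level", "warn");
        logError(props, "HTTP error: method=POST, status=503, attempt=2");
    }
}
```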

  h2. Related PR

  https://github.com/apache/flink-connector-http/pull/35




