jadami10 opened a new pull request, #17452: URL: https://github.com/apache/pinot/pull/17452
This is a potential bugfix addressing #16801. I know we've been saying to use log4j, but query logs need to be correlated, since we log once when the query is received and once when it completes. Also, this feature already exists, and log4j is tedious to configure and test.

#15264 switched the "received" query logs to `.info` and made them no longer respect rate limiting. Instead, "received" logs just consume the rate limit, which for high-QPS clusters leads to the majority of logs being "received" logs.

In this PR, the rate-limiting decision is made up front, at the point where the "received" log is written. We choose whether to log "received", and that determines whether we log "completed" as well. This way, you should always get both logs for a given query rather than some arbitrary mix of the two.

This is slightly backwards incompatible. Clusters with default settings or low RPS won't notice a difference; they will continue to see all logs. For clusters with higher RPS, `CONFIG_OF_BROKER_QUERY_LOG_MAX_RATE_PER_SECOND` changes semantically: it now controls the number of queries logged per second rather than the number of log lines per second.

- Clusters where rate limit == RPS may see 2x the logs, since this change will effectively cause "completed" to show up for each received query.
- Clusters where RPS >> rate limit will see a reduction in logs, since the "received" query logs will now be rate limited, but the trade-off is that they will consistently see received/completed for each query.

I specifically tested this on an internal cluster that only saw intermittent rate limiting every hour.

<img width="1603" height="415" alt="image" src="https://github.com/user-attachments/assets/22d6da32-5c67-44ad-934b-0abc8610c750" />

After my change, you can see the log volume doubled during rate limiting, but there's no longer a mismatch between the number of "received" and "completed" logs.
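
For readers who want the idea in code, here is a minimal sketch of the "decide once, log both or neither" behavior described above. It is not the code in this PR or Pinot's actual query logger: the class `PairedQueryLogSketch`, its method names, the log line format, and the fixed one-second window are all assumptions made for illustration, and the window simply stands in for whatever rate limiter the broker really uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical illustration only; names and log format are not taken from Pinot. */
final class PairedQueryLogSketch {
  // Stand-in for the configured max rate: queries (not log lines) logged per second.
  private final int maxQueriesPerSecond;
  // Injectable clock so the behavior can be exercised deterministically.
  private final LongSupplier clockMillis;
  // Collected log lines; a real logger would write to its logging backend instead.
  private final List<String> lines = new ArrayList<>();

  private long windowSecond = Long.MIN_VALUE;
  private int loggedInWindow = 0;

  PairedQueryLogSketch(int maxQueriesPerSecond, LongSupplier clockMillis) {
    this.maxQueriesPerSecond = maxQueriesPerSecond;
    this.clockMillis = clockMillis;
  }

  /**
   * Makes the rate-limit decision once, when the query arrives. The caller keeps
   * the returned flag and passes it back when the query completes.
   */
  synchronized boolean onQueryReceived(long requestId, String query) {
    long second = clockMillis.getAsLong() / 1000;
    if (second != windowSecond) {
      // A new one-second window starts: reset the per-window budget.
      windowSecond = second;
      loggedInWindow = 0;
    }
    boolean shouldLog = loggedInWindow < maxQueriesPerSecond;
    if (shouldLog) {
      loggedInWindow++;
      lines.add("received requestId=" + requestId + " query=" + query);
    }
    return shouldLog;
  }

  /** Logs "completed" only if "received" was logged; no second rate-limit check. */
  synchronized void onQueryCompleted(long requestId, long timeMs, boolean loggedReceived) {
    if (loggedReceived) {
      lines.add("completed requestId=" + requestId + " timeMs=" + timeMs);
    }
  }

  /** Returns a copy of the lines logged so far. */
  synchronized List<String> lines() {
    return new ArrayList<>(lines);
  }
}
```

With a limit of 2 and five queries arriving in the same second, this sketch emits two "received" and two "completed" lines for the same two request IDs, which is the pairing property the PR is after.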
