npawar commented on issue #8247: URL: https://github.com/apache/pinot/issues/8247#issuecomment-1064563608
Hey @hristo-stripe, could you share a few more details?

1) Does this always happen, or was this a one-off?
2) Which metrics are these exactly: server side (for the exact server in question) or broker side?
3) Could you attach some other metrics, such as partial responses and timeouts, for all the servers?
4) Are there any logs you can share from around the incident?

Why was the server killed? Was there some ongoing issue on the server (OOM, expensive query, hot segment) that had already caused resource constraints, so that queries were being served from a bad server before it was actually killed and its segments went offline? I ask because, afaik, as soon as a segment gets marked ERROR/OFFLINE, queries won't be routed to it.