Joe McDonnell created IMPALA-14933:
--------------------------------------
Summary: KrpcDataStreamSender should provide information about
skew in partitioned exchanges
Key: IMPALA-14933
URL: https://issues.apache.org/jira/browse/IMPALA-14933
Project: IMPALA
Issue Type: Task
Components: Distributed Exec
Affects Versions: Impala 5.0.0
Reporter: Joe McDonnell
Partitioned exchanges with high skew can send most rows to a single receiver.
When investigating other possible causes of a slow exchange, it is useful to
know at a glance whether the exchange is skewed. Adding some simple statistics can make it clear
whether skew is happening. The most basic version is to track the number of
rows sent to each receiver and add a high-water-mark counter with the maximum
number of rows sent to a single receiver. This can also be used as the basis of
a derived counter that computes the percentage of all rows that went to that
receiver. A profile showing that 95% of
the rows went to the top receiver would be a clear indication of skew.
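
The bookkeeping described above can be sketched roughly as follows. This is an
illustration only, written in Python for brevity, and it is not the
KrpcDataStreamSender code or Impala's counter API. The class name
ExchangeSkewStats and all of its members are hypothetical names chosen for
this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ExchangeSkewStats:
    """Hypothetical per-sender skew statistics (not Impala's actual counters)."""

    num_receivers: int
    rows_per_receiver: list = field(init=False)

    def __post_init__(self):
        # One row counter per receiver, all starting at zero.
        self.rows_per_receiver = [0] * self.num_receivers

    def add_rows(self, receiver_idx, num_rows):
        # Called whenever a batch of rows is sent to one receiver.
        self.rows_per_receiver[receiver_idx] += num_rows

    @property
    def total_rows(self):
        return sum(self.rows_per_receiver)

    @property
    def max_rows_to_single_receiver(self):
        # Per-receiver counts only grow, so the current maximum is also
        # the high-water mark.
        return max(self.rows_per_receiver, default=0)

    @property
    def top_receiver_percent(self):
        # Derived value: share of all rows that went to the top receiver.
        total = self.total_rows
        if total == 0:
            return 0.0
        return 100.0 * self.max_rows_to_single_receiver / total


# Example: four receivers, with almost all rows going to receiver 0.
stats = ExchangeSkewStats(num_receivers=4)
stats.add_rows(0, 950)
stats.add_rows(1, 20)
stats.add_rows(2, 20)
stats.add_rows(3, 10)
```

With these made-up row counts, the top receiver holds 950 of 1000 rows, which
matches the 95% case mentioned above. For comparison, a perfectly even split
across four receivers would report 25%.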