Joe McDonnell created IMPALA-14933:
--------------------------------------

             Summary: KrpcDataStreamSender should provide information about 
skew in partitioned exchanges
                 Key: IMPALA-14933
                 URL: https://issues.apache.org/jira/browse/IMPALA-14933
             Project: IMPALA
          Issue Type: Task
          Components: Distributed Exec
    Affects Versions: Impala 5.0.0
            Reporter: Joe McDonnell


Partitioned exchanges with high skew can send most rows to a single receiver. 
When diagnosing other causes of slow exchanges, it is useful to be able to tell 
quickly whether the exchange is skewed. Adding some simple statistics would 
make this clear. The most basic version is to track the number of rows sent to 
each receiver and add a high-water-mark counter with the maximum number of 
rows sent to a single receiver. This can also serve as the basis of a derived 
counter that computes the percentage of all rows sent to that receiver. A 
profile showing that 95% of the rows went to the top receiver would be a clear 
indication of skew.
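
As a rough illustration of the proposed statistics, here is a minimal 
standalone sketch. It is not Impala code: the class ExchangeSkewStats and its 
methods are hypothetical names invented for this example, and it does not use 
Impala's RuntimeProfile counters. It only shows the bookkeeping: a per-receiver 
row count, the maximum across receivers, and the derived percentage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Impala): tracks how many rows a sender
// has sent to each receiver of a partitioned exchange.
class ExchangeSkewStats {
 public:
  explicit ExchangeSkewStats(std::size_t num_receivers)
    : rows_per_receiver_(num_receivers, 0) {}

  // Record that 'num_rows' rows were sent to the receiver at 'receiver_idx'.
  void AddRows(std::size_t receiver_idx, int64_t num_rows) {
    assert(receiver_idx < rows_per_receiver_.size());
    rows_per_receiver_[receiver_idx] += num_rows;
  }

  // Total rows sent across all receivers.
  int64_t TotalRows() const {
    return std::accumulate(
        rows_per_receiver_.begin(), rows_per_receiver_.end(), int64_t{0});
  }

  // Maximum number of rows sent to any single receiver.
  int64_t MaxRowsToSingleReceiver() const {
    if (rows_per_receiver_.empty()) return 0;
    return *std::max_element(
        rows_per_receiver_.begin(), rows_per_receiver_.end());
  }

  // Derived value: percentage of all rows that went to the top receiver.
  // Returns 0 when no rows have been sent, to avoid dividing by zero.
  double TopReceiverPercent() const {
    const int64_t total = TotalRows();
    if (total == 0) return 0.0;
    return 100.0 * static_cast<double>(MaxRowsToSingleReceiver())
        / static_cast<double>(total);
  }

 private:
  std::vector<int64_t> rows_per_receiver_;
};
```

With four receivers that were sent 950, 20, 20 and 10 rows, this sketch 
reports a maximum of 950 rows and a top-receiver share of 95%. A real 
implementation would presumably update the counts where rows are added to 
each channel and expose the results through the query profile.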



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
