[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated MAPREDUCE-7536:
--------------------------------------
    Labels: pull-request-available  (was: )

> ShuffleChannelHandler logs ERROR for client disconnects during shuffle
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7536
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7536
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Konstantin Bereznyakov
>            Priority: Major
>              Labels: pull-request-available
>
> ShuffleChannelHandler.operationComplete() logs at ERROR level when a 
> ChannelFuture completes unsuccessfully, even for expected conditions like 
> client disconnections. This creates unnecessary noise in shuffle service logs.
>   *Current Behavior*
>   When a shuffle client disconnects during data transfer, the handler logs:
>   ERROR Future is unsuccessful. channel='...' Cause: <ClosedChannelException 
> or connection reset>
>   These are expected during normal operation when reducers time out, get 
> killed, or disconnect early.
>   [Expected 
> Behavior|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleChannelHandler.java#L679]
>   Expected disconnection scenarios should be logged at DEBUG level:
>   - ClosedChannelException
>   - Connection reset by peer
>   - Other ignorable I/O errors matching existing IGNORABLE_ERROR_MESSAGE 
> pattern
>   Only unexpected failures should be logged at ERROR level.
>   *Impact*
>   - Log noise in production NodeManager shuffle service logs
>   - Difficult to identify real errors among expected disconnections
>   - Unnecessary alerting in monitoring systems that track ERROR log volume
>   [Affected 
> Component|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleChannelHandler.java#L679]
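
The classification described under Expected Behavior could be sketched roughly as follows. This is only an illustration, not the actual patch: the class ShuffleFailureClassifier and the methods isIgnorable and levelFor are hypothetical names, and the regular expression is an assumed stand-in for the existing IGNORABLE_ERROR_MESSAGE pattern, which may differ in the Hadoop source. The sketch uses only the JDK so that it runs without Netty or Hadoop on the classpath.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Hadoop. */
public final class ShuffleFailureClassifier {

  // Assumed pattern; the real IGNORABLE_ERROR_MESSAGE may be defined differently.
  static final Pattern IGNORABLE_ERROR_MESSAGE = Pattern.compile(
      "^.*(?:connection.*reset|connection.*closed|broken.*pipe).*$",
      Pattern.CASE_INSENSITIVE);

  private ShuffleFailureClassifier() {
  }

  /** Returns true when the failure cause looks like an expected client disconnect. */
  static boolean isIgnorable(Throwable cause) {
    if (cause == null) {
      return false; // no cause available: treat as unexpected
    }
    if (cause instanceof ClosedChannelException) {
      return true; // the peer closed the channel before the write completed
    }
    if (cause instanceof IOException) {
      String message = cause.getMessage();
      return message != null
          && IGNORABLE_ERROR_MESSAGE.matcher(message).matches();
    }
    return false;
  }

  /** Maps a failure cause to the log level proposed in the issue. */
  static String levelFor(Throwable cause) {
    return isIgnorable(cause) ? "DEBUG" : "ERROR";
  }

  public static void main(String[] args) {
    System.out.println(levelFor(new ClosedChannelException()));                // DEBUG
    System.out.println(levelFor(new IOException("Connection reset by peer"))); // DEBUG
    System.out.println(levelFor(new IllegalStateException("unexpected")));     // ERROR
  }
}
```

In a handler, a check like this would presumably sit in operationComplete() in front of the existing ERROR log call, with the cause taken from the unsuccessful ChannelFuture. Where exactly it belongs is for the pull request to decide.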



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
