[ 
https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822816#comment-13822816
 ] 

Chris Nauroth commented on HADOOP-10090:
----------------------------------------

Here is a thought regarding the system source issue and reintroducing 
synchronization around {{MetricsSource#getMetrics}} calls.

My understanding of the HADOOP-8050 deadlock is that we had a lock ordering 
conflict between a JMX thread (locking {{MetricsSourceAdapter}} and then 
{{MetricsSystemImpl}}) and a snapshotting thread (locking {{MetricsSystemImpl}} 
and then {{MetricsSourceAdapter}}).  HADOOP-8050 resolved the deadlock by 
releasing the lock on the {{MetricsSourceAdapter}} before calling 
{{MetricsSource#getMetrics}}.

What if instead we do the following:

# Change {{MetricsSourceAdapter#getMetrics}} as follows:
{code}
  Iterable<MetricsRecordImpl> getMetrics(MetricsBuilderImpl builder,
                                         boolean all) {
    synchronized (source) {
      synchronized (this) {
        // existing method logic here
      }
    }
  }
{code}
# Change {{MetricsSystemImpl}} so that it implements {{MetricsSource}} directly 
instead of using an anonymous inner class.

The first part synchronizes {{getMetrics}} calls using a locking order that's 
consistent with the snapshotting threads: the {{MetricsSystemImpl}} lock first, 
then the {{MetricsSourceAdapter}} lock.  The second part is required so that, 
for the system source, the first part's synchronization on the source is really 
synchronizing on the {{MetricsSystemImpl}} instance instead of the separate 
anonymous inner class instance.
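
To illustrate the lock ordering, here is a minimal standalone sketch.  It is 
not the Hadoop code: {{Source}}, {{Adapter}}, {{SystemImpl}} and {{run}} are 
hypothetical stand-ins I made up for {{MetricsSource}}, 
{{MetricsSourceAdapter}} and {{MetricsSystemImpl}}, and the real method logic 
is replaced by simple counters.  It only shows that when both paths take the 
system lock before the adapter lock, a JMX-style thread and a snapshotting 
thread can run concurrently without the ordering conflict.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {
  /** Hypothetical stand-in for MetricsSource. */
  interface Source {
    int getMetrics();
  }

  /** Hypothetical stand-in for MetricsSourceAdapter. */
  static final class Adapter {
    private final Source source;
    private int calls;

    Adapter(Source source) {
      this.source = source;
    }

    int getMetrics() {
      synchronized (source) {   // outer lock: the source
        synchronized (this) {   // inner lock: the adapter
          calls++;
          return source.getMetrics();
        }
      }
    }

    synchronized int calls() {
      return calls;
    }
  }

  /** Hypothetical stand-in for MetricsSystemImpl; implements Source directly. */
  static final class SystemImpl implements Source {
    private final List<Adapter> adapters = new ArrayList<>();
    private int value;

    synchronized void register(Adapter adapter) {
      adapters.add(adapter);
    }

    @Override
    public synchronized int getMetrics() {
      return ++value;
    }

    /** Snapshot path: holds the system lock, then takes each adapter lock. */
    synchronized void snapshot() {
      for (Adapter adapter : adapters) {
        adapter.getMetrics();   // re-enters the system lock, then locks adapter
      }
    }
  }

  /** Runs a JMX-style thread and a snapshot thread; returns adapter call count. */
  static int run(int iterations) {
    SystemImpl system = new SystemImpl();
    Adapter adapter = new Adapter(system);
    system.register(adapter);

    Thread jmx = new Thread(() -> {
      for (int i = 0; i < iterations; i++) {
        adapter.getMetrics();
      }
    });
    Thread snapshotter = new Thread(() -> {
      for (int i = 0; i < iterations; i++) {
        system.snapshot();
      }
    });
    jmx.start();
    snapshotter.start();
    try {
      jmx.join();
      snapshotter.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return adapter.calls();
  }

  public static void main(String[] args) {
    System.out.println(run(1000));  // each thread calls the adapter 1000 times
  }
}
```

Because the adapter's {{source}} field is the {{SystemImpl}} instance itself, 
{{synchronized (source)}} takes the same monitor that {{snapshot()}} already 
holds, and Java's reentrant monitors let the snapshot path pass through.  If 
the source were a separate anonymous inner class instance, the outer lock 
would be a different monitor, which is the reason for the second change.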

> Jobtracker metrics not updated properly after execution of a mapreduce job
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-10090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10090
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 1.2.1
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HADOOP-10090.branch-1.patch, OneBoxRepro.png
>
>
> After executing a wordcount mapreduce sample job, jobtracker metrics are not 
> updated properly. Oftentimes the response from the jobtracker reports a higher 
> number of job_completed than job_submitted (for example, 8 jobs completed and 
> 7 jobs submitted). 
> Issue reported by Toma Paunovic.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
