As an update to this thread, it seems my MTree merge wasn't completely hanging; it was just much slower in 4.10.
If I replace 4.9.0 with 4.10 in my jar the MTree merge stage is 6x (or more) slower (in my case, 20 min becomes 2 hours). I hope to bisect this in the future, but the jobs I'm running take a long time. I haven't tried to see if the issue shows on smaller jobs yet (does 1 minute become 6 minutes?).

Brett

On Tue, Sep 16, 2014 at 12:54 PM, Brett Hoerner <br...@bretthoerner.com> wrote:

> I have a very weird problem that I'm going to try to describe here to see
> if anyone has any "ah-ha" moments or clues. I haven't created a small
> reproducible project for this but I guess I will have to try in the future
> if I can't figure it out. (Or I'll need to bisect by running long Hadoop
> jobs...)
>
> So, the facts:
>
> * Have been successfully using Solr mapred to build very large Solr
> clusters for months
> * As of Solr 4.10 *some* job sizes repeatably hang in the MTree merge
> phase in 4.10
> * Those same jobs (same input, output, and Hadoop cluster itself) succeed
> if I only change my Solr deps to 4.9
> * The job *does succeed* in 4.10 if I use the same data to create more,
> but smaller shards (e.g. 12x as many shards each 1/12th the size of the job
> that fails)
> * Creating my "normal size" shards (the size I want, that works in 4.9)
> the job hangs with 2 mappers running, 0 reducers in the MTree merge phase
> * There are no errors or warning in the syslog/stderr of the MTree
> mappers, no errors ever echo'd back to the "interactive run" of the job
> (mapper says 100%, reduce says 0%, will stay forever)
> * No CPU being used on the boxes running the merge, no GC happening, JVM
> waiting on a futex, all threads blocked on various queues
> * No disk usage problems, nothing else obviously wrong with any box in the
> cluster
>
> I diff'ed around between 4.10 and 4.9 and barely see any changes in mapred
> contrib, mostly some test stuff. I didn't see any transitive dependency
> changes in Solr/Lucene that look like they would affect me.
>