Re: Problems using MapReduceIndexerTool with multiple reducers

Erick Erickson Mon, 11 Jan 2016 17:22:53 -0800

Hmm, it looks like you created your collection with the "implicit"
router. Does the same thing happen when you use the default
compositeId router?

Note, this should be OK with either, this is just to gather more info.

Other questions:
1> Are you running MRIT over Solr indexes that are actually hosted on HDFS?
2> Are you using the --go-live option?

Actually, can you show us the entire command you use to invoke MRIT?

Best,
Erick

On Mon, Jan 11, 2016 at 4:18 PM, Douglas Rapp <dougma...@gmail.com> wrote:
> Hello,
>
> I am using Solr 4.10.4 in SolrCloud mode, but so far with only a single
> instance (so just a single shard, not very cloud-like).
>
> I have been experimenting using the MapReduceIndexerTool to handle batch
> indexing of CSV files in HDFS. I got it working on a weaker single-node
> Hadoop test system, so I have been trying to do some performance testing on
> a 4-node Hadoop cluster (1 NameNode, 3 DataNodes) with better hardware. The
> issue that I have come across is that the job will only finish successfully
> if I specify a single reducer (using the "--reducers 1" option upon
> invoking the tool).
>
> If the tool is invoked without specifying a number for mappers/reducers, it
> appears that it tries to utilize the maximum number available. In my case,
> it tries to use 16 mappers and 6 reducers. I have tried specifying many
> different combinations, and what I have found is that I can tweak the
> number of mappers to just about anything, but reducers must stay at "1" or
> else the job fails. This also explains why I never saw the problem on the
> first system: on closer inspection, it defaults to only 1 reducer there. If I
> increase it there, I get the same failure. When the job fails, I get the
> following stack trace:
>
> 6602 [main] WARN  org.apache.hadoop.mapred.YarnChild  - Exception running child : org.kitesdk.morphline.api.MorphlineRuntimeException: java.lang.IllegalStateException: No matching slice found! The slice seems unavailable. docRouterClass: org.apache.solr.common.cloud.ImplicitDocRouter
>         at org.kitesdk.morphline.base.FaultTolerance.handleException(FaultTolerance.java:73)
>         at org.apache.solr.hadoop.morphline.MorphlineMapRunner.map(MorphlineMapRunner.java:213)
>         at org.apache.solr.hadoop.morphline.MorphlineMapper.map(MorphlineMapper.java:86)
>         at org.apache.solr.hadoop.morphline.MorphlineMapper.map(MorphlineMapper.java:54)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.IllegalStateException: No matching slice found! The slice seems unavailable. docRouterClass: org.apache.solr.common.cloud.ImplicitDocRouter
>         at org.apache.solr.hadoop.SolrCloudPartitioner.getPartition(SolrCloudPartitioner.java:120)
>         at org.apache.solr.hadoop.SolrCloudPartitioner.getPartition(SolrCloudPartitioner.java:49)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>         at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>         at org.apache.solr.hadoop.morphline.MorphlineMapper$MyDocumentLoader.load(MorphlineMapper.java:138)
>         at org.apache.solr.morphlines.solr.LoadSolrBuilder$LoadSolr.doProcess(LoadSolrBuilder.java:129)
>         at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
>         at org.kitesdk.morphline.base.Connector.process(Connector.java:64)
>         at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)
>         at org.apache.solr.morphlines.solr.SanitizeUnknownSolrFieldsBuilder$SanitizeUnknownSolrFields.doProcess(SanitizeUnknownSolrFieldsBuilder.java:94)
>         at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
>         at org.kitesdk.morphline.base.Connector.process(Connector.java:64)
>         at org.kitesdk.morphline.stdio.ReadCSVBuilder$ReadCSV.doProcess(ReadCSVBuilder.java:124)
>         at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:93)
>         at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
>         at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)
>         at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
>         at org.apache.solr.hadoop.morphline.MorphlineMapRunner.map(MorphlineMapRunner.java:201)
>         ... 10 more
>
> When I try searching online for "No matching slice found", the only results
> I get back are the source code itself. I can't seem to find anything to lead
> me in the right direction.
>
> Looking at the MapReduceIndexerTool more closely, it says that when using
> more than one reducer per output shard (so in my case, >1) it will utilize
> the "mtree" merge algorithm to merge the results held among several
> mini-shards. I'm guessing this might have something to do with it, but I
> can't find any other information on how this might be further tweaked or
> debugged.
>
> I can provide any additional information (environment settings, config
> files, etc) on request. Any help would be appreciated.
>
> Thanks,
> Doug
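
For anyone reading this thread later, here is a minimal sketch of one way the
symptoms above could fit together. It is not confirmed anywhere in this thread
and it is not the real Solr or Hadoop code. It assumes that (a) with a single
reducer the partitioner is never consulted, and (b) an implicit-style router
can only pick a slice when the document itself names one. The class
PartitionSketch and its methods implicitTargetSlice and getPartition are
hypothetical stand-ins; only the exception text is taken from the stack trace.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model only: every name here is hypothetical, not Solr's API.
public class PartitionSketch {

    // Assumed behavior of an implicit-style router: it resolves a slice only
    // when the document explicitly names one, and returns null otherwise.
    public static String implicitTargetSlice(String route, List<String> shards) {
        if (route == null) {
            return null; // nothing to route on
        }
        if (!shards.contains(route)) {
            throw new IllegalArgumentException("No shard called " + route);
        }
        return route;
    }

    // Assumed behavior of the partitioning step: skipped entirely when there
    // is one reducer, otherwise it requires the router to return a slice.
    public static int getPartition(String route, List<String> shards, int numReducers) {
        if (numReducers <= 1) {
            return 0; // single reducer: the router is never asked
        }
        String slice = implicitTargetSlice(route, shards);
        if (slice == null) {
            throw new IllegalStateException("No matching slice found!");
        }
        // Simplification: send the document to the first reducer of its shard.
        int reducersPerShard = numReducers / shards.size();
        return shards.indexOf(slice) * reducersPerShard;
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard1");
        // One reducer: succeeds even though the document names no shard.
        System.out.println(getPartition(null, shards, 1));
        // Six reducers: the same document now fails in the partitioner.
        try {
            getPartition(null, shards, 6);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this model matches what is happening, it would be consistent with the job
succeeding only with "--reducers 1", and retrying with a compositeId
collection, as Erick suggests, would be a quick way to check.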
