Hi Greg,

Did you get an answer? I'm interested in the same question.

More generally, what are the benefits of HdfsDirectoryFactory, besides the transparent restore of the shard contents in case of a disk failure and the ability to rebuild the index using MR?

Is the following statement accurate? Blocks of a particular shard that are replicated to another node will never be queried, since there is no Solr core configured to read them.
On Wed, Aug 7, 2013 at 8:46 PM, Greg Walters <gwalt...@sherpaanalytics.com> wrote:

> While testing Solr's new ability to store data and transaction directories
> in HDFS I added an additional core to one of my testing servers that was
> configured as a backup (active but not leader) core for a shard elsewhere.
> It looks like this extra core copies the data into its own directory rather
> than just using the existing directory with the data that's already
> available to it.
>
> Since HDFS likely already has redundancy of the data covered via the
> replicationFactor is there a reason for non-leader cores to create their
> own data directory rather than doing reads on the existing master copy? I
> searched Jira for anything that suggests this behavior might change and
> didn't find any issues; is there any intent to address this?
>
> Thanks,
> Greg