I wonder why we need to use SolrCloud replication on HDFS at all, given that
HDFS already provides replication and availability. The way to optimize
performance and scalability should be tuning shards, just like tuning
regions in HBase, which doesn't provide "region replication" either, does
it?

I have had this question for a while and haven't found a clear answer to it.
Could some experts please explain a bit?

Best regards,
Mao Geng

On Thu, Apr 9, 2015 at 8:41 AM Erick Erickson <erickerick...@gmail.com>
wrote:

> Yes. 3 replicas and an HDFS replication factor of 3 means 9 copies of
> the index are lying around. You can change your HDFS replication
> factor, but that affects other applications using HDFS, so that may
> not be an option.
>
> Best,
> Erick
>
> On Thu, Apr 9, 2015 at 2:31 AM, Vijaya Narayana Reddy Bhoomi Reddy
> <vijaya.bhoomire...@whishworks.com> wrote:
> > Hi,
> >
> > Can anyone please tell me how shard replication works when the indexes
> > are stored in HDFS? That is, with HDFS the default replication factor is 3.
> > Now, if I set the replication factor for the Solr shards to 3 as well, does
> > that mean the index data is internally replicated three times, and then HDFS
> > replication works on top of it and duplicates the data again across the HDFS
> > cluster?
> >
> >
> > Thanks & Regards
> > Vijay
>
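
The arithmetic in Erick's reply (3 Solr replicas times an HDFS replication
factor of 3 gives 9 copies) can be sketched as below. This is only an
illustration: the function name and its parameters are hypothetical and are
not part of Solr or HDFS. It assumes every Solr replica writes its own full
index and HDFS then replicates each of those indexes independently.

```python
def physical_index_copies(solr_replication_factor: int,
                          hdfs_replication_factor: int) -> int:
    """Return how many physical copies of one shard's index end up on disk.

    Hypothetical helper for illustration only. Assumes each Solr replica
    keeps its own index and HDFS replicates every block of that index.
    """
    if solr_replication_factor < 1 or hdfs_replication_factor < 1:
        raise ValueError("replication factors must be at least 1")
    return solr_replication_factor * hdfs_replication_factor


# The case from the thread: 3 Solr replicas on HDFS with replication factor 3.
print(physical_index_copies(3, 3))  # 9
```

With a single Solr replica per shard, the same calculation gives 3 copies,
which is HDFS replication alone.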
