Replication at the storage layer gives you reliable storage for the
index and other Solr data. However, this replication does not guarantee
that your index files are consistent at any given time, as there may be
intermediate states that are only partially replicated. Replication is
a convergent process, not an instantaneous, atomic operation. With
frequent changes, this becomes an issue.

Replication inside SolrCloud, at the application level, not only
maintains the consistency of the search-level interfaces to your indexes,
but also scales the application itself (query throughput).

Imagine a database: if you change one record, this may also result in an
index change. If the record and the index are stored in different
storage blocks, one of them will be replicated first. However, the
replication target will only be consistent again once both have been
replicated. So, you would have to suspend all access to the target until
the entire replication has completed. That's undesirable. If you
replicate at the application (database management system) level, the
application can employ a more fine-grained approach to replication that
guarantees application consistency.
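
To make the database example a bit more concrete, here is a small toy
model in Python. It is only a sketch under my own assumptions: the
"store" dictionaries, the function names and the two-block layout are
made up for illustration and have nothing to do with how HDFS or Solr
are actually implemented. It just shows that a block-by-block copy
passes through an inconsistent state, while replaying the logical
operation does not.

```python
import copy

# Toy model: a store has a "records" block and an "index" block.
# All names here are hypothetical and purely illustrative.

def is_consistent(store):
    """The index must list exactly the keys present in the records."""
    return sorted(store["records"]) == store["index"]

def block_level_replication(source, target):
    """Copy one block at a time, pausing after each so a reader can peek."""
    for block in ("records", "index"):
        target[block] = copy.deepcopy(source[block])
        yield block

def application_level_replication(target, key, value):
    """Replay the logical change so records and index move together."""
    records = dict(target["records"])
    records[key] = value
    target.update(records=records, index=sorted(records))

primary = {"records": {"a": 1}, "index": ["a"]}
replica_blocks = copy.deepcopy(primary)
replica_app = copy.deepcopy(primary)

# One logical change on the primary touches both blocks.
primary["records"]["b"] = 2
primary["index"] = sorted(primary["records"])

# Storage-level copy: check the replica after each block arrives.
observed = [(block, is_consistent(replica_blocks))
            for block in block_level_replication(primary, replica_blocks)]
print(observed)  # [('records', False), ('index', True)]

# Application-level copy: the replica is consistent after the operation.
application_level_replication(replica_app, "b", 2)
print(is_consistent(replica_app))  # True
```

The window between the first and the second block copy is exactly the
intermediate state a reader of the replication target could run into.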

Consequently, HDFS will allow you to scale storage and possibly even
replicate static indexes that won't change, but it won't help much with
live index replication. That's where SolrCloud comes in.
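
As a rough illustration of asking SolrCloud for application-level
replicas, the sketch below builds a Collections API CREATE request with
the numShards and replicationFactor parameters. The host, port,
collection name and the helper function are my own assumptions for the
example, not something from your setup, and the code only builds the
URL without sending it.

```python
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL (hypothetical helper)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

# Assumed local Solr node and an example collection called "demo".
url = create_collection_url("http://localhost:8983/solr", "demo", 2, 2)
print(url)
```

With replicationFactor=2, each shard gets two replicas that SolrCloud
keeps in sync itself, independent of how many copies HDFS keeps of the
underlying files.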

Cheers,
--Jürgen

On 18.04.2015 08:44, gengmao wrote:
> I wonder why we need to use SolrCloud replication on HDFS at all, given
> that HDFS already provides replication and availability? The way to
> optimize performance and scalability should be tweaking shards, just like
> tweaking regions on HBase, which doesn't provide "region replication"
> either, does it?
>
> I have had this question for a while and haven't found a clear answer to it.
> Could some experts please explain a bit?
>
> Best regards,
> Mao Geng
>
>


-- 

Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
уважением
*i.A. Jürgen Wagner*
Head of Competence Center "Intelligence"
& Senior Cloud Consultant

