Is the behavior of a document being indexed independently on each node in a SolrCloud cluster new in 5.x, or is that true in 4.x also?
If the document is indexed independently on each node, then a timestamp field such as the one below could hold a different value on each node when I query the nodes directly, right?

<field name="timestamp" type="date" indexed="true" stored="true" default="NOW" />

Bill

On Fri, May 8, 2015 at 6:39 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> I have just added a comment to the CWiki.
> Thanks again for your prompt answer, Erick.
>
> Best,
> Vincenzo
>
> On Fri, May 8, 2015 at 12:39 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > bq: ...forwards the index notation to itself and any replicas...
> >
> > That's just odd phrasing.
> >
> > All that means is that the document is sent through the indexing
> > process on the leader and all followers for a shard, and is indexed
> > independently on each.
> >
> > This is as opposed to the old master/slave situation, where the master
> > indexed the doc but the slave got the indexed version as part of a
> > segment when it replicated.
> >
> > Could you add a comment to the CWiki calling the phrasing out? It
> > really is a bit mysterious.
> >
> > Best,
> > Erick
> >
> > On Thu, May 7, 2015 at 2:18 PM, Vincenzo D'Amore <v.dam...@gmail.com>
> > wrote:
> >
> > > Thanks Shawn.
> > >
> > > Just to make the picture clearer, I'm trying to understand why a
> > > 3-node SolrCloud cluster and an old-style Solr server take the same
> > > time to index the same documents.
> > >
> > > But the wiki says:
> > >
> > >> If the machine is a leader, SolrCloud determines which shard the
> > >> document should go to, forwards the document the leader for that
> > >> shard, indexes the document for this shard, and *forwards the index
> > >> notation to itself and any replicas*.
> > >
> > > https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
> > >
> > > Could you please explain what "forwards the index notation" means?
> > >
> > > On the other hand, on SolrCloud I have 3 shards and 2 replicas for
> > > each shard. So, every node is indexing all the documents, and this
> > > explains why SolrCloud takes the same time as an old-style Solr
> > > server.
> > >
> > > On Thu, May 7, 2015 at 3:08 PM, Shawn Heisey <apa...@elyograg.org>
> > > wrote:
> > >
> > >> On 5/7/2015 3:04 AM, Vincenzo D'Amore wrote:
> > >> > Thanks Erick. I'm not sure I got your answer.
> > >> >
> > >> > To recap: when the raw document has to be indexed, it is
> > >> > forwarded to the shard leader. The shard leader indexes the
> > >> > document for that shard, and then forwards the indexed document
> > >> > to any replicas.
> > >> >
> > >> > I just want to be sure that when the raw document is forwarded
> > >> > from the leader to the replicas, it is indexed only once, on the
> > >> > shard leader. From what I understand, replicas do not index;
> > >> > only the leader indexes.
> > >>
> > >> The document is indexed by all replicas. There is no way to forward
> > >> the indexed document, it can only forward the source document ... so
> > >> each replica must index it independently.
> > >>
> > >> The old-style master-slave replication (which existed long before
> > >> SolrCloud) copies the finished Lucene segments, so only the master
> > >> actually does indexing.
> > >>
> > >> SolrCloud doesn't have a master, only multiple replicas, one of
> > >> which is elected leader, and replication only comes into the picture
> > >> if there's a serious problem and Solr determines that it can't use
> > >> the transaction log to recover the index.
> > >>
> > >> Thanks,
> > >> Shawn
> > >
> > > --
> > > Vincenzo D'Amore
> > > email: v.dam...@gmail.com
> > > skype: free.dev
> > > mobile: +39 349 8513251
>
> --
> Vincenzo D'Amore
> email: v.dam...@gmail.com
> skype: free.dev
> mobile: +39 349 8513251
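
One way to check the timestamp question at the top of this message empirically is to query each replica's core directly with distrib=false, so the request is not fanned out to other nodes, and then compare the stored values. The following is only a rough sketch: the core URLs are made up, and it assumes the uniqueKey field is called "id" and the field of interest is "timestamp". It does not settle whether the values will actually differ; it only shows how one might look.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical core URLs, one per replica of the same shard (not from the thread).
REPLICA_CORES = [
    "http://node1:8983/solr/collection1_shard1_replica1",
    "http://node2:8983/solr/collection1_shard1_replica2",
]


def build_query_url(core_url, doc_id, field="timestamp"):
    """Build a /select URL that asks only this core (distrib=false)."""
    params = urlencode({
        "q": f'id:"{doc_id}"',   # assumes the uniqueKey field is named "id"
        "fl": f"id,{field}",     # return only the id and the field to compare
        "distrib": "false",      # do not forward the query to other replicas
        "wt": "json",
    })
    return f"{core_url}/select?{params}"


def fetch_field(core_url, doc_id, field="timestamp"):
    """Return the stored field value from one core, or None if the doc is absent."""
    with urlopen(build_query_url(core_url, doc_id, field)) as resp:
        body = json.load(resp)
    docs = body["response"]["docs"]
    return docs[0].get(field) if docs else None


def values_differ(values):
    """True if the replicas that returned a value do not all agree."""
    seen = {v for v in values if v is not None}
    return len(seen) > 1


def main(doc_id):
    """Print the value seen on each replica and whether they disagree."""
    values = [fetch_field(core, doc_id) for core in REPLICA_CORES]
    for core, value in zip(REPLICA_CORES, values):
        print(core, "->", value)
    print("replicas disagree:", values_differ(values))
```

Calling main("some-doc-id") against a running cluster prints one line per replica. The document id and hosts are placeholders to replace with real ones.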