Indexes don't synch when node with old data is returned to cluster

2017-09-20 Thread Joe Heasly
Hello,

We have just moved from Solr 4.6 master/slave to Solr 6.4.2 SolrCloud.  We 
have three collections, each with a single shard and a varying number of 
replicas, all coordinated by an ensemble of three ZooKeeper nodes (on their 
own hosts).  As an ecommerce site, our capacity needs vary, so we add and 
remove replicas fairly often.  The basic topology is like this:

solr1
  |- collection1
      |- shard1 - replica1
  |- collection2
      |- shard1 - replica1
  |- collection3
      |- shard1 - replica1
 .
 .
 .
solrN
  |- collection1
      |- shard1 - replicaN
  |- collection2
      |- shard1 - replicaN
  |- collection3
      |- shard1 - replicaN

Where N varies between three and six most of the time.

During a recent test, we ran our indexing processes against a set of nodes, 
and then two nodes were removed from our configuration.  Subsequently the 
remaining nodes were reindexed, without problems.  The two nodes that had been 
removed (by simply stopping Solr on those boxes) were then brought back into 
the cluster by starting Solr with the appropriate zkHost strings.  (These were 
the same zkHosts as when the instances were stopped.)  We found that the 
indexes did not sync up until we re-indexed the entire cluster.

What are we missing?  We need the re-added indexes to synchronize with those 
already active in the cluster.  If we have to re-index the whole cluster, we 
risk inconsistent results being served from the new nodes while indexing is 
going on.  In reviewing the Reference Guide and doing various searches, I 
haven't found anything that clearly references adding replicas to a cluster 
when the cores already contain data.
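
For anyone reproducing this, a first diagnostic step might be to look at the 
state ZooKeeper records for each replica after the stopped nodes rejoin.  The 
sketch below is not from the original post: it assumes the Collections API 
CLUSTERSTATUS action is reachable at a base URL such as 
http://solr1:8983/solr (a made-up host), and the sample data is invented for 
illustration.  It reports only the recorded replica state; an "active" state 
does not by itself prove that the index contents match the leader.

```python
import json
import urllib.request


def fetch_cluster_status(base_url):
    """Call the Collections API CLUSTERSTATUS action and return parsed JSON.

    base_url is an assumed value such as "http://solr1:8983/solr".
    """
    url = base_url + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def replica_states(status):
    """Flatten a CLUSTERSTATUS response into
    (collection, shard, replica, node, state) tuples."""
    rows = []
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in sorted(collections.items()):
        for shard_name, shard in sorted(coll.get("shards", {}).items()):
            for rep_name, rep in sorted(shard.get("replicas", {}).items()):
                rows.append((coll_name, shard_name, rep_name,
                             rep.get("node_name"), rep.get("state")))
    return rows


def not_active(status):
    """Return only the replicas whose recorded state is not 'active'."""
    return [row for row in replica_states(status) if row[4] != "active"]


# Invented sample response, trimmed to the fields used above.
SAMPLE = {
    "cluster": {
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"node_name": "solr1:8983_solr",
                                           "state": "active",
                                           "leader": "true"},
                            "core_node2": {"node_name": "solr2:8983_solr",
                                           "state": "recovering"},
                        }
                    }
                }
            }
        }
    }
}

for row in not_active(SAMPLE):
    print(row)
```

Against a live cluster, not_active(fetch_cluster_status(base_url)) would list 
any replicas that are still down or recovering after the restart.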

Thank you for any insights,
Joe

Joe Heasly, Systems Analyst I
L.L.Bean, Inc. ~ Direct Channel Business & Technology Team
Office: 207.552.2254
Cell: 207.756.9250



RE: How different is solr 4.7 from latest version.

2018-01-12 Thread Joe Heasly
Srini,

We upgraded from Solr 4.6 to 6.4 last summer.  There are fundamental 
differences between those versions in the way the default Boolean operator and 
the 'minimum should match' (mm) setting interact.  Here's an excellent 
discussion of the change (Jason Hellman does it more justice than I could):

http://blog.innoventsolutions.com/innovent-solutions-blog/2017/02/solr-edismax-boolean-query.html

If you're just starting out and you're starting with a recent version, this 
won't matter.  But if you're upgrading, it's critical to be aware.
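
One way to reduce surprises during an upgrade may be to send both the default 
operator (q.op) and mm explicitly on each edismax request, rather than relying 
on version-dependent defaults.  The following is only a sketch: the query text 
and the chosen values (AND, 100%) are illustrative assumptions, not a 
recommendation from the linked article.

```python
from urllib.parse import urlencode


def edismax_params(user_query, default_operator="AND", min_should_match="100%"):
    """Build request parameters that pin q.op and mm explicitly.

    The default values here are placeholders for illustration only.
    """
    return {
        "defType": "edismax",
        "q": user_query,
        "q.op": default_operator,
        "mm": min_should_match,
        "wt": "json",
    }


# URL-encode the parameters for use in a /select request.
query_string = urlencode(edismax_params("red flannel shirt"))
print(query_string)
```

Running the same set of queries with these parameters against both the old and 
the new version is a simple way to compare result counts before cutting over.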

Regards,
Joe

{ Joe Heasly | L.L.Bean, Inc. | [O] 207 552-2254 [M] 207 756-9250 }

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Friday, January 12, 2018 10:02 AM
To: solr-user@lucene.apache.org
Subject: Re: How different is solr 4.7 from latest version.

On 1/12/2018 5:58 AM, srini sampath wrote:
> I am reading a book (Solr in Action
> <https://www.amazon.in/Solr-Action-Trey-Grainger/dp/1617291021>) to 
> understand how to work with different features in solr. It uses 
> solr 4.7 to explain features. But I don't find any better material (IMHO, 
> documentation has many looped references which makes it too difficult to 
> understand for a newbie).
> 
> Does it cover all the features related to new version (like important
> features) or is it better to follow some other resource?

The latest version is 7.2, and the 7.2.1 release is being finalized now.

That's three major versions newer.  Most of the info in that book will still be 
relevant, but there is quite a bit of new functionality.

Here's the reference guide that is published as official documentation. 
You can download this as a PDF using the "Other Formats" link at the top of the 
page:

https://lucene.apache.org/solr/guide/7_2/

Full disclosure: There isn't very much available for extreme beginners. 
This lack is something the project is aware of, but writing documentation for 
the uninitiated is a difficult task.  The reference guide isn't awful, but it 
could be a lot better.

For differences between versions, there is the CHANGES.txt file included in 
every download.  The reference guide also has a section about big differences 
from the previous major version.

Thanks,
Shawn