Re: solr reads whole index on startup

2018-12-21 Thread lstusr 5u93n4
Hey Kevin, Sure! We were using the default HDFS blockcache settings and -Xmx6g -XX:MaxDirectMemorySize=6g Thanks! Kyle On Thu, 20 Dec 2018 at 13:15, Kevin Risden wrote: > Kyle - Thanks so much for the followup on this. Rarely do we get to > see results compared with detail. > > Can you share

Re: solr reads whole index on startup

2018-12-20 Thread Kevin Risden
Kyle - Thanks so much for the followup on this. Rarely do we get to see results compared with detail. Can you share the Solr HDFS configuration settings that you tested with? Blockcache and direct memory size? I'd be curious just as a reference point. Kevin Risden On Thu, Dec 20, 2018 at 10:31 A

Re: solr reads whole index on startup

2018-12-20 Thread lstusr 5u93n4
Hi All, To close this off, I'm sad to report that we've come to an end with Solr on HDFS. Here's what we finally did: - created two brand-new identical Solr cloud clusters, one on HDFS and one on local disk. - 1 replica per node. Each node 16GB RAM. - Added documents. - Compared start-up times

Re: solr reads whole index on startup

2018-12-10 Thread lstusr 5u93n4
Hi Guys, > What OS is it on? CentOS 7 > With your indexes in HDFS, the HDFS software running > inside Solr also needs heap memory to operate, and is probably going to > set aside part of the heap for caching purposes. We still have the solr.hdfs.blockcache.slab.count parameter set to the defaul

Re: solr reads whole index on startup

2018-12-10 Thread Shawn Heisey
On 12/7/2018 8:54 AM, Erick Erickson wrote: Here's the trap: _Indexing_ doesn't take much memory. The memory is bounded by ramBufferSizeMB, which defaults to 100. This statement is completely true. But it hides one detail: A large amount of indexing will allocate this buffer repeatedly. So

Re: solr reads whole index on startup

2018-12-07 Thread Erick Erickson
bq: The network is not a culprit. We have hbase servers deployed on the same ESXi hosts that access the same target for HDFS storage. These can (and regularly do) push up to 2Gb/s easily. Let's assume you're moving 350G, that's still almost 3 minutes. Times however many replicas need to do a full r
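
A rough check of the transfer-time estimate above, as a minimal sketch. The "almost 3 minutes" figure matches reading "2Gb/s" as 2 gigabytes per second; if the rate is really 2 gigabits per second, the same 350G takes roughly 23 minutes. The function name and the decimal (10^9) units are assumptions made for illustration, not anything stated in the thread.

```python
def transfer_seconds(size_gigabytes: float, rate: float, rate_in_bits: bool) -> float:
    """Seconds needed to move size_gigabytes at the given rate.

    rate is in gigabits/s when rate_in_bits is True, else gigabytes/s.
    Decimal units (1 G = 10**9) are assumed throughout.
    """
    size_bits = size_gigabytes * 8e9
    rate_bits_per_second = rate * 1e9 if rate_in_bits else rate * 8e9
    return size_bits / rate_bits_per_second


# 350G index, "2Gb/s" read both ways
as_gigabytes = transfer_seconds(350, 2, rate_in_bits=False)
as_gigabits = transfer_seconds(350, 2, rate_in_bits=True)

print(f"2 GB/s: {as_gigabytes / 60:.1f} min")  # just under 3 minutes
print(f"2 Gb/s: {as_gigabits / 60:.1f} min")   # roughly 23 minutes
```

Either way, the result is per full copy, so it multiplies by the number of replicas that have to pull the whole index.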

Re: solr reads whole index on startup

2018-12-06 Thread lstusr 5u93n4
Hi Erick, First off: "Whether that experience is accurate or not is certainly debatable." Just want to acknowledge the work you put in on these forums, and how much we DO appreciate you helping us out. I've been in this game long enough to know when listening to the experts is a good thing... W

Re: solr reads whole index on startup

2018-12-06 Thread Erick Erickson
First, your indexing rate _probably_ isn't the culprit if it's as slow as you indicate, although testing will tell. bq. could it be that we're waiting TOO LONG between stopping the solr processes on the different servers? At your query rate this is probably not an issue. One thing you might do is

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
Hey Erick, Some thoughts:
> Solr should _not_ have to replicate the index or go into peer sync on startup.
Okay, that's good to know! Tells us that this is a problem that can be solved.
> are you stopping indexing before you shut your servers down?
By indexing, you mean adding new ent

Re: solr reads whole index on startup

2018-12-05 Thread Erick Erickson
Solr should _not_ have to replicate the index or go into peer sync on startup. > are you stopping indexing before you shut your servers down? > Be very sure you have passed your autocommit interval after you've stopped > indexing and before you stop Solr. > How are you shutting down? bin/solr s

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
I just repeated the procedure, same effect. I'm an hour in and it's still recovering. Looked at the autoscaling API, but it's configured not to do anything, which makes sense given the previous output. One thing I did see, just now: solr | 2018-12-05 20:02:37.922 INFO (qtp213195234

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
Hi Kevin, We do have logs. Grepping for peersync, I can see solr | 2018-12-05 03:31:41.301 INFO (coreZkRegister-1-thread-2-processing-n:solr.node2.metaonly01.eca.local:8983_solr) [c:iglshistory s:shard3 r:core_node12 x:iglshistory_shard3_replica_n10] o.a.s.u.PeerSync PeerSync: core=
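
The "grepping for peersync" step above can be sketched in Python as a case-insensitive filter over log lines. This is only an illustration: the sample lines below are hypothetical stand-ins loosely shaped like the excerpt in the message, and the helper name is an assumption.

```python
import re

# Case-insensitive match, so "PeerSync", "peersync" and "PEERSYNC" all hit.
PEERSYNC = re.compile(r"peersync", re.IGNORECASE)


def peersync_lines(lines):
    """Return the log lines that mention PeerSync, with trailing newlines stripped."""
    return [line.rstrip("\n") for line in lines if PEERSYNC.search(line)]


# Hypothetical sample input, not taken from the real logs.
SAMPLE_LOG = [
    "2018-12-05 03:31:40.000 INFO o.a.s.c.ZkController Registering core\n",
    "2018-12-05 03:31:41.301 INFO o.a.s.u.PeerSync PeerSync: core=example_core\n",
    "2018-12-05 03:31:42.000 INFO o.a.s.c.RecoveryStrategy Starting recovery\n",
]

for line in peersync_lines(SAMPLE_LOG):
    print(line)
```

For a real log, passing an open file object instead of the list works the same way, since the function only iterates over lines.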

Re: solr reads whole index on startup

2018-12-05 Thread Kevin Risden
Do you have logs right before the following? "we notice that the nodes go into "Recovering" state for about 10-12 hours before finally coming alive." Is there a peersync failure or something else in the logs indicating why there is a full recovery? Kevin Risden On Wed, Dec 5, 2018 at 12:53 PM