Re: SOLR not starting after restart 2 node cloud setup

Erick Erickson Tue, 02 Dec 2014 06:57:11 -0800

Glad you found a solution!

Best,
Erick


On Tue, Dec 2, 2014 at 4:30 AM, Doss <itsmed...@gmail.com> wrote:
> Dear Erick,
>
> Thanks for your thoughts, they helped me a lot. On my instances, no Solr logs
> are appended to catalina.out.
>
> I have now put a log4j.properties file in place, so Solr logs are captured in
> solr.log, and with its help I found the reason for the issue.
>
> I was starting Tomcat with the option -Dbootstrap_conf=true, which made Solr
> look for core configuration files in the wrong directory. After removing
> it, Solr started without any issues.
>
> I also commented out the suggester component, which made Solr load faster.
>
> Thanks,
> Doss.
>
>
>
>
> On Thu, Nov 20, 2014 at 9:47 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Doss:
>>
>> Tomcat often puts things in "catalina.out", you might check there,
>> I've often seen logging information from Solr go there by
>> default.
>>
>> Without having some idea what kinds of problems Solr is
>> reporting when you see this situation, it's really hard to say.
>>
>> Some things I'd check first though, in order of what
>> I _guess_ is most likely.
>>
>> > There have been anecdotal reports (in fact, I'm trying
>> to understand the why of it right now) of the suggester
>> taking a long time to initialize, even if you don't use it!
>> So if you're not using the suggest component, try
>> commenting out those sections in solrconfig.xml for
>> the cores in question. I like this explanation since it
>> fits with your symptoms, but I don't like it since the
>> index you are using isn't all that big. So it's something
>> of a shot in the dark. I expect that the core will
>> _eventually_ come up, but I've seen reports of 10-15
>> minutes being required, far beyond my patience! That
>> said, this would also explain why deleting the index
>> works.
>>
>> > OutOfMemory errors. You might be able to attach
>> jConsole (part of the standard Java stuff) to the process
>> and monitor the memory usage. If it's being pushed near
>> the 5G limit that's the first thing I'd suspect.
>>
>> > If you're using the default setups, then the Zookeeper
>> timeout may be too low. I think the default (not sure
>> whether it's been changed in 4.9) is 15 seconds; 30-60
>> seconds is usually much better.
>>
>> Best,
>> Erick
>>
>>
>> On Thu, Nov 20, 2014 at 3:47 AM, Doss <itsmed...@gmail.com> wrote:
>> > Dear Erick,
>> >
>> > Forgive my ignorance.
>> >
>> > Please find some of the details you required.
>> >
>> > *have you looked at the solr logs?*
>> >
>> >  > Sorry, I haven't defined the log4j.properties file, so I don't have Solr
>> > logs. Since it requires a Tomcat restart, I am planning to do it at the
>> > next restart.
>> >
>> > But I found the following in the Tomcat log:
>> >
>> > 18-Nov-2014 11:27:29.028 WARNING [localhost-startStop-2]
>> > org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web
>> > application [/mima] appears to have started a thread named
>> > [localhost-startStop-1-SendThread(10.236.149.28:2181)] but has failed to
>> > stop it. This is very likely to create a memory leak. Stack trace of thread:
>> >  sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>> >  sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>> >  sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>> >  sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>> >  sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>> >  org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
>> >  org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>> >
>> >
>> > *How big are the cores?*
>> >
>> > > We have 16 cores, of which only 5 are big. The total size of all 16
>> > cores is 10+ GB.
>> >
>> > *How many docs in the cores when the problem happens?*
>> >
>> > 1 core with 163 fields and 33,00,000 documents (Index size 2+ GB)
>> >  4 cores with 3 fields and 150,00,000 (approx) documents (1.2 to 1.5 GB)
>> > remaining cores are 1,00,000 to 40,00,000 documents
>> >
>> > *How much memory are you allocating the JVM? *
>> >
>> > 5GB for JVM, Total RAM available in the systems is 30 GB
>> >
>> > *can you restart Tomcat without a problem?*
>> >
>> > This problem is occurring in production, I never tried.
>> >
>> >
>> > Thanks,
>> > Doss.
>> >
>> >
>> > On Wed, Nov 19, 2014 at 7:55 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> >> You've really got to provide details for us to say much
>> >> of anything. There are about a zillion things that it could be.
>> >>
>> >> In particular, have you looked at the solr logs? Are there
>> >> any interesting things in them? How big are the cores?
>> >> How much memory are you allocating the JVM? How
>> >> many docs in the cores when the problem happens?
>> >> Before the nodes stop responding, can you restart
>> >> Tomcat without a problem?
>> >>
>> >> You might review:
>> >> http://wiki.apache.org/solr/UsingMailingLists
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >>
>> >> On Wed, Nov 19, 2014 at 1:04 AM, Doss <itsmed...@gmail.com> wrote:
>> >> > I have a two-node Solr (4.9.0) cloud with Tomcat (8) and Zookeeper. At
>> >> > times Solr on Node 1 stops responding. To fix the issue I restart Tomcat
>> >> > on Node 1, but Solr does not start up. If I remove the Solr cores on both
>> >> > nodes and try restarting, it starts working, and then I have to reindex
>> >> > the whole data again. We are using this setup in production, and because
>> >> > of this issue we are having 1 to 1.30 hours of service downtime. Any
>> >> > suggestions would be greatly appreciated.
>> >> >
>> >> > Thanks,
>> >> > Doss.
>> >>
>>
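
The fix in this thread hinged on getting Solr to write its own log file. As a rough illustration, a minimal log4j.properties of the kind Doss describes might look like the sketch below. It assumes Solr 4.x logging through log4j 1.2; the appender name "file", the logs/solr.log path, and the rotation limits are illustrative assumptions, not values taken from the thread.

```properties
# Sketch only: log4j 1.2 configuration for Solr 4.x running under Tomcat.
# The file path and rotation limits are assumptions, not values from the thread.
log4j.rootLogger=INFO, file

# Rolling file appender so solr.log does not grow without bound
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=logs/solr.log
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9

# One line per event: level, timestamp, logger category, message
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %c; %m%n
```

The file needs to be on the classpath seen by the Solr webapp, or be pointed to with the -Dlog4j.configuration system property. A relative File path is resolved against the JVM's working directory, so an absolute path is usually the safer choice under Tomcat.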
