On 7/27/2017 1:30 AM, Atita Arora wrote:
> What OS is Solr running on?  I'm only asking because some additional
> information I'm after has different gathering methods depending on OS.
> Other questions:
>
> /*OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE
> (1.8.0_141-b16), built on Jul 20 2017 21:47:59 by "mockbuild" with gcc
> 4.4.7 20120313 (Red Hat 4.4.7-18)*/
> /*Memory: 4k page, physical 264477520k(92198808k free), swap 0k(0k free)*/

Linux is the easiest to get good information from.  Run the "top"
program in a command-line session.  Press shift-M to sort by memory
usage, and grab a screenshot.  Upload that screenshot to a file sharing
site and give us the URL.

> Is there only one Solr process per machine, or more than one?
> /*On an average yes , one solr process per machine , however , we do
> have a machine (where this log is taken) has two solr processes
> (master and slave)*/

Running a master and a slave on one machine does nothing for
redundancy.  They need to be on separate machines for that to really
help.  As for multiple processes per machine, you can have many indexes
in one Solr instance -- you don't need more than one in most cases.

> How many total documents are managed by one machine?
> */About 220945 per machine ( and double for this machine as it has
> instance of master as well as other slave)/*
>
> How big is all the index data managed by one machine?
> */The index is about 4G./*

If less than a quarter of a million documents results in a 4GB index,
those documents must be ENORMOUS, or else there is something strange
going on.
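
To put a rough number on that, here is a minimal Python sketch of the
arithmetic.  It assumes "4G" means 4 GiB and that it covers the roughly
220945 documents of a single index; both are my reading of the figures
quoted above, and the function name is just illustrative.

```python
# Rough average index size per document, using the figures quoted above.
# Assumption: "4G" means 4 GiB and covers the ~220945 documents of one index.
INDEX_BYTES = 4 * 1024**3
DOC_COUNT = 220945


def avg_bytes_per_doc(index_bytes: int, doc_count: int) -> float:
    """Return the mean on-disk index size per document, in bytes."""
    if doc_count <= 0:
        raise ValueError("doc_count must be positive")
    return index_bytes / doc_count


avg = avg_bytes_per_doc(INDEX_BYTES, DOC_COUNT)
print(f"{avg / 1024:.1f} KiB of index per document")
```

Under those assumptions it works out to somewhere around 19 KiB of
index per document.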

> What is the max heap on each Solr process?
> */Max heap is 25G for each Solr Process. (Xms 25g Xmx 25g)/*
> */
> /*
> The reason of choosing RAMDirectory was that it was used in the
> similar manner while the production Solr was on Version 4.3.2, so no
> particular reason but just replicated how it was working , never
> thought this may give troubles.

Set up the slaves just like the masters, with
NRTCachingDirectoryFactory.  For a couple hundred thousand docs, you
probably only need a 2GB heap, possibly even less.

> I had included a pastebin of GC snapshot (the complete log was too big
> to be included in the pastebin , so pasted a sampler)

I asked for the full log because that's what I need to look deeper.  A
sampler won't be enough.  There are file sharing websites for sharing
larger content, and if you compress the file before uploading it, you
should be able to achieve a fairly impressive compression ratio. 
Dropbox is generally a good choice for sharing fairly large content. 
Dropbox also works for image data, like the "top" screenshot I asked for
above.
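
If you don't have a compression tool handy, a minimal sketch like the
following, using only the Python standard library, should do the job.
The function name and the example file name are placeholders of mine,
not anything Solr defines; point it at wherever your JVM actually
writes the GC log.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(src: str) -> Path:
    """Write a gzip-compressed copy of src next to it and return the new path."""
    src_path = Path(src)
    dst_path = src_path.with_name(src_path.name + ".gz")
    # Stream the copy so a very large log never has to fit in memory.
    with open(src_path, "rb") as f_in, gzip.open(dst_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst_path


# Example (hypothetical file name):
#   compress_log("solr_gc.log")  # writes solr_gc.log.gz alongside the original
```

The original file is left untouched, so you can delete the .gz copy
after uploading it.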

> Another thing is as we observed the CPU cycles yesterday in high load
> condition we observed that the Highlighter component was taking
> longest , is there anything in particular we forgot to include that
> highlighting doesn't gives a performance hit .
> Attached is the snapshot taken from jvisualvm.

Attachments rarely make it through the mailing list.  Yours didn't, so I
cannot see that snapshot.

I do not know anything about highlighting, so I cannot comment on how
much CPU it takes.  I've never used the feature.

My best guess about why your CPU usage is so high is problems with
garbage collection.  To look into that, I need the full GC log.  The
rest of the information I've asked for will help focus my efforts.

Thanks,
Shawn
