Hi Shawn,
Yes, I am running Solr in cloud mode. Even after adding the params rows=0
and distrib=false, the query response takes more than 15 seconds because
the index holds more than a billion documents.
Also, the soft commit interval cannot be raised because of a requirement
from the business team.

http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false
always takes more than 10 seconds.
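
To separate the time Solr spends on the query from client and network
overhead, the wall-clock time can be compared with the QTime value that
Solr reports in the response header. A minimal sketch, assuming Python 3
on the client; "hostname" and the "parts" collection come from the URL
above, while the helper names (build_url, extract_qtime, timed_query) are
made up for illustration:

```python
import json
import time
import urllib.parse
import urllib.request


def build_url(host, collection, port=8983):
    """Build the same rows=0, distrib=false match-all query as above."""
    params = urllib.parse.urlencode(
        {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    )
    return f"http://{host}:{port}/solr/{collection}/select?{params}"


def extract_qtime(body):
    """Return QTime (milliseconds) from a Solr JSON response body."""
    return json.loads(body)["responseHeader"]["QTime"]


def timed_query(url, timeout=60):
    """Send one request and return (wall_clock_ms, qtime_ms)."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    wall_ms = (time.monotonic() - start) * 1000.0
    return wall_ms, extract_qtime(body)


if __name__ == "__main__":
    # "hostname" is the placeholder from the URL above; replace it.
    url = build_url("hostname", "parts")
    for _ in range(5):
        wall_ms, qtime_ms = timed_query(url)
        print(f"wall={wall_ms:.0f} ms  QTime={qtime_ms} ms")
```

If QTime is much smaller than the wall-clock time, the delay is likely
outside query execution; if both are large, the time is being spent
inside Solr's request handling.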

Here are the Java heap and G1GC settings I have:

/usr/java/default/bin/java -server -Xmx31g -Xms31g -XX:+UseG1GC
-XX:MaxGCPauseMillis=250 -XX:ConcGCThreads=5
-XX:ParallelGCThreads=10 -XX:+UseLargePages -XX:+AggressiveOpts
-XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
-XX:InitiatingHeapOccupancyPercent=50 -XX:G1ReservePercent=18
-XX:MaxNewSize=6G -XX:PrintFLSStatistics=1
-XX:+PrintPromotionFailure -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/solr7/logs/heapdump
-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime

JVM heap usage has never crossed 20 GB in my setup, and young-generation
G1GC pauses are well within milliseconds (in the range of 25-200 ms).
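
Because -XX:+PrintGCApplicationStoppedTime is enabled, the GC log also
records every stop-the-world pause, not only young collections, so it can
be scanned for pauses long enough to explain a 10-15 second response. A
rough sketch, assuming the Java 8 log line format ("Total time for which
application threads were stopped: N seconds"); the log file path is not
given in this thread and is passed on the command line, and the function
names are hypothetical:

```python
import re

# Line written by -XX:+PrintGCApplicationStoppedTime on Java 8 (assumed format).
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+\.[0-9]+) seconds"
)


def stopped_pauses(lines):
    """Yield each stop-the-world pause, in seconds, found in GC log lines."""
    for line in lines:
        match = STOPPED_RE.search(line)
        if match:
            yield float(match.group(1))


def summarize(lines, threshold_s=1.0):
    """Return (pause count, longest pause, pauses longer than threshold_s)."""
    pauses = list(stopped_pauses(lines))
    longest = max(pauses, default=0.0)
    over = sum(1 for p in pauses if p > threshold_s)
    return len(pauses), longest, over


if __name__ == "__main__":
    import sys

    # Usage: python gc_pauses.py <path to GC log>
    with open(sys.argv[1], errors="replace") as log:
        count, longest, over = summarize(log)
    print(f"{count} pauses, longest {longest:.3f} s, {over} above 1 s")
```

If no pause comes close to the observed response time, GC is less likely
to be the cause of the slow query.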

On Mon, Aug 5, 2019 at 6:37 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 8/4/2019 10:15 PM, dinesh naik wrote:
> > My question is regarding the custom query being used. Here I am querying
> > for field _root_, which is available in all of my cluster and defined as a
> > string field. The query _root_:abc might not return any match at all
> > (I am OK with not finding any matches, but the query should not be taking
> > 10-15 seconds to return a response).
>
> Typically the *:* query is the fastest option.  It is special syntax
> that means "all documents" and it usually executes very quickly.  It
> will be faster than querying for a value in a specific field, which is
> what you have defined currently.
>
> I will typically add a "rows" parameter to the ping handler with a value
> of 1, so Solr will not be retrieving a large amount of data.  If you are
> running Solr in cloud mode, you should experiment with setting the
> distrib parameter to false, which will hopefully limit the query to the
> receiving node only.
>
> Erick has already mentioned GC pauses as a potential problem.  With a
> 10-15 second response time, I think that has high potential to be the
> underlying cause.
>
> The response you included at the beginning of the thread indicates there
> are 1.3 billion documents, which is going to require a fair amount of
> heap memory.  If seeing such long ping times with a *:* query is
> something that happens frequently, your heap may be too small, which
> will cause frequent full garbage collections.
>
> The very low autoSoftCommit time can contribute to system load.  I think
> it's very likely, especially with such a large index, that in many cases
> those automatic commits are taking far longer than 5 seconds to
> complete.  If that's the case, you're not achieving a 5 second
> visibility interval and you are putting a lot of load on Solr, so I
> would consider increasing it.
>
> Thanks,
> Shawn
>


-- 
Best Regards,
Dinesh Naik
