Hi Jan,

Thanks for your reply.
However, we are still getting a slow QTime of 517ms even after we set hl=false&fl=null. Below is the debug query:

"debug":{
  "rawquerystring":"cherry",
  "querystring":"cherry",
  "parsedquery":"searchFields_tcs:cherry",
  "parsedquery_toString":"searchFields_tcs:cherry",
  "explain":{
    "46226513":"\n14.227914 = weight(searchFields_tcs:cherry in 5747763) [SchemaSimilarity], result of:\n 14.227914 = score(doc=5747763,freq=3.0 = termFreq=3.0\n), product of:\n 9.614556 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 400.0 = docFreq\n 6000000.0 = docCount\n 1.4798305 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n 3.0 = termFreq=3.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 19.397041 = avgFieldLength\n 25.0 = fieldLength\n",
    "54088731":"\n13.937909 = weight(searchFields_tcs:cherry in 4840794) [SchemaSimilarity], result of:\n 13.937909 = score(doc=4840794,freq=3.0 = termFreq=3.0\n), product of:\n 9.614556 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 400.0 = docFreq\n 6000000.0 = docCount\n 1.4496675 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n 3.0 = termFreq=3.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 19.397041 = avgFieldLength\n 27.0 = fieldLength\n"},
  "QParser":"LuceneQParser",
  "timing":{
    "time":517.0,
    "prepare":{
      "time":0.0,
      "query":{"time":0.0}, "facet":{"time":0.0}, "facet_module":{"time":0.0},
      "mlt":{"time":0.0}, "highlight":{"time":0.0}, "stats":{"time":0.0},
      "expand":{"time":0.0}, "terms":{"time":0.0}, "debug":{"time":0.0}},
    "process":{
      "time":516.0,
      "query":{"time":15.0}, "facet":{"time":0.0}, "facet_module":{"time":0.0},
      "mlt":{"time":0.0}, "highlight":{"time":0.0}, "stats":{"time":0.0},
      "expand":{"time":0.0}, "terms":{"time":0.0}, "debug":{"time":500.0}}}}}

Regards,
Edwin

On Thu, 24 Jan 2019 at 22:43, Jan Høydahl <jan....@cominvent.com> wrote:

> Looks like highlighting takes most of the time on the first query (680ms).
> Your config seems to ask for a lot of highlighting here, like 100 snippets
> of max 100000 characters etc.
> Sounds to me like this might be a highlighting configuration problem. Try
> to disable highlighting (hl=false) and see if you get back your speed.
> Also, I see fl=* in your config, which is really asking for all fields.
> Are you sure you want that? That may also be slow. Try to ask for just the
> fields you will be using.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 24. jan. 2019 kl. 14:59 skrev Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> >
> > Thanks for your reply.
> >
> > Below is what you requested about our Solr setup: configuration files,
> > schema and the results of debug queries.
> >
> > Looking forward to your advice and support on our problem.
> >
> > 1. System configuration
> > OS: Windows 10 Pro 64 bit
> > System Memory: 32GB
> > CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz, 4 Core(s), 8 Logical
> > Processor(s)
> > HDD: 3.0 TB (free 2.1 TB) SATA
> >
> > 2. solrconfig.xml of the customers and policies collections, and solr.in.cmd,
> > which can be downloaded from the following link:
> > https://drive.google.com/file/d/1AATjonQsEC5B0ldz27Xvx5A55Dp5ul8K/view?usp=sharing
> >
> > 3. The debug queries from both collections
> >
> > 3.1. Debug query from Policies (which is slow):
> >
> > "debug":{
> >   "rawquerystring":"sherry",
> >   "querystring":"sherry",
> >   "parsedquery":"searchFields_tcs:sherry",
> >   "parsedquery_toString":"searchFields_tcs:sherry",
> >   "explain":{
> >     "31702988":"\n14.540428 = weight(searchFields_tcs:sherry in 3097315) [SchemaSimilarity], result of:\n 14.540428 = score(doc=3097315,freq=5.0 = termFreq=5.0\n), product of:\n 8.907154 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 812.0 = docFreq\n 6000000.0 = docCount\n 1.6324438 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n 5.0 = termFreq=5.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 19.397041 = avgFieldLength\n 31.0 = fieldLength\n", ...
> >   "QParser":"LuceneQParser",
> >   "timing":{
> >     "time":681.0,
> >     "prepare":{
> >       "time":0.0,
> >       "query":{"time":0.0}, "facet":{"time":0.0}, "facet_module":{"time":0.0},
> >       "mlt":{"time":0.0}, "highlight":{"time":0.0}, "stats":{"time":0.0},
> >       "expand":{"time":0.0}, "terms":{"time":0.0}, "debug":{"time":0.0}},
> >     "process":{
> >       "time":680.0,
> >       "query":{"time":19.0}, "facet":{"time":0.0}, "facet_module":{"time":0.0},
> >       "mlt":{"time":0.0}, "highlight":{"time":651.0}, "stats":{"time":0.0},
> >       "expand":{"time":0.0}, "terms":{"time":0.0}, "debug":{"time":8.0}},
> >     "loadFieldValues":{"time":12.0}}}}
> >
> > 3.2. Debug query from Customers (which is fast because we indexed it after indexing Policies):
> >
> > "debug":{
> >   "rawquerystring":"sherry",
> >   "querystring":"sherry",
> >   "parsedquery":"searchFields_tcs:sherry",
> >   "parsedquery_toString":"searchFields_tcs:sherry",
> >   "explain":{
> >     "S7900271B":"\n13.191501 = weight(searchFields_tcs:sherry in 2453665) [SchemaSimilarity], result of:\n 13.191501 = score(doc=2453665,freq=3.0 = termFreq=3.0\n), product of:\n 9.08604 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 428.0 = docFreq\n 3784142.0 = docCount\n 1.4518428 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n 3.0 = termFreq=3.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 20.22558 = avgFieldLength\n 28.0 = fieldLength\n", ...
> > > > "QParser":"LuceneQParser", > > > > "timing":{ > > > > "time":38.0, > > > > "prepare":{ > > > > "time":1.0, > > > > "query":{ > > > > "time":1.0}, > > > > "facet":{ > > > > "time":0.0}, > > > > "facet_module":{ > > > > "time":0.0}, > > > > "mlt":{ > > > > "time":0.0}, > > > > "highlight":{ > > > > "time":0.0}, > > > > "stats":{ > > > > "time":0.0}, > > > > "expand":{ > > > > "time":0.0}, > > > > "terms":{ > > > > "time":0.0}, > > > > "debug":{ > > > > "time":0.0}}, > > > > "process":{ > > > > "time":36.0, > > > > "query":{ > > > > "time":1.0}, > > > > "facet":{ > > > > "time":0.0}, > > > > "facet_module":{ > > > > "time":0.0}, > > > > "mlt":{ > > > > "time":0.0}, > > > > "highlight":{ > > > > "time":31.0}, > > > > "stats":{ > > > > "time":0.0}, > > > > "expand":{ > > > > "time":0.0}, > > > > "terms":{ > > > > "time":0.0}, > > > > "debug":{ > > > > "time":3.0}}, > > > > "loadFieldValues":{ > > > > "time":13.0}}}} > > > > > > > > Best Regards, > > Edwin > > > > On Thu, 24 Jan 2019 at 20:57, Jan Høydahl <jan....@cominvent.com> wrote: > > > >> It would be useful if you can disclose the machine configuration, OS, > >> memory, settings etc, as well as solr config including solr.in < > >> http://solr.in/>.sh, solrconfig.xml etc, so we can see the whole > picture > >> of memory, GC, etc. > >> You could also specify debugQuery=true on a slow search and check the > >> timings section for clues. What QTime are you seeing on the slow > queries in > >> solr.log? > >> If that does not reveal the reason, I'd connect to your solr instance > with > >> a tool like jVisualVM or similar, to inspect what takes time. Or better, > >> hook up to DataDog, SPM or some other cloud tool to get a full view of > the > >> system. > >> > >> -- > >> Jan Høydahl, search solution architect > >> Cominvent AS - www.cominvent.com > >> > >>> 24. jan. 2019 kl. 13:42 skrev Zheng Lin Edwin Yeo < > edwinye...@gmail.com > >>> : > >>> > >>> Hi Shawn, > >>> > >>> Unfortunately your reply of memory may not be valid. Please refer to my > >>> explanation below of the strange behaviors (is it much more like a BUG > >> than > >>> anything else that is explainable): > >>> > >>> Note that we still have 18GB of free unused memory on the server. > >>> > >>> 1. We indexed the first collection called customers (3.7 millioin > records > >>> from CSV data), index size is 2.09GB. The search in customers for any > >>> keyword is returned within 50ms (QTime) for using highlight (unified > >>> highlighter, posting, light term vectors) > >>> > >>> 2. Then we indexed the second collection called policies (6 million > >> records > >>> from CSV data), index size is 2.55GB. The search in policies for any > >>> keyword is returned within 50ms (QTime) for using highlight (unified > >>> highlighter, posting, light term vectors) > >>> > >>> 3. But now any search in customers for any keywords (not from cache) > >> takes > >>> as high as 1200ms (QTime). But still policies search remains very fast > >>> (50ms). > >>> > >>> 4. So we decided to run the force optimize command on customers > >> collection ( > >>> > >> > https://localhost:8983/edm/customers/update?optimize=true&numSegments=1&waitFlush=false > >> ), > >>> surprisingly after optimization the search on customers collection for > >> any > >>> keywords become very fast again (less than 50ms). BUT strangely, the > >> search > >>> in policies collection become very slow (around 1200ms) without any > >> changes > >>> to the policies collection. > >>> > >>> 5. 
> >>> 5. Based on the above result, we decided to run the force optimize command on
> >>> the policies collection
> >>> (https://localhost:8983/edm/policies/update?optimize=true&numSegments=1&waitFlush=false).
> >>> More surprisingly, after optimization the search on the policies collection for
> >>> any keyword became very fast again (less than 50ms). BUT more strangely,
> >>> the search in the customers collection again became very slow (around 1200ms)
> >>> without any changes to the customers collection.
> >>>
> >>> What strange and unexpected behaviour! If this is not a bug, how could you
> >>> explain the above very strange behaviour in Solr 7.5? Could it be a bug?
> >>>
> >>> We would appreciate any support or help with our above situation.
> >>>
> >>> Thank you.
> >>>
> >>> Regards,
> >>> Edwin
> >>>
> >>> On Thu, 24 Jan 2019 at 16:14, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi Shawn,
> >>>>
> >>>>> If the two collections have data on the same server(s), I can see this
> >>>>> happening. More memory is consumed when there is additional data, and
> >>>>> when Solr needs more memory, performance might be affected. The
> >>>>> solution is generally to install more memory in the server.
> >>>>
> >>>> I have found that even after we delete the index in collection2, the query
> >>>> QTime for collection1 still remains slow. It does not go back to its
> >>>> previous fast speed from before we indexed collection2.
> >>>>
> >>>> Regards,
> >>>> Edwin
> >>>>
> >>>> On Thu, 24 Jan 2019 at 11:13, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Shawn,
> >>>>>
> >>>>> Thanks for your reply.
> >>>>>
> >>>>> The log only shows a list like the following, and I don't see any other
> >>>>> logs besides these:
> >>>>>
> >>>>> 2019-01-24 02:47:57.925 INFO (qtp2131952342-1330) [c:collectioin1
> >>>>> s:shard1 r:core_node4 x:collection1_shard1_replica_n2]
> >>>>> o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
> >>>>> id=13245417
> >>>>> 2019-01-24 02:47:57.957 INFO (qtp2131952342-1330) [c:collectioin1
> >>>>> s:shard1 r:core_node4 x:collection1_shard1_replica_n2]
> >>>>> o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
> >>>>> id=13245430
> >>>>> 2019-01-24 02:47:57.957 INFO (qtp2131952342-1330) [c:collectioin1
> >>>>> s:shard1 r:core_node4 x:collection1_shard1_replica_n2]
> >>>>> o.a.s.u.p.StatelessScriptUpdateProcessorFactory update-script#processAdd:
> >>>>> id=13245435
> >>>>>
> >>>>> There is no change to the segments info, but the slowdown in the first
> >>>>> collection is very drastic.
> >>>>> Before the indexing of collection2, the collection1 query QTime is in
> >>>>> the range of 4 to 50 ms. However, after indexing collection2, the
> >>>>> collection1 query QTime increases to more than 1000 ms. The indexing is
> >>>>> done from CSV data, and the size of the index is 3GB.
> >>>>>
> >>>>> Regards,
> >>>>> Edwin
> >>>>>
> >>>>> On Thu, 24 Jan 2019 at 01:09, Shawn Heisey <apa...@elyograg.org> wrote:
> >>>>>
> >>>>>> On 1/23/2019 10:01 AM, Zheng Lin Edwin Yeo wrote:
> >>>>>>> I am using Solr 7.5.0, and currently I am facing an issue where, when I am
> >>>>>>> indexing in collection2, the indexing affects the records in collection1.
> >>>>>>> Although the records are still intact, it seems that the settings of the
> >>>>>>> termVectors get wiped out, and the index size of collection1 reduced from
> >>>>>>> 3.3GB to 2.1GB after I do the indexing in collection2.
> >>>>>>
> >>>>>> This should not be possible. Indexing in one collection should have
> >>>>>> absolutely no effect on another collection.
> >>>>>>
> >>>>>> If logging has been left at its default settings, the solr.log file
> >>>>>> should have enough info to show what actually happened.
> >>>>>>
> >>>>>>> Also, the search in
> >>>>>>> collection1, which was originally very fast, becomes very slow after the
> >>>>>>> indexing is done in collection2.
> >>>>>>
> >>>>>> If the two collections have data on the same server(s), I can see this
> >>>>>> happening. More memory is consumed when there is additional data, and
> >>>>>> when Solr needs more memory, performance might be affected. The
> >>>>>> solution is generally to install more memory in the server. If the
> >>>>>> system is working, there should be no need to increase the heap size
> >>>>>> when the memory size increases ... but there can be situations where the
> >>>>>> heap is a little bit too small, where you WOULD want to increase the
> >>>>>> heap size.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Shawn
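
For reference, a request of the kind Jan suggests above (highlighting switched off and only the needed fields returned) could look roughly like the sketch below. The /edm context path, the policies collection and the searchFields_tcs field are taken from this thread; the /select handler, the id field and the example term are only assumptions about the setup, not a verified configuration:

https://localhost:8983/edm/policies/select?q=searchFields_tcs:sherry&hl=false&fl=id,score&rows=10

If a request like this comes back quickly while the same request with hl=true is slow, the time is going into the highlight component rather than the query itself, which is what the 651ms highlight entry in the Policies timing above already suggests.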
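Jan's earlier suggestion of debugQuery=true can likewise be checked by running the same query with and without the debug parameter and comparing QTime, for example (again only a sketch with the same assumed paths and names):

https://localhost:8983/edm/customers/select?q=searchFields_tcs:sherry&rows=10
https://localhost:8983/edm/customers/select?q=searchFields_tcs:sherry&rows=10&debugQuery=true

The timing section of the debug-enabled response breaks QTime down per component (prepare, query, highlight, debug, etc.), which is where the numbers quoted in this thread come from; the time spent producing the explain output itself shows up in its own debug entry.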