Below is the debug output. I am measuring pure Solr QTime, as shown in the QTime field of the Solr XML response.

<arr name="parsed_filter_queries">
  <str>mrank:[0 TO 100]</str>
</arr>
<lst name="timing">
  <double name="time">8584.0</double>
  <lst name="prepare">
    <double name="time">12.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">12.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.FacetComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.StatsComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.SpellCheckComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
      <double name="time">0.0</double>
    </lst>
  </lst>
  <lst name="process">
    <double name="time">8572.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">4480.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.FacetComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
      <double name="time">41.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.StatsComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.SpellCheckComponent">
      <double name="time">0.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
      <double name="time">4051.0</double>
    </lst>

On Tue, Aug 30, 2011 at 5:38 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Can we see the output if you specify both
> &debugQuery=on&debug=true
>
> the debug=true will show the time taken up with various
> components, which is sometimes surprising...
>
> Second, we never asked the most basic question, what are
> you measuring? Is this the QTime of the returned response
> (which is the time actually spent searching) or the time until
> the response gets back to the client, which may involve lots besides
> searching...
>
> Best
> Erick
>
> On Tue, Aug 30, 2011 at 7:59 AM, Lord Khan Han <khanuniver...@gmail.com>
> wrote:
> > Hi Eric,
> >
> > Fields are lazy loading, content is stored in Solr, and the machine has 32 gig. Solr
> > has a 20 gig heap. There is no swapping.
> >
> > As you see, we have many phrases in the same query. I couldn't find a way to
> > drop QTime to sub-second. Surprisingly, the non-shingled test has better QTime!
> >
> > On Mon, Aug 29, 2011 at 3:10 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Oh, one other thing: have you profiled your machine
> >> to see if you're swapping? How much memory are
> >> you giving your JVM? What is the underlying
> >> hardware setup?
> >>
> >> Best
> >> Erick
> >>
> >> On Mon, Aug 29, 2011 at 8:09 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >> > 200K docs and 36G index? It sounds like you're storing
> >> > your documents in the Solr index. In and of itself, that
> >> > shouldn't hurt your query times, *unless* you have
> >> > lazy field loading turned off. Have you checked that
> >> > lazy field loading is enabled?
> >> >
> >> > Best
> >> > Erick
> >> >
> >> > On Sun, Aug 28, 2011 at 5:30 AM, Lord Khan Han <khanuniver...@gmail.com>
> >> > wrote:
> >> >> Another interesting thing: all one-word or multi-word queries, including
> >> >> phrase queries such as "barack obama", are slower in the shingle configuration. What
> >> >> am I doing wrong? Without shingles "barack obama" has a query time of 300 ms; with
> >> >> shingles, 780 ms.
> >> >>
> >> >> On Sat, Aug 27, 2011 at 7:58 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> What is the difference between Solr 3.3 and the trunk?
> >> >>> I will try 3.3 and let you know the results.
> >> >>>
> >> >>> Here is the search handler:
> >> >>>
> >> >>> <requestHandler name="search" class="solr.SearchHandler" default="true">
> >> >>> <lst name="defaults">
> >> >>> <str name="echoParams">explicit</str>
> >> >>> <int name="rows">10</int>
> >> >>> <!--<str name="fq">category:vv</str>-->
> >> >>> <str name="fq">mrank:[0 TO 100]</str>
> >> >>> <str name="echoParams">explicit</str>
> >> >>> <int name="rows">10</int>
> >> >>> <str name="defType">edismax</str>
> >> >>> <!--<str name="qf">title^0.05 url^1.2 content^1.7 m_title^10.0</str>-->
> >> >>> <str name="qf">title^1.05 url^1.2 content^1.7 m_title^10.0</str>
> >> >>> <!-- <str name="bf">recip(ee_score,-0.85,1,0.2)</str> -->
> >> >>> <str name="pf">content^18.0 m_title^5.0</str>
> >> >>> <int name="ps">1</int>
> >> >>> <int name="qs">0</int>
> >> >>> <str name="mm">2<-25%</str>
> >> >>> <str name="spellcheck">true</str>
> >> >>> <!--<str name="spellcheck.collate">true</str> -->
> >> >>> <str name="spellcheck.count">5</str>
> >> >>> <str name="spellcheck.dictionary">subobjective</str>
> >> >>> <str name="spellcheck.onlyMorePopular">false</str>
> >> >>> <str name="hl.tag.pre"><b></str>
> >> >>> <str name="hl.tag.post"></b></str>
> >> >>> <str name="hl.useFastVectorHighlighter">true</str>
> >> >>> </lst>
> >> >>>
> >> >>> On Sat, Aug 27, 2011 at 5:31 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >>>
> >> >>>> I'm not sure what the issue could be at this point. I see you've got
> >> >>>> qt=search - what's the definition of that request handler?
> >> >>>>
> >> >>>> What is the parsed query (from the debugQuery response)?
> >> >>>>
> >> >>>> Have you tried this with Solr 3.3 to see if there's any appreciable
> >> >>>> difference?
> >> >>>>
> >> >>>> Erik
> >> >>>>
> >> >>>> On Aug 27, 2011, at 09:34 , Lord Khan Han wrote:
> >> >>>>
> >> >>>> > With grouping off, the query time drops, i.e. from 3567 ms to 1912 ms. Grouping
> >> >>>> > increases the query time and makes the cache useless. But the same config is
> >> >>>> > still faster without shingles.
> >> >>>> >
> >> >>>> > We have a head-to-head test this Wednesday with this commercial search
> >> >>>> > engine. So I am looking for all suggestions.
> >> >>>> >
> >> >>>> > On Sat, Aug 27, 2011 at 3:37 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >>>> >
> >> >>>> >> Please confirm if this is caused by grouping. Turn grouping off,
> >> >>>> >> what's query time like?
> >> >>>> >>
> >> >>>> >> On Aug 27, 2011, at 07:27 , Lord Khan Han wrote:
> >> >>>> >>
> >> >>>> >>> On the other hand, we couldn't use the cache for the types of queries below. I
> >> >>>> >>> think it's caused by grouping. Anyway, we need to be sub-second without
> >> >>>> >>> cache.
> >> >>>> >>>
> >> >>>> >>> On Sat, Aug 27, 2011 at 2:18 PM, Lord Khan Han <khanuniver...@gmail.com>
> >> >>>> >>> wrote:
> >> >>>> >>>
> >> >>>> >>>> Hi,
> >> >>>> >>>>
> >> >>>> >>>> Thanks for the reply.
> >> >>>> >>>>
> >> >>>> >>>> Here is the Solr log capture:
> >> >>>> >>>>
> >> >>>> >>>> ******
> >> >>>> >>>>
> >> >>>> >>>> hl.fragsize=100&spellcheck=true&spellcheck.q=XXXXX&group.limit=5&hl.simple.pre=<b>&hl.fl=content&spellcheck.collate=true&wt=javabin&hl=true&rows=20&version=2&fl=score,approved,domain,host,id,lang,mimetype,title,tstamp,url,category&hl.snippets=3&start=0&q=%2BXXXX+-"XXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXXX"+-XXX+-"XXXXX"+-XXXX+-XXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXX+-"XXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXXX"+-"XXXXXX"+-XXXXXX+-XXXXX+-"XXXXX"+"XXXXX"+"XXXXX"+"XXXXXX"++&group.field=host&hl.simple.post=</b>&group=true&qt=search&fq=mrank:[0+TO+100]&fq=word_count:[70+TO+*]
> >> >>>> >>>> ******
> >> >>>> >>>>
> >> >>>> >>>> XXXX stands for the words. Every phrase "xxxxx" has two words inside.
> >> >>>> >>>>
> >> >>>> >>>> The timing from the DebugQuery:
> >> >>>> >>>>
> >> >>>> >>>> <lst name="timing">
> >> >>>> >>>> <double name="time">8654.0</double>
> >> >>>> >>>> <lst name="prepare">
> >> >>>> >>>> <double name="time">16.0</double>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >>>> >>>> <double name="time">16.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="process">
> >> >>>> >>>> <double name="time">8638.0</double>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >>>> >>>> <double name="time">4473.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >>>> >>>> <double name="time">42.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >>>> >>>> <double name="time">1.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >>>> >>>> <double name="time">4122.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>>
> >> >>>> >>>> The funny thing is that if I remove the ShingleFilter from the
> >> >>>> >>>> "sh_text" field below and index normally, the query time is half of the
> >> >>>> >>>> current shingle one! Shouldn't a shingled index be better for such heavy
> >> >>>> >>>> two-word phrase searches? I am confused.
> >> >>>> >>>>
> >> >>>> >>>> On the other hand, one of the off-the-shelf big FAT companies' search
> >> >>>> >>>> engines does the same query on the same machine in 0.7 / 0.8 secs without cache.
> >> >>>> >>>> I am confident we can do better in Solr but couldn't find the way at
> >> >>>> >>>> the moment.
> >> >>>> >>>>
> >> >>>> >>>> Thanks for helping.
> >> >>>> >>>>
> >> >>>> >>>> On Sat, Aug 27, 2011 at 2:46 AM, Erik Hatcher <erik.hatc...@gmail.com>
> >> >>>> >>>> wrote:
> >> >>>> >>>>
> >> >>>> >>>>>
> >> >>>> >>>>> On Aug 26, 2011, at 17:49 , Lord Khan Han wrote:
> >> >>>> >>>>>> We are indexing news documents from various sites. Currently we have
> >> >>>> >>>>>> 200K docs indexed. Total index size is 36 gig. There are also attachments to
> >> >>>> >>>>>> the news (pdf, docs etc.), so document size could be high (i.e. 10 MB).
> >> >>>> >>>>>>
> >> >>>> >>>>>> We are using some complex queries which include around 30 - 40 terms per
> >> >>>> >>>>>> query. 70% of these terms are two-word phrases. We are using them
> >> >>>> >>>>>> with conjunction + and - to pinpoint exact results.
> >> >>>> >>>>>> There is also grouping, dismax and boosting, Termvector HL.
> >> >>>> >>>>>
> >> >>>> >>>>> You're using a lot of componentry there, and have complex queries. We
> >> >>>> >>>>> need more details.
> >> >>>> >>>>>
> >> >>>> >>>>> Turn on debugQuery=true... what do the timings say for each component?
> >> >>>> >>>>>
> >> >>>> >>>>>> Our problem is query times. Currently it's around 6-7 secs. I know our query
> >> >>>> >>>>>> is a little bit heavy but we want to improve query performance. I believe we
> >> >>>> >>>>>> can make it sub-second but no success at the moment.
> >> >>>> >>>>>
> >> >>>> >>>>> Please provide an example query or two (perhaps a full line logged from
> >> >>>> >>>>> Solr itself), and then let's see what debugQuery says about your query being
> >> >>>> >>>>> parsed.
> >> >>>> >>>>>
> >> >>>> >>>>>> We tried to use shingle 2-word tokens; it decreases the query performance!! We
> >> >>>> >>>>>> assumed it would help speed up phrase searches.
> >> >>>> >>>>>
> >> >>>> >>>>> Again, we'd need to see a parsed query to understand this deeper.
> >> >>>> >>>>>
> >> >>>> >>>>> Lots of synonym expansion? A parsed query will tell us.
> >> >>>> >>>>>
> >> >>>> >>>>>> (using Solr latest trunk and HW is pretty good, 32 cores with 32 gig RAM)
> >> >>>> >>>>>>
> >> >>>> >>>>>> Here is the field def:
> >> >>>> >>>>>>
> >> >>>> >>>>>> <fieldType name="sh_text" class="solr.TextField" positionIncrementGap="100"
> >> >>>> >>>>>> autoGeneratePhraseQueries="true">
> >> >>>> >>>>>> <analyzer type="index">
> >> >>>> >>>>>> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >>>> >>>>>> <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" />
> >> >>>> >>>>>> <filter class="solr.WordDelimiterFilterFactory"
> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >> >>>> >>>>>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >> >>>> >>>>>> <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >>>> >>>>>> <filter class="solr.KeywordMarkerFilterFactory"
> >> >>>> >>>>>> protected="protwords.txt"/>
> >> >>>> >>>>>> <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >>>> >>>>>> outputUnigrams="true"/>
> >> >>>> >>>>>> </analyzer>
> >> >>>> >>>>>> <analyzer type="query">
> >> >>>> >>>>>> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >>>> >>>>>> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>>> >>>>>> ignoreCase="true" expand="true"/>
> >> >>>> >>>>>> <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" />
> >> >>>> >>>>>> <filter class="solr.WordDelimiterFilterFactory"
> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> >> >>>> >>>>>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >> >>>> >>>>>> <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >>>> >>>>>> <filter class="solr.KeywordMarkerFilterFactory"
> >> >>>> >>>>>> protected="protwords.txt"/>
> >> >>>> >>>>>> <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >>>> >>>>>> outputUnigrams="true"/>
> >> >>>> >>>>>> </analyzer>
> >> >>>> >>>>>> </fieldType>
> >> >>>> >>>>>>
> >> >>>> >>>>>> and
> >> >>>> >>>>>>
> >> >>>> >>>>>> <field name="content" type="sh_text" stored="true" indexed="true"
> >> >>>> >>>>>> termVectors="true" termPositions="true" termOffsets="true"/>
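
A minimal sketch, in Python, of how the per-component timings in the debug output at the top of this message could be summarized. The XML literal is copied from that output, with zero-valued components left out and the closing tags (cut off in the paste) added so that it parses. The function names are made up for illustration and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Timing excerpt from the debug output at the top of this message.
# Assumption: zero-valued components are omitted and the closing </lst>
# tags, missing from the pasted output, are added so the fragment parses.
TIMING_XML = """
<lst name="timing">
  <double name="time">8584.0</double>
  <lst name="prepare">
    <double name="time">12.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">12.0</double>
    </lst>
  </lst>
  <lst name="process">
    <double name="time">8572.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">4480.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
      <double name="time">41.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
      <double name="time">4051.0</double>
    </lst>
  </lst>
</lst>
""".strip()


def phase_total(timing_xml, phase):
    """Return the total milliseconds reported for one phase (hypothetical helper)."""
    root = ET.fromstring(timing_xml)
    return float(root.find(f"./lst[@name='{phase}']/double").text)


def component_times(timing_xml, phase):
    """Return {short component name: milliseconds} for one phase (hypothetical helper)."""
    root = ET.fromstring(timing_xml)
    phase_node = root.find(f"./lst[@name='{phase}']")
    times = {}
    for comp in phase_node.findall("lst"):
        # Keep only the class name, e.g. "QueryComponent".
        short_name = comp.get("name").rsplit(".", 1)[-1]
        times[short_name] = float(comp.find("double").text)
    return times


if __name__ == "__main__":
    total = phase_total(TIMING_XML, "process")
    by_time = sorted(component_times(TIMING_XML, "process").items(),
                     key=lambda item: -item[1])
    for name, ms in by_time:
        print(f"{name:20s} {ms:8.1f} ms  {100 * ms / total:5.1f}%")
```

On these numbers QueryComponent accounts for about 52% of the process phase and DebugComponent for about 47%, so it may be worth re-checking QTime with debugging switched off.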