OK, I'll have to defer because this makes no sense. 4+ seconds in the debug component?
Sorry I can't be more help here, but nothing really jumps out.

Erick

On Tue, Aug 30, 2011 at 12:45 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> Below is the output of the debug. I am measuring the pure Solr QTime, which is
> shown in the QTime field of the Solr XML.
>
> <arr name="parsed_filter_queries">
> <str>mrank:[0 TO 100]</str>
> </arr>
> <lst name="timing">
> <double name="time">8584.0</double>
> <lst name="prepare">
> <double name="time">12.0</double>
> <lst name="org.apache.solr.handler.component.QueryComponent">
> <double name="time">12.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.FacetComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.HighlightComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.StatsComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.DebugComponent">
> <double name="time">0.0</double>
> </lst>
> </lst>
> <lst name="process">
> <double name="time">8572.0</double>
> <lst name="org.apache.solr.handler.component.QueryComponent">
> <double name="time">4480.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.FacetComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.HighlightComponent">
> <double name="time">41.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.StatsComponent">
> <double name="time">0.0</double>
> </lst>
> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> <double name="time">0.0</double>
> </lst>
> <lst
name="org.apache.solr.handler.component.DebugComponent"> > <double name="time">4051.0</double> > </lst> > > On Tue, Aug 30, 2011 at 5:38 PM, Erick Erickson > <erickerick...@gmail.com>wrote: > >> Can we see the output if you specify both >> &debugQuery=on&debug=true >> >> the debug=true will show the time taken up with various >> components, which is sometimes surprising... >> >> Second, we never asked the most basic question, what are >> you measuring? Is this the QTime of the returned response? >> (which is the time actually spent searching) or the time until >> the response gets back to the client, which may involve lots besides >> searching... >> >> Best >> Erick >> >> On Tue, Aug 30, 2011 at 7:59 AM, Lord Khan Han <khanuniver...@gmail.com> >> wrote: >> > Hi Eric, >> > >> > Fields are lazy loading, content stored in solr and machine 32 gig.. solr >> > has 20 gig heap. There is no swapping. >> > >> > As you see we have many phrases in the same query . I couldnt find a way >> to >> > drop qtime to subsecends. Suprisingly non shingled test better qtime ! >> > >> > >> > On Mon, Aug 29, 2011 at 3:10 PM, Erick Erickson <erickerick...@gmail.com >> >wrote: >> > >> >> Oh, one other thing: have you profiled your machine >> >> to see if you're swapping? How much memory are >> >> you giving your JVM? What is the underlying >> >> hardware setup? >> >> >> >> Best >> >> Erick >> >> >> >> On Mon, Aug 29, 2011 at 8:09 AM, Erick Erickson < >> erickerick...@gmail.com> >> >> wrote: >> >> > 200K docs and 36G index? It sounds like you're storing >> >> > your documents in the Solr index. In and of itself, that >> >> > shouldn't hurt your query times, *unless* you have >> >> > lazy field loading turned off, have you checked that >> >> > lazy field loading is enabled? 
>> >> > >> >> > >> >> > >> >> > Best >> >> > Erick >> >> > >> >> > On Sun, Aug 28, 2011 at 5:30 AM, Lord Khan Han < >> khanuniver...@gmail.com> >> >> wrote: >> >> >> Another insteresting thing is : all one word or more word queries >> >> including >> >> >> phrase queries such as "barack obama" slower in shingle >> configuration. >> >> What >> >> >> i am doing wrong ? without shingle "barack obama" Querytime 300ms >> with >> >> >> shingle 780 ms.. >> >> >> >> >> >> >> >> >> On Sat, Aug 27, 2011 at 7:58 PM, Lord Khan Han < >> khanuniver...@gmail.com >> >> >wrote: >> >> >> >> >> >>> Hi, >> >> >>> >> >> >>> What is the difference between solr 3.3 and the trunk ? >> >> >>> I will try 3.3 and let you know the results. >> >> >>> >> >> >>> >> >> >>> Here the search handler: >> >> >>> >> >> >>> <requestHandler name="search" class="solr.SearchHandler" >> >> default="true"> >> >> >>> <lst name="defaults"> >> >> >>> <str name="echoParams">explicit</str> >> >> >>> <int name="rows">10</int> >> >> >>> <!--<str name="fq">category:vv</str>--> >> >> >>> <str name="fq">mrank:[0 TO 100]</str> >> >> >>> <str name="echoParams">explicit</str> >> >> >>> <int name="rows">10</int> >> >> >>> <str name="defType">edismax</str> >> >> >>> <!--<str name="qf">title^0.05 url^1.2 content^1.7 >> >> >>> m_title^10.0</str>--> >> >> >>> <str name="qf">title^1.05 url^1.2 content^1.7 m_title^10.0</str> >> >> >>> <!-- <str name="bf">recip(ee_score,-0.85,1,0.2)</str> --> >> >> >>> <str name="pf">content^18.0 m_title^5.0</str> >> >> >>> <int name="ps">1</int> >> >> >>> <int name="qs">0</int> >> >> >>> <str name="mm">2<-25%</str> >> >> >>> <str name="spellcheck">true</str> >> >> >>> <!--<str name="spellcheck.collate">true</str> --> >> >> >>> <str name="spellcheck.count">5</str> >> >> >>> <str name="spellcheck.dictionary">subobjective</str> >> >> >>> <str name="spellcheck.onlyMorePopular">false</str> >> >> >>> <str name="hl.tag.pre"><b></str> >> >> >>> <str name="hl.tag.post"></b></str> >> >> >>> <str 
name="hl.useFastVectorHighlighter">true</str> >> >> >>> </lst> >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> On Sat, Aug 27, 2011 at 5:31 PM, Erik Hatcher < >> erik.hatc...@gmail.com >> >> >wrote: >> >> >>> >> >> >>>> I'm not sure what the issue could be at this point. I see you've >> got >> >> >>>> qt=search - what's the definition of that request handler? >> >> >>>> >> >> >>>> What is the parsed query (from the debugQuery response)? >> >> >>>> >> >> >>>> Have you tried this with Solr 3.3 to see if there's any appreciable >> >> >>>> difference? >> >> >>>> >> >> >>>> Erik >> >> >>>> >> >> >>>> On Aug 27, 2011, at 09:34 , Lord Khan Han wrote: >> >> >>>> >> >> >>>> > When grouping off the query time ie 3567 ms to 1912 ms . >> Grouping >> >> >>>> > increasing the query time and make useless to cache. But same >> config >> >> >>>> faster >> >> >>>> > without shingle still. >> >> >>>> > >> >> >>>> > We have and head to head test this wednesday tihs commercial >> search >> >> >>>> engine. >> >> >>>> > So I am looking for all suggestions. >> >> >>>> > >> >> >>>> > >> >> >>>> > >> >> >>>> > On Sat, Aug 27, 2011 at 3:37 PM, Erik Hatcher < >> >> erik.hatc...@gmail.com >> >> >>>> >wrote: >> >> >>>> > >> >> >>>> >> Please confirm is this is caused by grouping. Turn grouping >> off, >> >> >>>> what's >> >> >>>> >> query time like? >> >> >>>> >> >> >> >>>> >> >> >> >>>> >> On Aug 27, 2011, at 07:27 , Lord Khan Han wrote: >> >> >>>> >> >> >> >>>> >>> On the other hand We couldnt use the cache for below types >> >> queries. I >> >> >>>> >> think >> >> >>>> >>> its caused from grouping. Anyway we need to be sub second >> without >> >> >>>> cache. >> >> >>>> >>> >> >> >>>> >>> >> >> >>>> >>> >> >> >>>> >>> On Sat, Aug 27, 2011 at 2:18 PM, Lord Khan Han < >> >> >>>> khanuniver...@gmail.com >> >> >>>> >>> wrote: >> >> >>>> >>> >> >> >>>> >>>> Hi, >> >> >>>> >>>> >> >> >>>> >>>> Thanks for the reply. 
>> >> >>>> >>>> >> >> >>>> >>>> Here the solr log capture.: >> >> >>>> >>>> >> >> >>>> >>>> ****** >> >> >>>> >>>> >> >> >>>> >>>> >> >> >>>> >> >> >> >>>> >> >> >> hl.fragsize=100&spellcheck=true&spellcheck.q=XXXXX&group.limit=5&hl.simple.pre=<b>&hl.fl=content&spellcheck.collate=true&wt=javabin&hl=true&rows=20&version=2&fl=score,approved,domain,host,id,lang,mimetype,title,tstamp,url,category&hl.snippets=3&start=0&q=%2BXXXX+-"XXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXXX"+-XXX+-"XXXXX"+-XXXX+-XXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXX+-"XXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXXX"+-"XXXXXX"+-XXXXXX+-XXXXX+-"XXXXX"+"XXXXX"+"XXXXX"+"XXXXXX"++&group.field=host&hl.simple.post=</b>&group=true&qt=search&fq=mrank:[0+TO+100]&fq=word_count:[70+TO+*] >> >> >>>> >>>> ****** >> >> >>>> >>>> >> >> >>>> >>>> XXXX is the words. All phrases "xxxxx" has two words inside. >> >> >>>> >>>> >> >> >>>> >>>> The timing from the DebugQuery: >> >> >>>> >>>> >> >> >>>> >>>> <lst name="timing"> >> >> >>>> >>>> <double name="time">8654.0</double> >> >> >>>> >>>> <lst name="prepare"> >> >> >>>> >>>> <double name="time">16.0</double> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent"> >> >> >>>> >>>> <double name="time">16.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> >> name="org.apache.solr.handler.component.MoreLikeThisComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> name="org.apache.solr.handler.component.HighlightComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent"> >> >> >>>> >>>> <double name="time">0.0</double> 
>> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> >> name="org.apache.solr.handler.component.SpellCheckComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="process"> >> >> >>>> >>>> <double name="time">8638.0</double> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent"> >> >> >>>> >>>> <double name="time">4473.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> >> name="org.apache.solr.handler.component.MoreLikeThisComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> name="org.apache.solr.handler.component.HighlightComponent"> >> >> >>>> >>>> <double name="time">42.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent"> >> >> >>>> >>>> <double name="time">0.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst >> >> name="org.apache.solr.handler.component.SpellCheckComponent"> >> >> >>>> >>>> <double name="time">1.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent"> >> >> >>>> >>>> <double name="time">4122.0</double> >> >> >>>> >>>> </lst> >> >> >>>> >>>> >> >> >>>> >>>> >> >> >>>> >>>> The funny thing is if I removed the ShingleFilter from the >> below >> >> >>>> >> "sh_text" >> >> >>>> >>>> field and index normally the query time is half of the >> current >> >> >>>> shingle >> >> >>>> >> one >> >> >>>> >>>> !. Shouldn't be shingled index better for such heavy 2 word >> >> phrases >> >> >>>> >> search >> >> >>>> >>>> ? I am confused. 
>> >> >>>> >>>> >> >> >>>> >>>> On the other hand One of the on the shelf big FAT companies >> >> search >> >> >>>> >> engine >> >> >>>> >>>> doing the same query same machine 0.7 / 0.8 secs without cache >> . >> >> I am >> >> >>>> >>>> confident we can do better in solr but couldnt find the way at >> >> the >> >> >>>> >> moment. >> >> >>>> >>>> >> >> >>>> >>>> thanks for helping.. >> >> >>>> >>>> >> >> >>>> >>>> >> >> >>>> >>>> >> >> >>>> >>>> >> >> >>>> >>>> On Sat, Aug 27, 2011 at 2:46 AM, Erik Hatcher < >> >> >>>> erik.hatc...@gmail.com >> >> >>>> >>> wrote: >> >> >>>> >>>> >> >> >>>> >>>>> >> >> >>>> >>>>> On Aug 26, 2011, at 17:49 , Lord Khan Han wrote: >> >> >>>> >>>>>> We are indexing news document from the various sites. >> >> Currently we >> >> >>>> >> have >> >> >>>> >>>>>> 200K docs indexed. Total index size is 36 gig. There is >> also >> >> >>>> >>>>> attachement to >> >> >>>> >>>>>> the news (pdf -docs etc) So document size could be high (ie >> >> 10mb). >> >> >>>> >>>>>> >> >> >>>> >>>>>> We are using some complex queries which includes around 30 - >> 40 >> >> >>>> terms >> >> >>>> >>>>> per >> >> >>>> >>>>>> query. %70 of this terms is two word phrases. We are using >> >> >>>> >>>>>> with conjunction + and - to pinpoint exact result. >> >> >>>> >>>>>> There is also grouping, dismax and boosting , Termvector HL >> . >> >> >>>> >>>>> >> >> >>>> >>>>> You're using a lot of componentry there, and have complex >> >> queries. >> >> >>>> We >> >> >>>> >>>>> need more details. >> >> >>>> >>>>> >> >> >>>> >>>>> Turn on debugQuery=true... what do the timings say for each >> >> >>>> component? >> >> >>>> >>>>> >> >> >>>> >>>>>> Our problem is query times. Currently its around 6-7 secs. I >> >> know >> >> >>>> our >> >> >>>> >>>>> query >> >> >>>> >>>>>> is little bit heavy but we want to improve query >> performance. I >> >> >>>> >> believe >> >> >>>> >>>>> we >> >> >>>> >>>>>> can make it sub second but no succes at the moment. 
>> >> >>>> >>>>> >> >> >>>> >>>>> Please provide an example query or two (perhaps a full line >> >> logged >> >> >>>> from >> >> >>>> >>>>> Solr itself), and then let's see what debugQuery says about >> your >> >> >>>> query >> >> >>>> >> being >> >> >>>> >>>>> parsed. >> >> >>>> >>>>> >> >> >>>> >>>>>> We tried to use shingle 2 word token it decreases the query >> >> >>>> performcen >> >> >>>> >>>>> !! We >> >> >>>> >>>>>> assumed it will help the speed up phrases search.. >> >> >>>> >>>>> >> >> >>>> >>>>> Again, we'd need to see a parsed query to understand this >> >> deeper. >> >> >>>> >>>>> >> >> >>>> >>>>> Lots of synonym expansion? A parsed query will tell us. >> >> >>>> >>>>> >> >> >>>> >>>>> >> >> >>>> >>>>> >> >> >>>> >>>>>> (using solr latest trunk and HW is pretty good, 32 core >> with >> >> 32 >> >> >>>> gig >> >> >>>> >>>>> ram) >> >> >>>> >>>>>> >> >> >>>> >>>>>> Here the field def: >> >> >>>> >>>>>> >> >> >>>> >>>>>> <fieldType name="sh_text" class="solr.TextField" >> >> >>>> >>>>> positionIncrementGap="100" >> >> >>>> >>>>>> autoGeneratePhraseQueries="true"> >> >> >>>> >>>>>> <analyzer type="index"> >> >> >>>> >>>>>> <tokenizer class="solr.WhitespaceTokenizerFactory"/> >> >> >>>> >>>>>> <filter class="solr.StopFilterFactory" >> ignoreCase="true" >> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" /> >> >> >>>> >>>>>> <filter class="solr.WordDelimiterFilterFactory" >> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" >> catenateWords="1" >> >> >>>> >>>>>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> >> >> >>>> >>>>>> <!--<filter class="solr.LowerCaseFilterFactory"/>--> >> >> >>>> >>>>>> <filter class="solr.KeywordMarkerFilterFactory" >> >> >>>> >>>>>> protected="protwords.txt"/> >> >> >>>> >>>>>> <filter class="solr.ShingleFilterFactory" >> >> maxShingleSize="2" >> >> >>>> >>>>>> outputUnigrams="true"/> >> >> >>>> >>>>>> </analyzer> >> >> >>>> >>>>>> <analyzer type="query"> >> >> >>>> >>>>>> <tokenizer 
class="solr.WhitespaceTokenizerFactory"/> >> >> >>>> >>>>>> <filter class="solr.SynonymFilterFactory" >> >> >>>> >> synonyms="synonyms.txt" >> >> >>>> >>>>>> ignoreCase="true" expand="true"/> >> >> >>>> >>>>>> <filter class="solr.StopFilterFactory" >> ignoreCase="true" >> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" /> >> >> >>>> >>>>>> <filter class="solr.WordDelimiterFilterFactory" >> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" >> catenateWords="0" >> >> >>>> >>>>>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> >> >> >>>> >>>>>> <!--<filter class="solr.LowerCaseFilterFactory"/>--> >> >> >>>> >>>>>> <filter class="solr.KeywordMarkerFilterFactory" >> >> >>>> >>>>>> protected="protwords.txt"/> >> >> >>>> >>>>>> <filter class="solr.ShingleFilterFactory" >> >> maxShingleSize="2" >> >> >>>> >>>>>> outputUnigrams="true"/> >> >> >>>> >>>>>> </analyzer> >> >> >>>> >>>>>> </fieldType> >> >> >>>> >>>>>> >> >> >>>> >>>>>> and >> >> >>>> >>>>>> >> >> >>>> >>>>>> <field name="content" type="sh_text" stored="true" >> >> indexed="true" >> >> >>>> >>>>>> termVectors="true" termPositions="true" termOffsets="true"/> >> >> >>>> >>>>> >> >> >>>> >>>>> >> >> >>>> >>>> >> >> >>>> >> >> >> >>>> >> >> >> >>>> >> >> >>>> >> >> >>> >> >> >> >> >> > >> >> >> > >> >
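
For anyone following along with the timing output quoted at the top of this thread, here is a minimal sketch of how the per-component "process" times could be broken down to see where the 8.5 seconds go. The sample XML uses the numbers from the Aug 30 debug output, with the zero-time components left out and the closing tags (missing from the pasted fragment) restored so that it parses. The helper name `component_times` is hypothetical and is not part of Solr.

```python
import xml.etree.ElementTree as ET

# Sample data: the non-zero "process" timings quoted in this thread.
# Closing tags were added here so the fragment is well-formed XML.
TIMING_XML = """
<lst name="timing">
  <double name="time">8584.0</double>
  <lst name="process">
    <double name="time">8572.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">4480.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
      <double name="time">41.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
      <double name="time">4051.0</double>
    </lst>
  </lst>
</lst>
"""


def component_times(timing_xml, phase="process"):
    """Return (phase_total_ms, {short component name: ms}) for one phase.

    Hypothetical helper for illustration; it only reads the XML layout
    shown in the debug output above.
    """
    root = ET.fromstring(timing_xml)
    phase_node = root.find(f"./lst[@name='{phase}']")
    total = float(phase_node.find("./double[@name='time']").text)
    times = {}
    for comp in phase_node.findall("./lst"):
        # Keep only the class name, e.g. "QueryComponent".
        short_name = comp.get("name").rsplit(".", 1)[-1]
        times[short_name] = float(comp.find("./double[@name='time']").text)
    return total, times


if __name__ == "__main__":
    total, times = component_times(TIMING_XML)
    # Print components from slowest to fastest with their share of the phase.
    for name, ms in sorted(times.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {ms:8.1f} ms  {100 * ms / total:5.1f}%")
```

With these numbers the three non-zero components add up to the reported process total (4480 + 41 + 4051 = 8572), so QueryComponent and DebugComponent account for nearly all of the time between them.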