Below is the debug output. I am measuring pure Solr query time, as shown in
the QTime field of the Solr XML response.

<arr name="parsed_filter_queries">
<str>mrank:[0 TO 100]</str>
</arr>
<lst name="timing">
<double name="time">8584.0</double>
<lst name="prepare">
<double name="time">12.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">12.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">8572.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">4480.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">41.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">4051.0</double>
</lst>
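
One rough way to read the timings above is to compute each component's share of the process phase. The following is only a sketch, assuming Python 3 and its standard-library xml.etree.ElementTree. TIMING_XML is a trimmed copy of the output above (zero-time components dropped, closing tags added so it parses), and component_times is a hypothetical helper name, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the timing section above; closing tags added so it parses.
TIMING_XML = """
<lst name="timing">
  <double name="time">8584.0</double>
  <lst name="prepare">
    <double name="time">12.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">12.0</double>
    </lst>
  </lst>
  <lst name="process">
    <double name="time">8572.0</double>
    <lst name="org.apache.solr.handler.component.QueryComponent">
      <double name="time">4480.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.HighlightComponent">
      <double name="time">41.0</double>
    </lst>
    <lst name="org.apache.solr.handler.component.DebugComponent">
      <double name="time">4051.0</double>
    </lst>
  </lst>
</lst>
"""


def component_times(timing_xml, phase):
    """Return {short component name: milliseconds} for one phase (hypothetical helper)."""
    root = ET.fromstring(timing_xml)
    phase_node = root.find(f"lst[@name='{phase}']")
    times = {}
    for comp in phase_node.findall("lst"):
        # Keep only the class name, e.g. "QueryComponent".
        short_name = comp.get("name").rsplit(".", 1)[-1]
        times[short_name] = float(comp.find("double[@name='time']").text)
    return times


process = component_times(TIMING_XML, "process")
total = sum(process.values())

# Print components from slowest to fastest with their share of the phase.
for name, ms in sorted(process.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {ms:8.1f} ms  {100 * ms / total:5.1f}%")

# Time spent in everything except the debug component itself.
without_debug = total - process["DebugComponent"]
print(f"process without DebugComponent: {without_debug:.1f} ms")
```

On these numbers, DebugComponent accounts for 4051 of the 8572 ms spent in process, so it may be worth comparing against a run with debugging switched off.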

On Tue, Aug 30, 2011 at 5:38 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Can we see the output if you specify both
> &debugQuery=on&debug=true
>
> the debug=true will show the time taken up with various
> components, which is sometimes surprising...
>
> Second, we never asked the most basic question, what are
> you measuring? Is this the QTime of the returned response?
> (which is the time actually spent searching) or the time until
> the response gets back to the client, which may involve lots besides
> searching...
>
> Best
> Erick
>
> On Tue, Aug 30, 2011 at 7:59 AM, Lord Khan Han <khanuniver...@gmail.com>
> wrote:
> > Hi Eric,
> >
> > Fields are lazily loaded, content is stored in Solr, and the machine has
> > 32 GB of RAM. Solr has a 20 GB heap. There is no swapping.
> >
> > As you can see, we have many phrases in the same query. I couldn't find a
> > way to drop QTime below one second. Surprisingly, the non-shingled test has
> > better QTime!
> >
> >
> > On Mon, Aug 29, 2011 at 3:10 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Oh, one other thing: have you profiled your machine
> >> to see if you're swapping? How much memory are
> >> you giving your JVM? What is the underlying
> >> hardware setup?
> >>
> >> Best
> >> Erick
> >>
> >> On Mon, Aug 29, 2011 at 8:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> > 200K docs and 36G index? It sounds like you're storing
> >> > your documents in the Solr index. In and of itself, that
> >> > shouldn't hurt your query times, *unless* you have
> >> > lazy field loading turned off, have you checked that
> >> > lazy field loading is enabled?
> >> >
> >> >
> >> >
> >> > Best
> >> > Erick
> >> >
> >> > On Sun, Aug 28, 2011 at 5:30 AM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >> Another interesting thing: all queries of one or more words, including
> >> >> phrase queries such as "barack obama", are slower in the shingle
> >> >> configuration. What am I doing wrong? Without shingles, "barack obama"
> >> >> has a query time of 300 ms; with shingles, 780 ms.
> >> >>
> >> >>
> >> >> On Sat, Aug 27, 2011 at 7:58 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> What is the difference between Solr 3.3 and trunk?
> >> >>> I will try 3.3 and let you know the results.
> >> >>>
> >> >>>
> >> >>> Here is the search handler:
> >> >>>
> >> >>> <requestHandler name="search" class="solr.SearchHandler" default="true">
> >> >>>      <lst name="defaults">
> >> >>>        <str name="echoParams">explicit</str>
> >> >>>        <int name="rows">10</int>
> >> >>>        <!--<str name="fq">category:vv</str>-->
> >> >>>  <str name="fq">mrank:[0 TO 100]</str>
> >> >>>        <str name="echoParams">explicit</str>
> >> >>>        <int name="rows">10</int>
> >> >>>  <str name="defType">edismax</str>
> >> >>>        <!--<str name="qf">title^0.05 url^1.2 content^1.7
> >> >>> m_title^10.0</str>-->
> >> >>> <str name="qf">title^1.05 url^1.2 content^1.7 m_title^10.0</str>
> >> >>>  <!-- <str name="bf">recip(ee_score,-0.85,1,0.2)</str> -->
> >> >>>  <str name="pf">content^18.0 m_title^5.0</str>
> >> >>>  <int name="ps">1</int>
> >> >>>  <int name="qs">0</int>
> >> >>>  <str name="mm">2&lt;-25%</str>
> >> >>>  <str name="spellcheck">true</str>
> >> >>>  <!--<str name="spellcheck.collate">true</str>   -->
> >> >>> <str name="spellcheck.count">5</str>
> >> >>>  <str name="spellcheck.dictionary">subobjective</str>
> >> >>> <str name="spellcheck.onlyMorePopular">false</str>
> >> >>>   <str name="hl.tag.pre">&lt;b&gt;</str>
> >> >>> <str name="hl.tag.post">&lt;/b&gt;</str>
> >> >>>  <str name="hl.useFastVectorHighlighter">true</str>
> >> >>>      </lst>
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Sat, Aug 27, 2011 at 5:31 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >>>
> >> >>>> I'm not sure what the issue could be at this point. I see you've got
> >> >>>> qt=search - what's the definition of that request handler?
> >> >>>>
> >> >>>> What is the parsed query (from the debugQuery response)?
> >> >>>>
> >> >>>> Have you tried this with Solr 3.3 to see if there's any appreciable
> >> >>>> difference?
> >> >>>>
> >> >>>>        Erik
> >> >>>>
> >> >>>> On Aug 27, 2011, at 09:34 , Lord Khan Han wrote:
> >> >>>>
> >> >>>> > With grouping off, the query time drops from 3567 ms to 1912 ms.
> >> >>>> > Grouping increases the query time and makes the cache useless. But
> >> >>>> > the same config is still faster without shingles.
> >> >>>> >
> >> >>>> > We have a head-to-head test this Wednesday against this commercial
> >> >>>> > search engine, so I am looking for all suggestions.
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> > On Sat, Aug 27, 2011 at 3:37 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >>>> >
> >> >>>> >> Please confirm if this is caused by grouping. Turn grouping off;
> >> >>>> >> what's the query time like?
> >> >>>> >>
> >> >>>> >>
> >> >>>> >> On Aug 27, 2011, at 07:27 , Lord Khan Han wrote:
> >> >>>> >>
> >> >>>> >>> On the other hand, we couldn't use the cache for the types of
> >> >>>> >>> queries below. I think it's caused by grouping. Anyway, we need
> >> >>>> >>> to be subsecond without the cache.
> >> >>>> >>>
> >> >>>> >>>
> >> >>>> >>>
> >> >>>> >>> On Sat, Aug 27, 2011 at 2:18 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >>>> >>>
> >> >>>> >>>> Hi,
> >> >>>> >>>>
> >> >>>> >>>> Thanks for the reply.
> >> >>>> >>>>
> >> >>>> >>>> Here is the Solr log capture:
> >> >>>> >>>>
> >> >>>> >>>> ******
> >> >>>> >>>>
> >> >>>> >>>>
> >> >>>> >>
> >> >>>>
> >>
> hl.fragsize=100&spellcheck=true&spellcheck.q=XXXXX&group.limit=5&hl.simple.pre=<b>&hl.fl=content&spellcheck.collate=true&wt=javabin&hl=true&rows=20&version=2&fl=score,approved,domain,host,id,lang,mimetype,title,tstamp,url,category&hl.snippets=3&start=0&q=%2BXXXX+-"XXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXXX"+-XXX+-"XXXXX"+-XXXX+-XXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXX+-"XXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXXX"+-"XXXXXX"+-XXXXXX+-XXXXX+-"XXXXX"+"XXXXX"+"XXXXX"+"XXXXXX"++&group.field=host&hl.simple.post=</b>&group=true&qt=search&fq=mrank:[0+TO+100]&fq=word_count:[70+TO+*]
> >> >>>> >>>> ******
> >> >>>> >>>>
> >> >>>> >>>> XXXX stands for the words. Every quoted phrase "XXXXX" contains two words.
> >> >>>> >>>>
> >> >>>> >>>> The timing from the DebugQuery:
> >> >>>> >>>>
> >> >>>> >>>> <lst name="timing">
> >> >>>> >>>> <double name="time">8654.0</double>
> >> >>>> >>>> <lst name="prepare">
> >> >>>> >>>> <double name="time">16.0</double>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >>>> >>>> <double name="time">16.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="process">
> >> >>>> >>>> <double name="time">8638.0</double>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >>>> >>>> <double name="time">4473.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >>>> >>>> <double name="time">42.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >>>> >>>> <double name="time">0.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >>>> >>>> <double name="time">1.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >>>> >>>> <double name="time">4122.0</double>
> >> >>>> >>>> </lst>
> >> >>>> >>>>
> >> >>>> >>>>
> >> >>>> >>>> The funny thing is that if I remove the ShingleFilter from the
> >> >>>> >>>> "sh_text" field below and index normally, the query time is half
> >> >>>> >>>> that of the current shingled one! Shouldn't a shingled index be
> >> >>>> >>>> better for such heavy two-word phrase searches? I am confused.
> >> >>>> >>>>
> >> >>>> >>>> On the other hand, an off-the-shelf search engine from one of the
> >> >>>> >>>> big FAT companies runs the same query on the same machine in
> >> >>>> >>>> 0.7 to 0.8 secs without a cache. I am confident we can do better
> >> >>>> >>>> in Solr but couldn't find the way at the moment.
> >> >>>> >>>>
> >> >>>> >>>> thanks for helping..
> >> >>>> >>>>
> >> >>>> >>>>
> >> >>>> >>>>
> >> >>>> >>>>
> >> >>>> >>>> On Sat, Aug 27, 2011 at 2:46 AM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >>>> >>>>
> >> >>>> >>>>>
> >> >>>> >>>>> On Aug 26, 2011, at 17:49 , Lord Khan Han wrote:
> >> >>>> >>>>>> We are indexing news documents from various sites. Currently we
> >> >>>> >>>>>> have 200K docs indexed. Total index size is 36 GB. There are
> >> >>>> >>>>>> also attachments to the news (PDFs, docs, etc.), so document
> >> >>>> >>>>>> size can be high (e.g. 10 MB).
> >> >>>> >>>>>>
> >> >>>> >>>>>> We are using some complex queries that include around 30 to 40
> >> >>>> >>>>>> terms per query. 70% of these terms are two-word phrases. We
> >> >>>> >>>>>> combine them with + and - to pinpoint the exact result.
> >> >>>> >>>>>> There is also grouping, dismax, boosting, and term vector
> >> >>>> >>>>>> highlighting.
> >> >>>> >>>>>
> >> >>>> >>>>> You're using a lot of componentry there, and have complex
> >> >>>> >>>>> queries. We need more details.
> >> >>>> >>>>>
> >> >>>> >>>>> Turn on debugQuery=true... what do the timings say for each component?
> >> >>>> >>>>>
> >> >>>> >>>>>> Our problem is query times. Currently they are around 6-7 secs.
> >> >>>> >>>>>> I know our query is a little heavy, but we want to improve
> >> >>>> >>>>>> query performance. I believe we can make it subsecond, but no
> >> >>>> >>>>>> success at the moment.
> >> >>>> >>>>>
> >> >>>> >>>>> Please provide an example query or two (perhaps a full line
> >> >>>> >>>>> logged from Solr itself), and then let's see what debugQuery
> >> >>>> >>>>> says about your query being parsed.
> >> >>>> >>>>>
> >> >>>> >>>>>> We tried using two-word shingle tokens, but that decreases
> >> >>>> >>>>>> query performance! We assumed it would help speed up phrase
> >> >>>> >>>>>> searches.
> >> >>>> >>>>>
> >> >>>> >>>>> Again, we'd need to see a parsed query to understand this deeper.
> >> >>>> >>>>>
> >> >>>> >>>>> Lots of synonym expansion?  A parsed query will tell us.
> >> >>>> >>>>>
> >> >>>> >>>>>
> >> >>>> >>>>>
> >> >>>> >>>>>> (using the latest Solr trunk; the hardware is pretty good: 32
> >> >>>> >>>>>> cores with 32 GB of RAM)
> >> >>>> >>>>>>
> >> >>>> >>>>>> Here is the field definition:
> >> >>>> >>>>>>
> >> >>>> >>>>>> <fieldType name="sh_text" class="solr.TextField" positionIncrementGap="100"
> >> >>>> >>>>>> autoGeneratePhraseQueries="true">
> >> >>>> >>>>>>    <analyzer type="index">
> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" />
> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >> >>>> >>>>>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
> >> >>>> >>>>>> protected="protwords.txt"/>
> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >>>> >>>>>> outputUnigrams="true"/>
> >> >>>> >>>>>>    </analyzer>
> >> >>>> >>>>>>    <analyzer type="query">
> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >>>> >>>>>>      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >>>> >>>>>> ignoreCase="true" expand="true"/>
> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >>>> >>>>>> words="stopwords.txt" enablePositionIncrements="true" />
> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
> >> >>>> >>>>>> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> >> >>>> >>>>>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
> >> >>>> >>>>>> protected="protwords.txt"/>
> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >>>> >>>>>> outputUnigrams="true"/>
> >> >>>> >>>>>>    </analyzer>
> >> >>>> >>>>>>  </fieldType>
> >> >>>> >>>>>>
> >> >>>> >>>>>> and
> >> >>>> >>>>>>
> >> >>>> >>>>>> <field name="content" type="sh_text" stored="true" indexed="true"
> >> >>>> >>>>>> termVectors="true" termPositions="true" termOffsets="true"/>
> >> >>>> >>>>>
> >> >>>> >>>>>
> >> >>>> >>>>
> >> >>>> >>
> >> >>>> >>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >
>
