Thanks, Erick. If I figure something out I will let you know as well. Nobody
replied except you; I thought there would be more people involved here.

Thanks


On Wed, Aug 31, 2011 at 3:47 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> OK, I'll have to defer because this makes no sense.
> 4+ seconds in the debug component?
>
> Sorry I can't be more help here, but nothing really
> jumps out.
> Erick
>
> On Tue, Aug 30, 2011 at 12:45 PM, Lord Khan Han <khanuniver...@gmail.com>
> wrote:
> > Below is the debug output. I am measuring the pure Solr QTime, which is
> > shown in the QTime field of the Solr XML response.
> >
> > <arr name="parsed_filter_queries">
> > <str>mrank:[0 TO 100]</str>
> > </arr>
> > <lst name="timing">
> > <double name="time">8584.0</double>
> > <lst name="prepare">
> > <double name="time">12.0</double>
> > <lst name="org.apache.solr.handler.component.QueryComponent">
> > <double name="time">12.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.FacetComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.HighlightComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.StatsComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.DebugComponent">
> > <double name="time">0.0</double>
> > </lst>
> > </lst>
> > <lst name="process">
> > <double name="time">8572.0</double>
> > <lst name="org.apache.solr.handler.component.QueryComponent">
> > <double name="time">4480.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.FacetComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.HighlightComponent">
> > <double name="time">41.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.StatsComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> > <double name="time">0.0</double>
> > </lst>
> > <lst name="org.apache.solr.handler.component.DebugComponent">
> > <double name="time">4051.0</double>
> > </lst>
> >
> > On Tue, Aug 30, 2011 at 5:38 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Can we see the output if you specify both
> >> &debugQuery=on&debug=true
> >>
> >> the debug=true will show the time taken up with various
> >> components, which is sometimes surprising...
> >>
> >> Second, we never asked the most basic question, what are
> >> you measuring? Is this the QTime of the returned response?
> >> (which is the time actually spent searching) or the time until
> >> the response gets back to the client, which may involve lots besides
> >> searching...
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Aug 30, 2011 at 7:59 AM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> > Hi Erick,
> >> >
> >> > Fields are lazily loaded, the content is stored in Solr, and the machine
> >> > has 32 GB of RAM. Solr has a 20 GB heap. There is no swapping.
> >> >
> >> > As you can see, we have many phrases in the same query. I couldn't find a
> >> > way to drop the QTime to sub-second. Surprisingly, the non-shingled test
> >> > has a better QTime!
> >> >
> >> >
> >> > On Mon, Aug 29, 2011 at 3:10 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >
> >> >> Oh, one other thing: have you profiled your machine
> >> >> to see if you're swapping? How much memory are
> >> >> you giving your JVM? What is the underlying
> >> >> hardware setup?
> >> >>
> >> >> Best
> >> >> Erick
> >> >>
> >> >> On Mon, Aug 29, 2011 at 8:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >> > 200K docs and 36G index? It sounds like you're storing
> >> >> > your documents in the Solr index. In and of itself, that
> >> >> > shouldn't hurt your query times, *unless* you have
> >> >> > lazy field loading turned off, have you checked that
> >> >> > lazy field loading is enabled?
> >> >> >
> >> >> >
> >> >> >
> >> >> > Best
> >> >> > Erick
> >> >> >
> >> >> > On Sun, Aug 28, 2011 at 5:30 AM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >> >> Another interesting thing: all queries of one or more words, including
> >> >> >> phrase queries such as "barack obama", are slower in the shingle
> >> >> >> configuration. What am I doing wrong? Without shingles, "barack obama"
> >> >> >> has a query time of 300 ms; with shingles, 780 ms.
> >> >> >>
> >> >> >>
> >> >> >> On Sat, Aug 27, 2011 at 7:58 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >> >>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> What is the difference between Solr 3.3 and the trunk?
> >> >> >>> I will try 3.3 and let you know the results.
> >> >> >>>
> >> >> >>>
> >> >> >>> Here is the search handler:
> >> >> >>>
> >> >> >>> <requestHandler name="search" class="solr.SearchHandler" default="true">
> >> >> >>>      <lst name="defaults">
> >> >> >>>        <str name="echoParams">explicit</str>
> >> >> >>>        <int name="rows">10</int>
> >> >> >>>        <!--<str name="fq">category:vv</str>-->
> >> >> >>>        <str name="fq">mrank:[0 TO 100]</str>
> >> >> >>>        <str name="echoParams">explicit</str>
> >> >> >>>        <int name="rows">10</int>
> >> >> >>>        <str name="defType">edismax</str>
> >> >> >>>        <!--<str name="qf">title^0.05 url^1.2 content^1.7 m_title^10.0</str>-->
> >> >> >>>        <str name="qf">title^1.05 url^1.2 content^1.7 m_title^10.0</str>
> >> >> >>>        <!-- <str name="bf">recip(ee_score,-0.85,1,0.2)</str> -->
> >> >> >>>        <str name="pf">content^18.0 m_title^5.0</str>
> >> >> >>>        <int name="ps">1</int>
> >> >> >>>        <int name="qs">0</int>
> >> >> >>>        <str name="mm">2&lt;-25%</str>
> >> >> >>>        <str name="spellcheck">true</str>
> >> >> >>>        <!--<str name="spellcheck.collate">true</str>   -->
> >> >> >>>        <str name="spellcheck.count">5</str>
> >> >> >>>        <str name="spellcheck.dictionary">subobjective</str>
> >> >> >>>        <str name="spellcheck.onlyMorePopular">false</str>
> >> >> >>>        <str name="hl.tag.pre">&lt;b&gt;</str>
> >> >> >>>        <str name="hl.tag.post">&lt;/b&gt;</str>
> >> >> >>>        <str name="hl.useFastVectorHighlighter">true</str>
> >> >> >>>      </lst>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> On Sat, Aug 27, 2011 at 5:31 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >> >>>
> >> >> >>>> I'm not sure what the issue could be at this point. I see you've got
> >> >> >>>> qt=search - what's the definition of that request handler?
> >> >> >>>>
> >> >> >>>> What is the parsed query (from the debugQuery response)?
> >> >> >>>>
> >> >> >>>> Have you tried this with Solr 3.3 to see if there's any appreciable
> >> >> >>>> difference?
> >> >> >>>>
> >> >> >>>>        Erik
> >> >> >>>>
> >> >> >>>> On Aug 27, 2011, at 09:34 , Lord Khan Han wrote:
> >> >> >>>>
> >> >> >>>> > With grouping off, the query time drops from 3567 ms to 1912 ms.
> >> >> >>>> > Grouping increases the query time and makes the cache useless. But
> >> >> >>>> > the same config is still faster without shingles.
> >> >> >>>> >
> >> >> >>>> > We have a head-to-head test this Wednesday against this commercial
> >> >> >>>> > search engine, so I am looking for all suggestions.
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> > On Sat, Aug 27, 2011 at 3:37 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >> >>>> >
> >> >> >>>> >> Please confirm whether this is caused by grouping. With grouping
> >> >> >>>> >> turned off, what is the query time like?
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >> On Aug 27, 2011, at 07:27 , Lord Khan Han wrote:
> >> >> >>>> >>
> >> >> >>>> >>> On the other hand, we couldn't use the cache for queries of the
> >> >> >>>> >>> type below. I think that is caused by grouping. Anyway, we need
> >> >> >>>> >>> to be sub-second without the cache.
> >> >> >>>> >>>
> >> >> >>>> >>>
> >> >> >>>> >>>
> >> >> >>>> >>> On Sat, Aug 27, 2011 at 2:18 PM, Lord Khan Han <khanuniver...@gmail.com> wrote:
> >> >> >>>> >>>
> >> >> >>>> >>>> Hi,
> >> >> >>>> >>>>
> >> >> >>>> >>>> Thanks for the reply.
> >> >> >>>> >>>>
> >> >> >>>> >>>> Here is the Solr log capture:
> >> >> >>>> >>>>
> >> >> >>>> >>>> ******
> >> >> >>>> >>>> hl.fragsize=100&spellcheck=true&spellcheck.q=XXXXX&group.limit=5&hl.simple.pre=<b>&hl.fl=content&spellcheck.collate=true&wt=javabin&hl=true&rows=20&version=2&fl=score,approved,domain,host,id,lang,mimetype,title,tstamp,url,category&hl.snippets=3&start=0&q=%2BXXXX+-"XXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXXX"+-XXX+-"XXXXX"+-XXXX+-XXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXX+-"XXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXXX"+-"XXXXXX"+-XXXXXX+-XXXXX+-"XXXXX"+"XXXXX"+"XXXXX"+"XXXXXX"++&group.field=host&hl.simple.post=</b>&group=true&qt=search&fq=mrank:[0+TO+100]&fq=word_count:[70+TO+*]
> >> >> >>>> >>>> ******
> >> >> >>>> >>>>
> >> >> >>>> >>>> XXXX stands for the words. Every quoted phrase "XXXXX" contains two words.
> >> >> >>>> >>>>
> >> >> >>>> >>>> The timing from the DebugQuery:
> >> >> >>>> >>>>
> >> >> >>>> >>>> <lst name="timing">
> >> >> >>>> >>>> <double name="time">8654.0</double>
> >> >> >>>> >>>> <lst name="prepare">
> >> >> >>>> >>>> <double name="time">16.0</double>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >> >>>> >>>> <double name="time">16.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="process">
> >> >> >>>> >>>> <double name="time">8638.0</double>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
> >> >> >>>> >>>> <double name="time">4473.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
> >> >> >>>> >>>> <double name="time">42.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
> >> >> >>>> >>>> <double name="time">0.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
> >> >> >>>> >>>> <double name="time">1.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
> >> >> >>>> >>>> <double name="time">4122.0</double>
> >> >> >>>> >>>> </lst>
> >> >> >>>> >>>>
> >> >> >>>> >>>>
> >> >> >>>> >>>> The funny thing is that if I remove the ShingleFilter from the
> >> >> >>>> >>>> "sh_text" field below and index normally, the query time is half
> >> >> >>>> >>>> that of the current shingled one! Shouldn't a shingled index be
> >> >> >>>> >>>> better for such heavy two-word phrase searches? I am confused.
> >> >> >>>> >>>>
> >> >> >>>> >>>> On the other hand, an off-the-shelf search engine from one of the
> >> >> >>>> >>>> big FAT companies runs the same query on the same machine in
> >> >> >>>> >>>> 0.7 / 0.8 secs without cache. I am confident we can do better in
> >> >> >>>> >>>> Solr, but I couldn't find the way at the moment.
> >> >> >>>> >>>>
> >> >> >>>> >>>> Thanks for helping.
> >> >> >>>> >>>>
> >> >> >>>> >>>>
> >> >> >>>> >>>>
> >> >> >>>> >>>>
> >> >> >>>> >>>> On Sat, Aug 27, 2011 at 2:46 AM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >> >> >>>> >>>>
> >> >> >>>> >>>>>
> >> >> >>>> >>>>> On Aug 26, 2011, at 17:49 , Lord Khan Han wrote:
> >> >> >>>> >>>>>> We are indexing news documents from various sites. Currently we
> >> >> >>>> >>>>>> have 200K docs indexed. Total index size is 36 GB. There are also
> >> >> >>>> >>>>>> attachments to the news (PDFs, docs, etc.), so document size can
> >> >> >>>> >>>>>> be high (e.g. 10 MB).
> >> >> >>>> >>>>>>
> >> >> >>>> >>>>>> We are using some complex queries which include around 30-40
> >> >> >>>> >>>>>> terms per query. 70% of these terms are two-word phrases. We
> >> >> >>>> >>>>>> combine them with + and - to pinpoint the exact result.
> >> >> >>>> >>>>>> There is also grouping, dismax and boosting, and term-vector
> >> >> >>>> >>>>>> highlighting.
> >> >> >>>> >>>>> You're using a lot of componentry there, and have complex
> >> >> >>>> >>>>> queries. We need more details.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>> Turn on debugQuery=true... what do the timings say for each
> >> >> >>>> >>>>> component?
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>> Our problem is query times. Currently they are around 6-7 secs.
> >> >> >>>> >>>>>> I know our query is a little bit heavy, but we want to improve
> >> >> >>>> >>>>>> query performance. I believe we can make it sub-second, but no
> >> >> >>>> >>>>>> success at the moment.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>> Please provide an example query or two (perhaps a full line
> >> >> >>>> >>>>> logged from Solr itself), and then let's see what debugQuery
> >> >> >>>> >>>>> says about your query being parsed.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>> We tried using two-word shingle tokens, but that decreases query
> >> >> >>>> >>>>>> performance! We assumed it would help speed up phrase searches.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>> Again, we'd need to see a parsed query to understand this deeper.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>> Lots of synonym expansion?  A parsed query will tell us.
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>> (We are using the latest Solr trunk, and the hardware is pretty
> >> >> >>>> >>>>>> good: 32 cores with 32 GB of RAM.)
> >> >> >>>> >>>>>>
> >> >> >>>> >>>>>> Here is the field definition:
> >> >> >>>> >>>>>>
> >> >> >>>> >>>>>> <fieldType name="sh_text" class="solr.TextField"
> >> >> >>>> >>>>>>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >> >> >>>> >>>>>>    <analyzer type="index">
> >> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >> >>>> >>>>>>              words="stopwords.txt" enablePositionIncrements="true" />
> >> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
> >> >> >>>> >>>>>>              generateWordParts="1" generateNumberParts="1" catenateWords="1"
> >> >> >>>> >>>>>>              catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
> >> >> >>>> >>>>>>              protected="protwords.txt"/>
> >> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >> >>>> >>>>>>              outputUnigrams="true"/>
> >> >> >>>> >>>>>>    </analyzer>
> >> >> >>>> >>>>>>    <analyzer type="query">
> >> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >> >>>> >>>>>>      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >> >>>> >>>>>>              ignoreCase="true" expand="true"/>
> >> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >> >>>> >>>>>>              words="stopwords.txt" enablePositionIncrements="true" />
> >> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
> >> >> >>>> >>>>>>              generateWordParts="1" generateNumberParts="1" catenateWords="0"
> >> >> >>>> >>>>>>              catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
> >> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
> >> >> >>>> >>>>>>              protected="protwords.txt"/>
> >> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
> >> >> >>>> >>>>>>              outputUnigrams="true"/>
> >> >> >>>> >>>>>>    </analyzer>
> >> >> >>>> >>>>>>  </fieldType>
> >> >> >>>> >>>>>>
> >> >> >>>> >>>>>> and
> >> >> >>>> >>>>>>
> >> >> >>>> >>>>>> <field name="content" type="sh_text" stored="true" indexed="true"
> >> >> >>>> >>>>>>        termVectors="true" termPositions="true" termOffsets="true"/>
> >> >> >>>> >>>>>
> >> >> >>>> >>>>>
> >> >> >>>> >>>>
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>
