Hi Erick,

Thanks for your reply. Actually, I ran the following search:
survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*

I did not prepend an asterisk to any of the field values; I only appended one to each.
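
For reference, a query string of this shape can be assembled as in the sketch below. It is only an illustration under assumptions: the helper names are made up, the escape set is the list of special characters from the classic Lucene query parser syntax, and "/" is deliberately left unescaped to match the query I sent (newer Lucene versions treat it as special too, as far as I know).

```python
# Characters the classic Lucene query parser treats as syntax.
# Assumption: "/" is not escaped here, to match the query in this thread.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\')


def escape_term(value: str) -> str:
    """Backslash-escape query-parser syntax characters in a literal value."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in value)


def prefix_clause(field: str, prefix: str) -> str:
    """Build 'field:escaped_prefix*' with a trailing wildcard only."""
    return f"{field}:{escape_term(prefix)}*"


def build_query(fields, prefix: str) -> str:
    """Join one prefix clause per field with spaces (default operator applies)."""
    return " ".join(prefix_clause(field, prefix) for field in fields)


query = build_query(
    ["survey_url", "referal_url", "page_url"],
    "http://www.someurl.com/sch/i.html",
)
print(query)
```

The wildcard is added after escaping, so the trailing asterisk stays a real wildcard while the colon in the URL is treated as literal text.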

I ran the url field through the analysis page in the Solr admin UI, and it
gave me the output below, which I take to mean the URL is not tokenized.
I noticed you mentioned WordDelimiterFilterFactory. Do I need to configure
it in schema.xml or somewhere else?

term position:     1
term text:         http://www.someurl.com/sch/i.html*
term type:         word
source start,end:  0,31

I added debugQuery=on to the query URL and got the following (sorry for
pasting such a long, cryptic dump; it is quite mysterious to me):
<lst name="debug">
    <str name="rawquerystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
    <str name="querystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
    <str name="parsedquery">survey_url:http://www.someurl.com/sch/i.html* referal_url:http://www.someurl.com/sch/i.html* page_url:http://www.someurl.com/sch/i.html*</str>
    <str name="parsedquery_toString">survey_url:http://www.someurl.com/sch/i.html* referal_url:http://www.someurl.com/sch/i.html* page_url:http://www.someurl.com/sch/i.html*</str>
    <lst name="explain">
        <str name="5007688343">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007648909">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007653989">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007709065">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007710379">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007739634">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007753066">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007756045">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007832978">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
        <str name="5007849124">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
        </str>
    </lst>
    <str name="QParser">LuceneQParser</str>
    <lst name="timing">
        <double name="time">17156.0</double>
        <lst name="prepare">
            <double name="time">0.0</double>
            <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
        </lst>
        <lst name="process">
            <double name="time">17156.0</double>
            <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
            <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">17156.0</double></lst>
        </lst>
    </lst>
</lst>
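
Since the timing part of the dump is hard to read by eye, here is a minimal sketch that pulls the per-component times out of it. It assumes the timing block has been saved as a standalone XML string (copied from the dump above); the function and variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Copy of the <lst name="timing"> block from the debug output above.
PKG = "org.apache.solr.handler.component."
COMPONENTS = ["Query", "Facet", "MoreLikeThis", "Highlight", "Stats", "Debug"]


def phase_xml(phase, total, times):
    """Rebuild one phase (<lst name="prepare"> or "process") as XML text."""
    rows = "".join(
        f'<lst name="{PKG}{name}Component"><double name="time">{t}</double></lst>'
        for name, t in zip(COMPONENTS, times)
    )
    return f'<lst name="{phase}"><double name="time">{total}</double>{rows}</lst>'


TIMING_XML = (
    '<lst name="timing"><double name="time">17156.0</double>'
    + phase_xml("prepare", 0.0, [0.0] * 6)
    + phase_xml("process", 17156.0, [0.0, 0.0, 0.0, 0.0, 0.0, 17156.0])
    + "</lst>"
)


def component_times(timing_xml):
    """Map (phase, short component name) to the reported time in ms."""
    root = ET.fromstring(timing_xml)
    times = {}
    for phase in root.findall("lst"):
        for comp in phase.findall("lst"):
            short_name = comp.get("name").rsplit(".", 1)[-1]
            times[(phase.get("name"), short_name)] = float(comp.find("double").text)
    return times


times = component_times(TIMING_XML)
slowest = max(times, key=times.get)
print(slowest, times[slowest])
```

Read this way, the dump attributes the whole 17156.0 ms to DebugComponent in the process phase, and reports 0.0 for QueryComponent.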



2012/1/9 Erick Erickson <erickerick...@gmail.com>

> Yu Shen & Arian:
>
> We can't help much without more information. In particular, how are
> the fields in question analyzed? What is the result of looking
> at the admin/analysis page? What do you get when you
> attach &debugQuery=on to the query?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> But at a wild guess, you have something like WordDelimiterFilterFactory
> in your analysis chain, and it's splitting up your input into
> "www" "someurl" "com" as separate tokens, and www matches
> all documents so Solr is having to score all documents in your corpus, but
> that's just a guess. See the admin/schema browser page and find the most
> frequent terms for the field in question, that should indicate whether
> you have some tokens that appear in all docs. Try searching on
> plain "someurl". Is that slow? Or "someurl.anotherpart" or whatever.
>
> Best
> Erick
>
> 2012/1/9 François Schiettecatte <fschietteca...@gmail.com>:
> > About the search 'referal_url:*www.someurl.com*', having a wildcard at
> > the start will cause a dictionary scan for every term you search on unless
> > you use ReversedWildcardFilterFactory. That could be the cause of your
> > slowdown if you are I/O bound, and even if you are CPU bound for that
> > matter.
> >
> > François
> >
> >
> > On Jan 8, 2012, at 8:44 PM, yu shen wrote:
> >
> >> Hi,
> >>
> >> My solr document has up to 20 fields, containing data from product name,
> >> date, url etc.
> >>
> >> The volume of documents is around 1.5m.
> >>
> >> My symptom is that a url search like [ url:*www.someurl.com*
> >> referal_url:*www.someurl.com* page_url:*www.someurl.com*] gets an
> >> extraordinarily long response time, while searches against all other
> >> fields have normal response times.
> >>
> >> Can anyone share any insights on this?
> >>
> >> Spark
> >
>
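
François's point about leading wildcards can be illustrated with a toy model. This is not Lucene's implementation, only a sketch under assumptions: the term dictionary is a sorted Python list, a trailing wildcard is answered with a binary search on it, and a single leading wildcard is answered the same way against a second list of reversed terms, which is roughly the idea behind ReversedWildcardFilterFactory. The sample URLs and function names are hypothetical.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return terms starting with prefix, found by binary search plus a short scan."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches


def suffix_matches(sorted_reversed_terms, suffix):
    """Answer '*suffix' as a prefix lookup on the reversed-term list."""
    hits = prefix_matches(sorted_reversed_terms, suffix[::-1])
    return sorted(term[::-1] for term in hits)


terms = sorted([
    "http://www.someurl.com/sch/i.html",
    "http://www.someurl.com/sch/j.html",
    "http://www.other.com/a.html",
    "http://other.org/sch/i.html",
])
# Second "index" holding each term reversed, as a reversing filter would add.
reversed_terms = sorted(term[::-1] for term in terms)

print(prefix_matches(terms, "http://www.someurl.com/sch/"))
print(suffix_matches(reversed_terms, "/sch/i.html"))
```

A pattern with wildcards at both ends, like the one in the original question, is not covered by this trick; the sketch only handles one wildcard at one end.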
