Hi Erick,

Thanks for your reply. Actually, I ran the following search:

survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*
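For reference, a query of this shape (colon escaped, asterisk appended only at the end) can be assembled programmatically. The following is a minimal Python sketch, not the code used for the search above: the helper names and the base URL http://localhost:8983/solr/select are placeholders chosen for illustration. It escapes only the characters that the classic Lucene query parser treats as syntax; Solr 4.0 and later additionally treat "/" as special, which this sketch deliberately leaves alone so the result matches the query shown above.

```python
from urllib.parse import urlencode

# Single characters the classic Lucene query parser treats as syntax.
# Assumption: "/" is left out on purpose (it only became special in Solr 4.0+).
LUCENE_SPECIAL = set('+-!(){}[]^"~*?:\\')


def escape_term(text):
    """Backslash-escape query-parser syntax characters in a literal term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def prefix_query(fields, prefix):
    """Build one trailing-wildcard (prefix) clause per field, space separated."""
    escaped = escape_term(prefix)
    return " ".join(f"{field}:{escaped}*" for field in fields)


def select_url(base, query, debug=True):
    """Return a select URL for the query; debugQuery=on is added when debug is True."""
    params = {"q": query}
    if debug:
        params["debugQuery"] = "on"
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    q = prefix_query(
        ["survey_url", "referal_url", "page_url"],
        "http://www.someurl.com/sch/i.html",
    )
    print(q)
    # Hypothetical host and handler path; adjust to the actual Solr instance.
    print(select_url("http://localhost:8983/solr/select", q))
```

Because the asterisk is appended after escaping, it stays a wildcard, while the colon inside the URL is treated as literal text rather than as a field separator.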
I did not prepend an asterisk to the field values; I only appended one. I analyzed the url field on the Solr admin analysis page, and it gave me the following, which means the URL is not tokenized:

term position      1
term text          http://www.someurl.com/sch/i.html*
term type          word
source start,end   0,31

I noticed you mentioned WordDelimiterFilterFactory. Do I need to configure it in schema.xml or somewhere else?

I added debugQuery=on to the query URL and got the output below. (Sorry to paste such a long, cryptic dump here; it is really mysterious to me.)

<lst name="debug">
  <str name="rawquerystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
  <str name="querystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
  <str name="parsedquery">survey_url:http://www.someurl.com/sch/i.html* referal_url: http://www.someurl.com/sch/i.html* page_url: http://www.someurl.com/sch/i.html*</str>
  <str name="parsedquery_toString">survey_url: http://www.someurl.com/sch/i.html* referal_url: http://www.someurl.com/sch/i.html* page_url: http://www.someurl.com/sch/i.html*</str>
  <lst name="explain">
    <str name="5007688343">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007648909">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007653989">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007709065">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007710379">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007739634">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007753066">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007756045">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007832978">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
    <str name="5007849124">
0.76980036 = (MATCH) product of:
  1.1547005 = (MATCH) sum of:
    0.57735026 = (MATCH) ConstantScoreQuery(referal_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
    0.57735026 = (MATCH) ConstantScoreQuery(page_url: http://www.someurl.com/sch/i.html*), product of:
      1.0 = boost
      0.57735026 = queryNorm
  0.6666667 = coord(2/3)
    </str>
  </lst>
  <str name="QParser">LuceneQParser</str>
  <lst name="timing">
    <double name="time">17156.0</double>
    <lst name="prepare">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
    </lst>
    <lst name="process">
      <double name="time">17156.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
      <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">17156.0</double></lst>
    </lst>
  </lst>
</lst>

2012/1/9 Erick Erickson <erickerick...@gmail.com>

> Yu Shen & Arian:
>
> We can't help much without more information. In particular, how are
> the fields in question analyzed? What is the result of looking
> at the admin/analysis page? What do you get when you
> attach &debugQuery=on to the query?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> But at a wild guess, you have something like WordDelimiterFilterFactory
> in your analysis chain, and it's splitting up your input into
> "www" "someurl" "com" as separate tokens, and www matches
> all documents so Solr is having to score all documents in your corpus, but
> that's just a guess. See the admin/schema browser page and find the most
> frequent terms for the field in question, that should indicate whether
> you have some tokens that appear in all docs. Try searching on
> plain "someurl". Is that slow? Or "someurl.anotherpart" or whatever.
>
> Best
> Erick
>
> 2012/1/9 François Schiettecatte <fschietteca...@gmail.com>:
> > About the search 'referal_url:*www.someurl.com*', having a wildcard at
> > the start will cause a dictionary scan for every term you search on unless
> > you use ReversedWildcardFilterFactory. That could be the cause of your
> > slowdown if you are I/O bound, and even if you are CPU bound for that
> > matter.
> >
> > François
> >
> >
> > On Jan 8, 2012, at 8:44 PM, yu shen wrote:
> >
> >> Hi,
> >>
> >> My solr document has up to 20 fields, containing data from product name,
> >> date, url etc.
> >>
> >> The volume of documents is around 1.5m.
> >>
> >> My symptom is when doing url search like [ url:*www.someurl.com*
> >> referal_url:*www.someurl.com* page_url:*www.someurl.com*] will get a
> >> extraordinary long response time, while search against all other fields,
> >> the response time will be normal.
> >>
> >> Can anyone share any insights on this?
> >>
> >> Spark
> >
>
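
One way to make the timing section of a debugQuery response easier to read is to reduce it to a per-component table. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and SAMPLE_DEBUG is a trimmed stand-in with the same shape as the debug output earlier in this message, not a full response. Applied to the output pasted earlier in this message, a summary of this kind would show the whole 17156.0 of the process phase reported under DebugComponent, with QueryComponent at 0.0.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample shaped like the "timing" part of a debugQuery response.
SAMPLE_DEBUG = """
<lst name="debug">
  <lst name="timing">
    <double name="time">17156.0</double>
    <lst name="process">
      <double name="time">17156.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">17156.0</double>
      </lst>
    </lst>
  </lst>
</lst>
"""


def component_timings(debug_xml, phase="process"):
    """Map short component name -> reported time for one timing phase."""
    root = ET.fromstring(debug_xml.strip())
    timing = root.find(".//lst[@name='timing']")
    if timing is None:
        return {}
    phase_node = timing.find(f"lst[@name='{phase}']")
    if phase_node is None:
        return {}
    timings = {}
    for component in phase_node.findall("lst"):
        time_node = component.find("double[@name='time']")
        if time_node is not None:
            # Keep only the class name, dropping the package prefix.
            short_name = component.get("name").rsplit(".", 1)[-1]
            timings[short_name] = float(time_node.text)
    return timings


def slowest_component(debug_xml, phase="process"):
    """Return (name, time) of the component with the largest time, or None."""
    timings = component_timings(debug_xml, phase)
    if not timings:
        return None
    name = max(timings, key=timings.get)
    return name, timings[name]


if __name__ == "__main__":
    for name, value in component_timings(SAMPLE_DEBUG).items():
        print(f"{name:25s} {value:10.1f}")
    print("slowest:", slowest_component(SAMPLE_DEBUG))
```

Comparing the same request with and without debugQuery=on is a simple way to check whether the reported time belongs to the search itself or to producing the explain output.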