Hi Erick,

I only added debugQuery=on to the URL, and did not do any configuration with regard to the DebugComponent. It seems the 'string' type should be replaced with a 'text' type.
I will paste the results here after I have run some experiments.

Spark

2012/1/9 Erick Erickson <erickerick...@gmail.com>

> Do you by chance have debugQuery on by default?
> Because if you look down in the "timing" section,
> you can see the times the various components took to do
> their work; there are two sections, "prepare" and "process".
>
> The cumulative time is 17.156 seconds, of which 17.156
> seconds is reported to be in the DebugComponent.
>
> So what happens if you just turn that component off? Because
> I don't see anything in your output that really looks like it is
> taking any time. Of course, if you've changed your code from
> *url* to url*, that will account for time too, since the infix case
> requires that every term in the fields in question be examined.
>
> About WordDelimiterFilterFactory: that is irrelevant for a "string"
> type. It's an open question whether a string type is what you
> want, but that is determined by your problem space. You might
> spend some time with admin/analysis to see the effects of
> various analysis chains. "string" is used when you want no
> tokenization, no case transformations, etc.
>
> Best
> Erick
>
> On Mon, Jan 9, 2012 at 10:04 AM, yu shen <shenyu...@gmail.com> wrote:
> > Hi Erick,
> >
> > Thanks for your reply. Actually I did the following search:
> > survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*
> >
> > I did not prepend any asterisk to the field values, but only appended one to
> > them.
> >
> > I analyzed the url field on the Solr admin page, and it gave me this, meaning the
> > url is not tokenized. I notice you mentioned a WordDelimiterFilterFactory.
> > Do I need to configure it in schema.xml or some place else?
> >
> > term position     1
> > term text         http://www.someurl.com/sch/i.html*
> > term type         word
> > source start,end  0,31
> >
> > I add the debugQuery=on to the query url, I got this (Sorry to paste such
> > long encrypted code here, they are really mysterious to me)
> >
> > <lst name="debug">
> >   <str name="rawquerystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
> >   <str name="querystring">survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html*</str>
> >   <str name="parsedquery">survey_url:http://www.someurl.com/sch/i.html* referal_url:http://www.someurl.com/sch/i.html* page_url:http://www.someurl.com/sch/i.html*</str>
> >   <str name="parsedquery_toString">survey_url:http://www.someurl.com/sch/i.html* referal_url:http://www.someurl.com/sch/i.html* page_url:http://www.someurl.com/sch/i.html*</str>
> >   <lst name="explain">
> >     <str name="5007688343">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007648909">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007653989">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007709065">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007710379">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007739634">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007753066">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007756045">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007832978">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >     <str name="5007849124">
> > 0.76980036 = (MATCH) product of:
> >   1.1547005 = (MATCH) sum of:
> >     0.57735026 = (MATCH) ConstantScoreQuery(referal_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >     0.57735026 = (MATCH) ConstantScoreQuery(page_url:http://www.someurl.com/sch/i.html*), product of:
> >       1.0 = boost
> >       0.57735026 = queryNorm
> >   0.6666667 = coord(2/3)
> >     </str>
> >   </lst>
> >   <str name="QParser">LuceneQParser</str>
> >   <lst name="timing">
> >     <double name="time">17156.0</double>
> >     <lst name="prepare">
> >       <double name="time">0.0</double>
> >       <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst>
> >     </lst>
> >     <lst name="process">
> >       <double name="time">17156.0</double>
> >       <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.StatsComponent"><double name="time">0.0</double></lst>
> >       <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">17156.0</double></lst>
> >     </lst>
> >   </lst>
> > </lst>
> >
> >
> > 2012/1/9 Erick Erickson <erickerick...@gmail.com>
> >
> >> Yu Shen & Arian:
> >>
> >> We can't help much without more information. In particular, how are
> >> the fields in question analyzed? What is the result of looking
> >> at the admin/analysis page? What do you get when you
> >> attach &debugQuery=on to the query?
> >>
> >> You might review:
> >> http://wiki.apache.org/solr/UsingMailingLists
> >>
> >> But at a wild guess, you have something like WordDelimiterFilterFactory
> >> in your analysis chain, and it's splitting up your input into
> >> "www" "someurl" "com" as separate tokens, and www matches
> >> all documents so Solr is having to score all documents in your corpus, but
> >> that's just a guess.
> >> See the admin/schema browser page and find the most
> >> frequent terms for the field in question; that should indicate whether
> >> you have some tokens that appear in all docs. Try searching on
> >> plain "someurl". Is that slow? Or "someurl.anotherpart" or whatever.
> >>
> >> Best
> >> Erick
> >>
> >> 2012/1/9 François Schiettecatte <fschietteca...@gmail.com>:
> >> > About the search 'referal_url:*www.someurl.com*', having a wildcard at
> >> > the start will cause a dictionary scan for every term you search on unless
> >> > you use ReversedWildcardFilterFactory. That could be the cause of your
> >> > slowdown if you are I/O bound, and even if you are CPU bound for that
> >> > matter.
> >> >
> >> > François
> >> >
> >> >
> >> > On Jan 8, 2012, at 8:44 PM, yu shen wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> My solr document has up to 20 fields, containing data from product name,
> >> >> date, url etc.
> >> >>
> >> >> The volume of documents is around 1.5m.
> >> >>
> >> >> My symptom is that a url search like [ url:*www.someurl.com*
> >> >> referal_url:*www.someurl.com* page_url:*www.someurl.com*] gets an
> >> >> extraordinarily long response time, while for searches against all other
> >> >> fields the response time is normal.
> >> >>
> >> >> Can anyone share any insights on this?
> >> >>
> >> >> Spark
> >> >
> >> >
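
A footnote on the wildcard point raised in this thread (a leading wildcard forces a scan of the term dictionary, while a trailing wildcard does not): the difference can be illustrated with a toy model. The sketch below is only an assumption-laden illustration in Python, not how Lucene or Solr actually implement it. It treats the term dictionary as a plain sorted list, and the function names prefix_matches and infix_matches are made up for this example.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Model of a trailing-wildcard query (prefix*).

    Terms sharing a prefix are contiguous in a sorted list, so we can
    binary-search to the first candidate and stop at the first mismatch.
    """
    matches = []
    i = bisect_left(sorted_terms, prefix)
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def infix_matches(sorted_terms, fragment):
    """Model of a leading-wildcard query (*fragment*).

    Sort order gives no help here, so every term has to be examined.
    """
    return [term for term in sorted_terms if fragment in term]


if __name__ == "__main__":
    # Hypothetical terms standing in for the values of an untokenized url field.
    terms = sorted([
        "http://www.other.example/",
        "http://www.someurl.com/sch/i.html",
        "http://www.someurl.com/sch/i.html?q=1",
        "https://www.someurl.com/",
    ])
    print(prefix_matches(terms, "http://www.someurl.com/sch/i.html"))
    print(infix_matches(terms, "www.someurl.com"))
```

In this model the prefix lookup touches only the matching terms plus one, while the infix lookup touches every term in the field, which is the cost that ReversedWildcardFilterFactory is meant to avoid for leading wildcards.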