Hello Jack,

Thanks for pointing out the issues and for your valuable suggestion. My preliminary search tests were okay, but I will be doing more testing to see whether this has affected any other searches.

Thanks once again and have a nice sunny weekend,
Sandeep

On 17 May 2013 05:35, Jack Krupansky <j...@basetechnology.com> wrote:

> Ah... I think your issue is the preserveOriginal=1 on the query analyzer
> as well as the fact that you have all of these catenatexx="1" options on
> the query analyzer - I indicated that you should remove them all.
>
> The problem is that the whitespace analyzer leaves the leading comma in
> place, and the preserveOriginal="1" also generates an extra token for the
> term, with the comma in place. But, with the space, the comma and "10" are
> separate terms and get analyzed independently.
>
> The query results probably indicate that you don't have that exact
> combination of the term and leading punctuation - or that there is no
> standalone comma in your input data.
>
> Try the following replacement for the query-time WDF:
>
> <filter class="solr.WordDelimiterFilterFactory"
>         stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
>         catenateWords="0" catenateNumbers="0" catenateAll="0"
>         splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="0" />
>
> -- Jack Krupansky
>
> -----Original Message----- From: Sandeep Mestry
> Sent: Thursday, May 16, 2013 5:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Question about Edismax - Solr 4.0
>
> Hi Jack,
>
> Thanks for your response again and for helping me out to get through this.
>
> The URL is definitely encoded for spaces and it looks like below. As I
> mentioned in my previous mail, I can't add it to query parameter as that
> searches on multiple fields.
>
> The title field is defined as below:
> <field name="title" type="text_wc" indexed="true" stored="false"
>        multiValued="true"/>
>
> q=countryside&rows=20&qt=assdismax&fq=%28title%3A%28,
> 10%29%29&fq=collection:assets
>
> <requestHandler name="assdismax" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">edismax</str>
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.01</float>
>     <str name="qf">title^10 description^5 annotations^3 notes^2 categories</str>
>     <str name="pf">title</str>
>     <int name="ps">0</int>
>     <str name="q.alt">*:*</str>
>     <str name="fl">*,score</str>
>     <str name="mm">100%</str>
>     <str name="q.op">AND</str>
>     <str name="sort">score desc</str>
>     <str name="facet">true</str>
>     <str name="facet.limit">-1</str>
>     <str name="facet.mincount">1</str>
>     <str name="facet.field">uniq_subtype_id</str>
>     <str name="facet.field">component_type</str>
>     <str name="facet.field">genre_type</str>
>   </lst>
>   <lst name="appends">
>     <str name="fq">collection:assets</str>
>   </lst>
> </requestHandler>
>
> The term 'countryside' needs to be searched against multiple fields
> including titles, descriptions, annotations, categories, notes but the UI
> also has a feature to limit results by providing a title field.
>
> I can see that the filter queries are always parsed by LuceneQueryParser
> however I'd expect it to generate the parsed_filter_queries debug output in
> every situation.
>
> I have tried it as the main query with both edismax and lucene defType and
> it gives me correct output and correct results.
> But, there is some problem when this is used as a filter query as the
> parser is not able to parse a comma with a space.
>
> Thanks again Jack, please let me know in case you need more inputs from my
> side.
>
> Best Regards,
> Sandeep
>
> On 16 May 2013 18:03, Jack Krupansky <j...@basetechnology.com> wrote:
>
>> Could you show us the full query URL - spaces must be encoded in URL query
>> parameters.
>>
>> Also show the actual field XML - you omitted that.
>>
>> Try the same query as a main query, using both defType=edismax and
>> defType=lucene.
>>
>> Note that the filter query is parsed using the Lucene query parser, not
>> edismax, independent of the defType parameter. But you don't have any
>> edismax features in your fq anyway.
>>
>> But you can stick {!edismax} in front of the query to force edismax to be
>> used for the fq, although it really shouldn't change anything:
>>
>> Also, catenate is fine for indexing, but will mess up your queries at
>> query time, so set them to "0" in the query analyzer
>>
>> Also, make sure you have autoGeneratePhraseQueries="true" on the field
>> type, but that's not the issue here.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Sandeep Mestry
>> Sent: Thursday, May 16, 2013 12:42 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Question about Edismax - Solr 4.0
>>
>> Thanks Jack for your reply..
>>
>> The problem is, I'm finding results for fq=title:(,10) but not for
>> fq=title:(, 10) - apologies if that was not clear from my first mail.
>> I have already mentioned the debug analysis in my previous mail.
>>
>> Additionally, the title field is defined as below:
>> <fieldType name="text_wc" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="index">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
>>             catenateWords="1" catenateNumbers="1" catenateAll="1"
>>             splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1" />
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
>>             catenateWords="1" catenateNumbers="1" catenateAll="1"
>>             splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1" />
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> I have set the catenate options to 1 for all types.
>> I can understand ',' getting ignored when it is on its own
>> (title:(, 10)) but
>> - Why is Solr not searching for 10 in that case, just like it did when the
>> query was (title:(,10))?
>> - And why did the other filter queries (collection:assets) not show up in
>> the debug section?
>>
>> Thanks,
>> Sandeep
>>
>> On 16 May 2013 13:57, Jack Krupansky <j...@basetechnology.com> wrote:
>>
>>> You haven't indicated any problem here! What is the symptom that you
>>> actually think is a problem.
>>>
>>> There is no comma operator in any of the Solr query parsers. Comma is just
>>> another character that may or may not be included or discarded depending on
>>> the specific field type and analyzer. For example, a white space analyzer
>>> will keep commas, but the standard analyzer or the word delimiter filter
>>> will discard them.
>>> If "title" were a "string" type, all punctuation would be preserved,
>>> including commas and spaces (but spaces would need to be escaped or the
>>> term text enclosed in parentheses.)
>>>
>>> Let us know what your symptom is though, first.
>>>
>>> I mean, the filter query looks perfectly reasonable from an abstract
>>> perspective.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Sandeep Mestry
>>> Sent: Thursday, May 16, 2013 6:51 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Question about Edismax - Solr 4.0
>>>
>>> -- *Edismax and Filter Queries with Commas and spaces* --
>>>
>>> Dear Experts,
>>>
>>> This appears to be a bug, please suggest if I'm wrong.
>>>
>>> If I search with the following filter query,
>>>
>>> 1) fq=title:(, 10)
>>>
>>> - I get no results.
>>> - The debug output does NOT show the section containing
>>>   parsed_filter_queries
>>>
>>> If I carry out a search with the filter query,
>>>
>>> 2) fq=title:(,10) - (No space between , and 10)
>>>
>>> - I get results and the debug output shows the parsed filter queries
>>>   section as,
>>> <arr name="filter_queries">
>>> <str>(titles:(,10))</str>
>>> <str>(collection:assets)</str>
>>>
>>> As you can see above, I'm also passing in other filter queries
>>> (collection:assets) which appear correctly but they do not appear in
>>> case 1 above.
>>>
>>> I can't make this part of the query parameter as that needs to be
>>> searched against multiple fields.
>>>
>>> Can someone suggest a fix in this case please. I'm using Solr 4.0.
>>>
>>> Many Thanks,
>>> Sandeep
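
On the point raised earlier in the thread that spaces must be encoded in URL query parameters, here is a minimal Python sketch, assuming only the standard library's urllib.parse. It builds the query string from the parameter values quoted in the messages above, so the space inside title:(, 10) is encoded rather than sent literally. The helper name build_query_string is hypothetical and not part of Solr or any client library.

```python
from urllib.parse import parse_qs, urlencode

# Parameter values copied from the thread. A list of pairs (rather than a
# dict) keeps both fq entries, since the key is repeated.
PARAMS = [
    ("q", "countryside"),
    ("rows", "20"),
    ("qt", "assdismax"),
    ("fq", "(title:(, 10))"),
    ("fq", "collection:assets"),
]


def build_query_string(params):
    """Hypothetical helper: encode each value and join the pairs with '&'."""
    # urlencode applies quote_plus by default, so a space is sent as '+'
    # and characters such as '(', ':' and ',' are percent-encoded.
    return urlencode(params)


query_string = build_query_string(PARAMS)
print(query_string)

# Decoding the string gives back both filter queries, space included.
decoded = parse_qs(query_string)
print(decoded["fq"])
```

Encoding the request this way only rules out a malformed URL as the cause; it does not change how the query analyzer tokenizes the comma and "10".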