Re: Solr 3.4 problem with words separated by coma without space

darren Thu, 08 Dec 2011 07:08:03 -0800

This would seem to indicate that you are using a whitespace analyzer on
the default search field. I believe other analyzers will properly tokenize
around the comma.
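
To make the difference concrete, here is a rough Python sketch. It is only an illustration, not Solr's actual analyzers: the two function names are made up, and the regex is a stand-in for a tokenizer that treats punctuation as a separator.

```python
import re


def whitespace_tokens(query):
    # Hypothetical stand-in for a whitespace-only analyzer:
    # split on runs of whitespace, so punctuation stays glued to the words.
    return query.split()


def punctuation_aware_tokens(query):
    # Hypothetical stand-in for a tokenizer that breaks on punctuation:
    # keep runs of word characters, so a comma acts as a separator.
    return re.findall(r"\w+", query)


with_space = "name, number rue taine paris"
without_space = "name,number rue taine paris"

print(whitespace_tokens(with_space))            # 5 tokens, first one is "name,"
print(whitespace_tokens(without_space))         # 4 tokens, first one is "name,number"
print(punctuation_aware_tokens(without_space))  # 5 tokens, comma dropped
```

With whitespace-only splitting, the query without the space yields the single token "name,number", which would not match documents indexed under "name" and "number" separately.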


> same problem with Solr 4.0
>
> 2011/12/8 elisabeth benoit <elisaelisael...@gmail.com>
>
>>
>>
>> Hello,
>>
>> I'm using Solr 3.4, and I'm having a problem with a request returning
>> different results depending on whether there is a space after a comma.
>>
>> The request "name, number rue taine paris" returns results with 4 words
>> out of 5 matching ("name", "number", "rue", "paris")
>>
>> The request "name,number rue taine paris" (no space between the comma and
>> "number") returns no results, unless I set mm=3, and then the matching
>> words are "rue", "taine", "paris".
>>
>> If I check in the solr.admin.analyzer, I get the same analysis for the
>> two different requests. But it seems, in fact, that the missing space
>> after the comma prevents "name" and "number" from matching.
>>
>>
>> My field type is
>>
>>
>>       <analyzer type="query">
>>         <!-- standard tokenization -->
>>         <tokenizer class="solr.StandardTokenizerFactory"/>
>>         <!-- normalize accents, cedillas, o-e ligatures, ... -->
>>         <charFilter class="solr.MappingCharFilterFactory"
>>                     mapping="mapping-ISOLatin1Accent.txt"/>
>>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>>         <!-- remove dots (I.B.M. => IBM) -->
>>         <filter class="solr.StandardFilterFactory"/>
>>         <!-- convert to lowercase -->
>>         <filter class="solr.LowerCaseFilterFactory"/>
>>         <!-- strip leading and trailing punctuation -->
>>         <filter class="solr.PatternReplaceFilterFactory"
>>                 pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
>>         <!-- remove empty tokens and overly long words -->
>>         <filter class="solr.LengthFilterFactory" min="1" max="100" />
>>         <!-- split compound words -->
>>         <filter class="solr.WordDelimiterFilterFactory"
>>                 splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>>                 generateWordParts="1" generateNumberParts="1"
>>                 catenateWords="0" catenateNumbers="1"
>>                 catenateAll="0" preserveOriginal="1"/>
>>         <!-- remove elisions (l', qu', ...) -->
>>         <filter class="solr.ElisionFilterFactory"
>>                 articles="elisionwords.txt"/>
>>         <!-- remove stop words -->
>>         <filter class="solr.StopFilterFactory" ignoreCase="1"
>>                 words="stopwords.txt" enablePositionIncrements="true"/>
>>         <!-- stemming (plurals, ...) -->
>>         <filter class="solr.SnowballPorterFilterFactory"
>>                 language="French"
>>                 protected="protwords.txt"/>
>>         <!-- remove any duplicate tokens -->
>>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>       </analyzer>
>>
>> Does anyone have a clue?
>>
>> Thanks,
>> Elisabeth
>>
>
