Hi Dalius,

If you have not already tried it, check http://localhost:8983/solr/admin/analysis.jsp for your queries (enable verbose output for both the index and the query field value) and see which tokenizers and filters are being applied.
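
If you would rather script the same check than click through the page, here is a rough Python sketch that only builds the request URLs. It assumes a solr.FieldAnalysisRequestHandler is registered at /analysis/field in your solrconfig.xml (that path is an assumption about your setup, not something I have checked). The function name build_analysis_url is hypothetical; the field types and phrases are taken from your mail.

```python
from urllib.parse import urlencode

# Assumed base URL; adjust host, port and core path to your installation.
SOLR_BASE = "http://localhost:8983/solr"


def build_analysis_url(field_type, field_value, query, base=SOLR_BASE):
    """Build a field-analysis URL for one field type and one query phrase.

    Assumes a FieldAnalysisRequestHandler is mapped to /analysis/field.
    """
    params = {
        "analysis.fieldtype": field_type,    # fieldType name from schema.xml
        "analysis.fieldvalue": field_value,  # text run through the index analyzer
        "analysis.query": query,             # text run through the query analyzer
    }
    # urlencode percent-encodes non-ASCII characters as UTF-8 by default.
    return base + "/analysis/field?" + urlencode(params)


if __name__ == "__main__":
    # Field types and search phrases from the original report.
    for field_type in ("xml", "xml_unicode", "xml_unicode_full"):
        for phrase in ("cal.lígra?", "cal.ligra?", "cal.ligraf", "calligra?"):
            print(build_analysis_url(field_type, "cal.lígraf", phrase))
```

Opening each printed URL in a browser should show the index-time and query-time token streams for that field type, which you can compare against what analysis.jsp reports.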

Hope it helps!

-param

On 2/8/12 10:48 AM, "Dalius Sidlauskas" <dalius.sidlaus...@semantico.com> wrote:

>If you cannot read this mail easily, check this ticket:
>https://issues.apache.org/jira/browse/SOLR-3106 This is a copy.
>
>Regards!
>Dalius Sidlauskas
>
>
>On 08/02/12 15:44, Dalius Sidlauskas wrote:
>> Sorry for the inaccurate title.
>>
>> I have three fields (dc_title, dc_title_unicode, dc_unicode_full)
>> containing the same value:
>>
>> <title xmlns="http://www.tei-c.org/ns/1.0">cal.lígraf</title>
>>
>> and these fields are configured accordingly:
>>
>> <fieldType name="xml" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer type="index">
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.ICUFoldingFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.ICUFoldingFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> <fieldType name="xml_unicode" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer type="index">
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> <fieldType name="xml_unicode_full" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer type="index">
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> And finally my search configuration:
>>
>> <requestHandler name="dictionary" class="solr.SearchHandler">
>>   <lst name="defaults">
>>     <str name="echoParams">all</str>
>>     <str name="defType">edismax</str>
>>     <str name="mm">2&lt;-25%</str>
>>     <str name="qf">dc_title_unicode_full^2 dc_title_unicode^2 dc_title</str>
>>     <int name="rows">10</int>
>>     <str name="spellcheck.onlyMorePopular">true</str>
>>     <str name="spellcheck.extendedResults">false</str>
>>     <str name="spellcheck.count">1</str>
>>   </lst>
>>   <arr name="last-components">
>>     <str>spellcheck</str>
>>   </arr>
>> </requestHandler>
>>
>> I am trying to match the field with various search phrases (all of
>> which are valid). Here are the results:
>>
>>
>> #  search phrase  match?  Comment
>> 1  cal.lígra?     yes
>> 2  cal.ligra?     no      Changed í to i
>> 3  cal.ligraf     yes
>> 4  calligra?      no
>>
>>
>> The problem is attempt #2, which fails to match the data. Attempt #3
>> works when ? is replaced with f.
>>
>> One more thing: if * is used instead of ?, other data such as
>> cal.lígrafia is matched, but not cal.lígraf...
>>
>> Also, I have spotted a logic mismatch in the debug parsedQuery field:
>>
>> *cal·lígraf:* +DisjunctionMaxQuery((dc_title:*calligraf*^2.0 |
>> dc_title_unicode:cal·lígraf^3.0 | dc_title_unicode_full:cal·lígraf^3.0))
>> *cal·lígra?:* +DisjunctionMaxQuery((dc_title:*cal·lígra?*^2.0 |
>> dc_title_unicode:cal·lígra?^3.0 | dc_title_unicode_full:cal·lígra?^3.0))
>>
>> Should the second be "*calligra?*" instead?
>>
>> Environment:
>> Tomcat 7.0.25 (request encoding UTF-8)
>> Solr 3.5.0
>> Java 7 Oracle
>> Ubuntu 11.10
>>