Hi Furkan,

Thanks for the suggestion. I always forget the most effective debugging tool: the Analysis page.
It turned out that "Jó" was a stop word and was eliminated during text analysis. I will create a new field type without stop word removal and use it like this:

<str name="suggestAnalyzerFieldType">short_text_hu_without_stop_removal</str>

Thanks again,
Roland

Furkan KAMACI <furkankam...@gmail.com> wrote (on Tue, Jul 30, 2019, 16:17):

> Hi Roland,
>
> Could you check the Analysis tab
> (https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell
> how the term is analyzed for both query and index?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland <szucs.rol...@bookandwalk.hu>
> wrote:
>
> > Hi All,
> >
> > I have an author suggester (search component and the related request
> > handler) defined in solrconfig:
> >
> > <searchComponent name="suggest" class="solr.SuggestComponent">
> >   <!-- All suggester component must have different filepath to avoid
> >        write lock issues-->>
> >   <lst name="suggester">
> >     <str name="name">author</str>
> >     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
> >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> >     <str name="field">BOOK_productAuthor</str>
> >     <str name="suggestAnalyzerFieldType">short_text_hu</str>
> >     <str name="indexPath">suggester_infix_author</str>
> >     <str name="buildOnStartup">false</str>
> >     <str name="buildOnCommit">false</str>
> >     <str name="minPrefixChars">2</str>
> >   </lst>
> > </searchComponent>
> >
> > <requestHandler name="/suggesthandler" class="solr.SearchHandler"
> >                 startup="lazy">
> >   <lst name="defaults">
> >     <str name="suggest">true</str>
> >     <str name="suggest.count">10</str>
> >     <str name="suggest.dictionary">author</str>
> >   </lst>
> >   <arr name="components">
> >     <str>suggest</str>
> >   </arr>
> > </requestHandler>
> >
> > The author field has only minimal text processing at query and index
> > time, based on the following definition:
> >
> > <fieldType name="short_text_hu" class="solr.TextField"
> >            positionIncrementGap="100" multiValued="true">
> >   <analyzer type="index">
> >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> >     <tokenizer class="solr.ClassicTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" words="stopwords_hu.txt"
> >             ignoreCase="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.ClassicTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" words="stopwords_hu.txt"
> >             ignoreCase="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> > <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> >            docValues="true"/>
> > <fieldType name="strings" class="solr.StrField" sortMissingLast="true"
> >            docValues="true" multiValued="true"/>
> > <fieldType name="text_ar" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.StopFilterFactory" words="lang/stopwords_ar.txt"
> >             ignoreCase="true"/>
> >     <filter class="solr.ArabicNormalizationFilterFactory"/>
> >     <filter class="solr.ArabicStemFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > When I use queries with only ASCII characters, the results are correct:
> >
> > "Al":{
> >   "term":"<b>Al</b>exandre Dumas", "weight":0, "payload":""}
> >
> > When I try it with a Hungarian author name containing a special
> > character, I get nothing:
> >
> > "Jó":"author":{
> >   "Jó":{ "numFound":0, "suggestions":[]}}
> >
> > When I try it with three letters, it works again:
> >
> > "Józ":"author":{
> >   "Józ":{ "numFound":10, "suggestions":[
> >     { "term":"Bajza <b>Józ</b>sef", "weight":0, "payload":""},
> >     { "term":"Eötvös <b>Józ</b>sef", "weight":0, "payload":""},
> >     { "term":"Eötvös <b>Józ</b>sef", "weight":0, "payload":""},
> >     { "term":"Eötvös <b>Józ</b>sef", "weight":0, "payload":""},
> >     { "term":"<b>Józ</b>sef Attila", "weight":0, "payload":""}..
> >
> > Any idea how it can happen that a longer string has more matches than
> > a shorter one? It is inconsistent. What can I do to fix it? It would
> > result in a poor customer experience: users would feel that sometimes
> > they need 2 and sometimes 3 characters to get suggestions.
> >
> > Thanks in advance,
> > Roland
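
For reference, here is a minimal sketch of the field type described in the reply at the top of this thread. It assumes the new type is simply a copy of short_text_hu with both StopFilterFactory lines removed. Only the name short_text_hu_without_stop_removal comes from the thread; the body is an assumption and has not been tested against this index.

```xml
<!-- Sketch only: short_text_hu from the quoted schema, minus stop word removal.
     All attributes are copied from the quoted definition; adjust as needed. -->
<fieldType name="short_text_hu_without_stop_removal" class="solr.TextField"
           positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <!-- No StopFilterFactory here, so short tokens such as "Jó" survive analysis -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the suggester in this thread sets buildOnStartup and buildOnCommit to false, the suggester index would most likely need to be rebuilt (for example with suggest.build=true on a /suggesthandler request) before the new analyzer takes effect.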