Dear all,

I am trying to implement an autocomplete system for a research project, but I am stuck on a problem that I can't solve.
Here is my problem: I give Solr text such as "the cat is black", and I want to index every word 1-gram to 8-gram of each text that is passed in: the, cat, is, black, the cat, cat is, is black, etc.

To do that, I have defined the following fieldtype in my schema:

    <!--Custom fieldtype-->
    <fieldType name="ngram_field" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.LowerCaseTokenizerFactory" />
        <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true" maxGramSize="8" minGramSize="1"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.LowerCaseTokenizerFactory" />
        <filter class="solr.CommonGramsFilterFactory" ignoreCase="true" maxGramSize="8" minGramSize="1"/>
      </analyzer>
    </fieldType>

Then the following field:

    <field name="p_title_ngram" type="ngram_field" indexed="true" stored="true"/>

Then I fed Solr some phrases, and I was really surprised to see that it didn't behave as expected. I went to the schema browser to see the result for the very profound query: "the cat is black and it rains"

The results are quite disappointing. First, no 1-grams are found. Some 2-grams are found, such as the_cat, and_it, etc., but that is not what I expected.

Is there something I am missing here? (By the way, I also tried removing minGramSize and maxGramSize, and even the words attribute.)

Thank you,
Victor Kabdebon
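
P.S. To make the expected output concrete, here is a minimal sketch in plain Python of the terms I would like to end up with in the index. It is an illustration only: `word_ngrams` is a hypothetical helper, not Solr code, and it assumes simple lowercasing and whitespace splitting instead of the real analyzer chain.

```python
def word_ngrams(text, min_n=1, max_n=8):
    """Return every contiguous word n-gram of `text`, shortest first.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase + whitespace split, which only approximates the tokenizer
    configured in the schema.
    """
    words = text.lower().split()
    grams = []
    for n in range(min_n, max_n + 1):
        # For n larger than the number of words this range is empty.
        for start in range(len(words) - n + 1):
            grams.append(" ".join(words[start:start + n]))
    return grams


if __name__ == "__main__":
    for gram in word_ngrams("the cat is black"):
        print(gram)
```

For "the cat is black" this lists the four single words, then the three 2-grams, the two 3-grams, and finally the whole phrase. That is the set of terms I was hoping to see in the schema browser.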