I don't suppose it's something silly like the fact that your indexing chain includes 'words="stopwords.txt"', and your query chain does not?
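To make that concrete, here is a minimal sketch of the query analyzer with the same stop-word list as the index analyzer. It assumes the same stopwords.txt is meant for both chains. Everything else is copied unchanged from your fieldType, including minGramSize and maxGramSize, which as far as I know CommonGramsFilterFactory does not read. Treat it as an untested illustration rather than a verified config.

```xml
<!-- Query-side analyzer mirroring the index-side one. -->
<!-- Assumption: the same stopwords.txt is intended for both chains. -->
<analyzer type="query">
  <tokenizer class="solr.LowerCaseTokenizerFactory" />
  <!-- words="stopwords.txt" is the only attribute added here; -->
  <!-- the gram size attributes are copied as-is from the original post. -->
  <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
          ignoreCase="true" maxGramSize="8" minGramSize="1"/>
</analyzer>
```

As I understand it, CommonGramsFilterFactory only pairs stop words with their neighbours, which would explain tokens such as the_cat and and_it. If you want every 1 to 8 word gram, a chain built on solr.ShingleFilterFactory may be closer to what you are after, but I have not tested that against your data.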
Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com

> -----Original Message-----
> From: openvictor Open [mailto:openvic...@gmail.com]
> Sent: Thursday, February 03, 2011 12:02 AM
> To: solr-user@lucene.apache.org
> Subject: Using terms and N-gram
>
> Dear all,
>
> I am trying to implement an autocomplete system for research. But I am
> stuck on some problems that I can't solve.
>
> Here is my problem :
> I give text like :
> "the cat is black" and I want to explore all 1 gram to 8 gram for all
> the text that are passed :
> the, cat, is, black, the cat, cat is, is black, etc...
>
> In order to do that I have defined the following fieldtype in my schema :
>
> <!--Custom fieldtype-->
> <fieldType name="ngram_field" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.LowerCaseTokenizerFactory" />
>     <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
>             ignoreCase="true" maxGramSize="8" minGramSize="1"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.LowerCaseTokenizerFactory" />
>     <filter class="solr.CommonGramsFilterFactory" ignoreCase="true"
>             maxGramSize="8" minGramSize="1"/>
>   </analyzer>
> </fieldType>
>
> Then the following field :
>
> <field name="p_title_ngram" type="ngram_field" indexed="true"
>        stored="true"/>
>
> Then I feed solr with some phrases and I was really surprised to see
> that Solr didn't behave as expected.
> I went to the schema browser to see the result for the very profound
> query :
> "the cat is black and it rains"
>
> The results are quite deceiving : first 1 grams are not found. some 2
> grams are found like : the_cat, "and_it" etc... But not what I expected.
> Is there something I am missing here ? (by the way I also tried to
> remove the mingramsize and maxgramsize even the words).
>
> Thank you,
> Victor Kabdebon