Do you really need a custom similarity? Did you try setting the attribute omitTermFreqAndPositions="true" on your field?
It could be:

<field name="description" omitTermFreqAndPositions="true" type="text" indexed="true" stored="true" multiValued="false" omitNorms="true" />

http://wiki.apache.org/solr/SchemaXml

On Thu, Mar 21, 2013 at 7:35 AM, xavier jmlucjav <jmluc...@gmail.com> wrote:

> I have the following setup:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> <field name="description" type="text" indexed="true" stored="true" multiValued="false" omitNorms="true" />
>
> I index my corpus, and I can see tf is as usual, in this doc is 14 times in
> this field:
>
> 4.5094776 = (MATCH) weight(description:galaxy^10.0 in 440) [DefaultSimilarity], result of:
>   4.5094776 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
>     0.14165252 = queryWeight, product of:
>       10.0 = boost
>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>       0.0016648936 = queryNorm
>     31.834784 = fieldWeight in 440, product of:
>       3.7416575 = tf(freq=14.0), with freq of:
>         14.0 = termFreq=14.0
>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>       1.0 = fieldNorm(doc=440)
>
> Then I modify my schema:
>
> <similarity class="solr.SchemaSimilarityFactory"/>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <similarity class="com.customsolr.NoTfSimilarityFactory"/>
> </fieldType>
>
> I just want to disable term freq > 1, so a term its either present or not.
>
> public class NoTfSimilarity extends DefaultSimilarity {
>     public float tf(float freq) {
>         return freq > 0 ? 1.0f : 0.0f;
>     }
> }
>
> But I still see tf=14 in my query??
>
> 723.89526 = (MATCH) weight(description:galaxy^10.0 in 440) [], result of:
>   723.89526 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
>     85.08203 = queryWeight, product of:
>       10.0 = boost
>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>       1.0 = queryNorm
>     8.5082035 = fieldWeight in 440, product of:
>       1.0 = tf(freq=14.0), with freq of:
>         14.0 = termFreq=14.0
>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>       1.0 = fieldNorm(doc=440)
>
> anyone sees what I am missing?
> I am on solr4.0
>
> thanks
> xavier

--
Felipe Lahti
Consultant Developer - ThoughtWorks Porto Alegre
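
P.S. As a side check, the numbers in the quoted explain output can be reproduced by hand. The sketch below is plain Java with no Lucene dependency. It assumes the classic TF-IDF formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values shown in the first output. The class name TfSketch and its method names are made up for illustration and are not part of Solr or Lucene.

```java
public class TfSketch {

    // Assumed default term-frequency factor: square root of the raw frequency.
    static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Binary variant, same body as the quoted NoTfSimilarity.tf():
    // a term is either present (1.0) or absent (0.0).
    static float binaryTf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    // Assumed inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    static float idf(long docFreq, long maxDocs) {
        return (float) (Math.log(maxDocs / (double) (docFreq + 1)) + 1.0);
    }

    // fieldWeight as listed in the explain output: tf * idf * fieldNorm.
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        float idf = idf(30, 56511);   // docFreq and maxDocs from the quoted output
        float fieldNorm = 1.0f;       // norms are omitted on the field

        System.out.println("idf                     = " + idf);
        System.out.println("fieldWeight, default tf = "
                + fieldWeight(defaultTf(14f), idf, fieldNorm));
        System.out.println("fieldWeight, binary tf  = "
                + fieldWeight(binaryTf(14f), idf, fieldNorm));
    }
}
```

With the binary tf, fieldWeight reduces to idf * fieldNorm, which is the 8.5082035 reported in the second explain output. The "14.0 = termFreq=14.0" line there appears to be the raw frequency passed into tf(), while the "1.0 = tf(freq=14.0)" line above it is the value the custom similarity returned.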