Hi xavier,

Have you set the global similarity to solr.SchemaSimilarityFactory?
See <http://wiki.apache.org/solr/SchemaXml#Similarity>.

Steve

On Mar 21, 2013, at 9:44 AM, xavier jmlucjav <jmluc...@gmail.com> wrote:

> Hi Felipe,
>
> I need to keep positions, that is why I cannot just use
> omitTermFreqAndPositions
>
>
> On Thu, Mar 21, 2013 at 2:36 PM, Felipe Lahti <fla...@thoughtworks.com> wrote:
>
>> Do you really need a custom similarity?
>> Did you try to put the attribute "omitTermFreqAndPositions" in your field?
>>
>> It could be:
>>
>> <field name="description" omitTermFreqAndPositions="true" type="text"
>>        indexed="true" stored="true" multiValued="false" omitNorms="true" />
>>
>> http://wiki.apache.org/solr/SchemaXml
>>
>>
>> On Thu, Mar 21, 2013 at 7:35 AM, xavier jmlucjav <jmluc...@gmail.com>
>> wrote:
>>
>>> I have the following setup:
>>>
>>> <fieldType name="text" class="solr.TextField"
>>>            positionIncrementGap="100">
>>>   <analyzer>
>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>   </analyzer>
>>> </fieldType>
>>> <field name="description" type="text" indexed="true"
>>>        stored="true" multiValued="false" omitNorms="true" />
>>>
>>> I index my corpus, and I can see tf is as usual; in this doc the term
>>> occurs 14 times in this field:
>>>
>>> 4.5094776 = (MATCH) weight(description:galaxy^10.0 in 440) [DefaultSimilarity], result of:
>>>   4.5094776 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
>>>     0.14165252 = queryWeight, product of:
>>>       10.0 = boost
>>>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>>>       0.0016648936 = queryNorm
>>>     31.834784 = fieldWeight in 440, product of:
>>>       3.7416575 = tf(freq=14.0), with freq of:
>>>         14.0 = termFreq=14.0
>>>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>>>       1.0 = fieldNorm(doc=440)
>>>
>>>
>>> Then I modify my schema:
>>>
>>> <similarity class="solr.SchemaSimilarityFactory"/>
>>> <fieldType name="text" class="solr.TextField"
>>>            positionIncrementGap="100">
>>>   <analyzer>
>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>   </analyzer>
>>>   <similarity class="com.customsolr.NoTfSimilarityFactory"/>
>>> </fieldType>
>>>
>>> I just want to disable term freq > 1, so a term is either present or
>>> not.
>>>
>>> public class NoTfSimilarity extends DefaultSimilarity {
>>>     public float tf(float freq) {
>>>         return freq > 0 ? 1.0f : 0.0f;
>>>     }
>>> }
>>>
>>> But I still see tf=14 in my query??
>>>
>>> 723.89526 = (MATCH) weight(description:galaxy^10.0 in 440) [], result of:
>>>   723.89526 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
>>>     85.08203 = queryWeight, product of:
>>>       10.0 = boost
>>>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>>>       1.0 = queryNorm
>>>     8.5082035 = fieldWeight in 440, product of:
>>>       1.0 = tf(freq=14.0), with freq of:
>>>         14.0 = termFreq=14.0
>>>       8.5082035 = idf(docFreq=30, maxDocs=56511)
>>>       1.0 = fieldNorm(doc=440)
>>>
>>> Does anyone see what I am missing?
>>> I am on Solr 4.0
>>>
>>> thanks
>>> xavier
>>>
>>
>>
>>
>> --
>> Felipe Lahti
>> Consultant Developer - ThoughtWorks Porto Alegre
>>
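
A note on reading the two explain outputs quoted above: in both, fieldWeight is listed as the product of tf, idf and fieldNorm, and the "termFreq=14.0" line is only the raw count that is passed into tf(). The following is a minimal sketch of that arithmetic, assuming the classic DefaultSimilarity tf of sqrt(freq). The class ExplainCheck and its method names are hypothetical and are not part of Lucene or Solr; the numbers are copied from the explain output.

```java
public class ExplainCheck {

    // Assumed DefaultSimilarity behaviour: tf is the square root of the raw frequency.
    static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // The override quoted in the thread: a term is either present or not.
    static float noTf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    // fieldWeight as listed in the explain output: tf * idf * fieldNorm.
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        float freq = 14.0f;       // termFreq from the explain output
        float idf = 8.5082035f;   // idf(docFreq=30, maxDocs=56511)
        float fieldNorm = 1.0f;   // omitNorms="true"

        // Compare with "31.834784 = fieldWeight" in the first output.
        System.out.println("default fieldWeight: "
                + fieldWeight(defaultTf(freq), idf, fieldNorm));

        // Compare with "8.5082035 = fieldWeight" in the second output.
        System.out.println("no-tf fieldWeight:   "
                + fieldWeight(noTf(freq), idf, fieldNorm));
    }
}
```

Under these assumptions the second output's "1.0 = tf(freq=14.0)" line is what the override would produce for a raw frequency of 14, so the number to compare between the two runs is the tf line rather than the termFreq line beneath it.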