In schema.xml, at the very bottom, you should see:

  <!--
  <similarity class="com.example.solr.CustomSimilarityFactory">
    <str name="paramkey">param value</str>
  </similarity>
  -->

I believe creating the Factory wrapper is pretty simple.  See 
http://wiki.apache.org/solr/SolrPlugins

On Feb 4, 2009, at 7:29 PM, Jonah Schwartz wrote:

We want to configure Solr so that fields are indexed with a maximum term
frequency and a minimum document length. If a term appears more than N
times in a field, it will be considered to have appeared only N times. If
a document's length is under M terms, it will be considered to have
exactly M terms. We have done this in the past in raw Lucene by writing a
Similarity class like this:

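import org.apache.lucene.search.DefaultSimilarity;

public class LimitingSimilarity extends DefaultSimilarity {
  // minNumTerms (M) and maxTermFrequency (N) are fields of this class;
  // their declarations are not shown here.

  // Treat any field shorter than minNumTerms as having exactly minNumTerms.
  public float lengthNorm(String fieldName, int numTerms) {
    return super.lengthNorm(fieldName, Math.max(minNumTerms, numTerms));
  }

  // Cap the term frequency at maxTermFrequency before scoring.
  public float tf(float freq) {
    freq = Math.min(maxTermFrequency, freq);
    return super.tf(freq);
  }
}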
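
To make the effect of the two limits concrete, here is a small standalone
sketch that does not depend on Lucene. It assumes the DefaultSimilarity
formulas tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), and the values
chosen for N and M (10 and 50) are hypothetical, picked only for
illustration. The class name LimitingDemo is likewise made up.

```java
public class LimitingDemo {
    // Hypothetical limits, for illustration only (N and M in the text above).
    static final float MAX_TERM_FREQUENCY = 10f;
    static final int MIN_NUM_TERMS = 50;

    // Assumes DefaultSimilarity's tf(freq) = sqrt(freq);
    // the frequency is capped at N before the square root is taken.
    static float tf(float freq) {
        return (float) Math.sqrt(Math.min(MAX_TERM_FREQUENCY, freq));
    }

    // Assumes DefaultSimilarity's lengthNorm = 1 / sqrt(numTerms);
    // the length is raised to at least M before the norm is computed.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(Math.max(MIN_NUM_TERMS, numTerms)));
    }

    public static void main(String[] args) {
        // Frequencies above N score the same as N itself.
        System.out.println(tf(3f) + " " + tf(10f) + " " + tf(500f));
        // Fields shorter than M get the same norm as a field of length M.
        System.out.println(lengthNorm(5) + " " + lengthNorm(50) + " " + lengthNorm(200));
    }
}
```

With these limits, a term repeated 500 times contributes no more than one
repeated 10 times, and a 5-term field gets no more length boost than a
50-term field.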


Is there a better way to do this within Solr configuration files?

Thanks,
Jonah

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










