UnInvertedField vs FieldCache for facets for single-token text fields

Michael Ryan Thu, 03 Nov 2011 13:17:09 -0700

I have some fields I facet on that are TextFields but have just a single token.
The fieldType looks like this:


<fieldType name="myStringFieldType" class="solr.TextField" indexed="true"
    stored="false" omitNorms="true" sortMissingLast="true"
    positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

SimpleFacets uses an UnInvertedField for these fields because
multiValuedFieldCache() returns true for TextField. I tried changing the type
for these fields to the plain "string" type (StrField). The facets *seem* to be
generated much faster. Is it expected that FieldCache would be faster than
UnInvertedField for single-token strings like this?
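
For comparison, the plain "string" type is usually declared roughly as below.
This is a sketch based on the stock example schema; the exact attributes are an
assumption and may differ in a given schema.xml.

```xml
<!-- StrField is not analyzed: the whole field value is indexed as one term -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"
    omitNorms="true"/>
```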

My goal is to make the facet re-generation after a commit as fast as possible. I
would like to continue using TextField for these fields since I have a need for
filters like LowerCaseFilterFactory, which still produces a single token. Is it
safe to extend TextField and have multiValuedFieldCache() return false for these
fields, so that UnInvertedField is not used? Or is there a better way to
accomplish what I'm trying to do?
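
To make the use case concrete, a lowercasing variant of the type above might
look like the following. This is only a sketch: the name
"myLowercaseStringFieldType" is hypothetical, and the other attributes are
copied from the fieldType shown earlier.

```xml
<fieldType name="myLowercaseStringFieldType" class="solr.TextField" indexed="true"
    stored="false" omitNorms="true" sortMissingLast="true"
    positionIncrementGap="100">
  <analyzer>
    <!-- KeywordTokenizer emits the entire input as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercasing changes the token text but not the token count -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Each value still yields exactly one term per document, but because the class is
solr.TextField, it is still treated as multi-valued for faceting purposes.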

-Michael
