On 7/16/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : > ...but i don't understand why both checking isTokenized() ... shouldn't
> : > multiValued() be enough?
> :
> : A field could return "false" for multiValued() and still have multiple
> : tokens per document for that field.
>
> ah .. right ... sorry: multiValued() indicates whether multiple discrete
> values can be added to the field (and stored if the field is stored) but
> says nothing about what the Analyzer may do with any single value.
>
> perhaps we should really have an [f.foo.]facet.field.type=(single|multi)
> param to let clients indicate when they know exactly which method they
> want used (getFacetTermEnumCounts vs getFieldCacheCounts) ... if the
> property is not set, the default can be determined using the
> "sf.multiValued() || ft.isTokenized() || ft instanceof BoolField" logic.

Or a method FieldType.multiToken(), and a new method
TokenizerFactory/TokenFilterFactory.multiToken() that can be used to
determine this when the FieldType was created (grrr, too bad they weren't
abstract classes).

Or a new attribute in the schema (but I don't like that solution much).

But allowing the user to select the strategy has some merit, esp since
there will be an additional way to find the top "n" when I get around to
finishing my facet-tree-index code.

-Yonik
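
For reference, here is a rough sketch of how the proposed override could sit on top of the existing default. It is not Solr code: FacetMethodChooser, FieldInfo and Method are hypothetical stand-ins, the three booleans stand for sf.multiValued(), ft.isTokenized() and "ft instanceof BoolField", and the mapping of "multi" to the term-enum method and "single" to the FieldCache method is my assumption about what the param would mean.

```java
import java.util.Locale;

public class FacetMethodChooser {

    /** The two counting strategies named in the thread. */
    public enum Method { TERM_ENUM, FIELD_CACHE }

    /** Hypothetical stand-in for the schema properties consulted by the default logic. */
    public static final class FieldInfo {
        final boolean multiValued; // stands for sf.multiValued()
        final boolean tokenized;   // stands for ft.isTokenized()
        final boolean boolField;   // stands for "ft instanceof BoolField"

        public FieldInfo(boolean multiValued, boolean tokenized, boolean boolField) {
            this.multiValued = multiValued;
            this.tokenized = tokenized;
            this.boolField = boolField;
        }
    }

    /**
     * Picks a strategy. typeParam is the value of the proposed
     * [f.foo.]facet.field.type param, or null when the client did not set it.
     */
    public static Method choose(String typeParam, FieldInfo f) {
        if (typeParam != null) {
            // An explicit client choice wins over anything the schema says.
            switch (typeParam.trim().toLowerCase(Locale.ROOT)) {
                case "multi":
                    return Method.TERM_ENUM;   // assumed mapping
                case "single":
                    return Method.FIELD_CACHE; // assumed mapping
                default:
                    throw new IllegalArgumentException(
                        "facet.field.type must be 'single' or 'multi': " + typeParam);
            }
        }
        // Param not set: fall back to the default quoted in the thread.
        boolean multi = f.multiValued || f.tokenized || f.boolField;
        return multi ? Method.TERM_ENUM : Method.FIELD_CACHE;
    }

    public static void main(String[] args) {
        FieldInfo tokenizedText = new FieldInfo(false, true, false);
        FieldInfo plainString = new FieldInfo(false, false, false);

        System.out.println(choose(null, tokenizedText));     // default logic
        System.out.println(choose(null, plainString));       // default logic
        System.out.println(choose("single", tokenizedText)); // client override
    }
}
```

The point of the override is the last call: a client that knows a tokenized field really yields one token per document can ask for the single-valued method even though the schema-based default would not pick it.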
