On 3/1/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 3/1/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Faceting is much happier if you use a single valued field, but my apps
> all require multivalued fields:

If by "happy" you mean performance, things should be better in the
future though.


Yes, performance.  The docs seem to say "avoid faceting on
multiValued fields if possible"

With SOLR-153, do you think that won't be an issue anymore?


>
> I'd like to use copyField to accumulate the multivalued fields into a
> single field that can be efficiently faceted.

Not sure I understand...  you don't want counts for aaa, bbb, and ccc
separately, but you want counts for the combined values "aaa;bbb;ccc"?

I'm not sure I see the use cases for this.


Maybe it's clearer if I say:

<arr name="subject">
 <str>San Francisco</str>
 <str>San Diego</str>
 <str>DC</str>
</arr>

I want facets for "San Francisco", "San Diego" and "DC", not "san",
"francisco", "diego", "dc".  I want the faceting to be as efficient as
it could/should be.  If I search for "San Fran" (or San Leandro) this
doc should show up.

I was suggesting using copyField to accumulate the cities into a
single field used for faceting:
 tokens[] = "San Francisco; San Diego; DC".split( ";" )
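
To spell that idea out, here is a minimal Java sketch of the split.
The class and method names are hypothetical (nothing here is Solr
API); it only illustrates that each delimited value stays one whole
facet token instead of being broken into words:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: turns one delimited string
// into whole-value tokens suitable for faceting.
class FacetTokens {

    // Split on the literal delimiter, trim surrounding whitespace,
    // and drop empty pieces (e.g. from a trailing delimiter).
    static List<String> split(String combined, String delimiter) {
        List<String> tokens = new ArrayList<>();
        for (String part : combined.split(Pattern.quote(delimiter))) {
            String token = part.trim();
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Each city is kept as a single token, spaces included.
        for (String token : split("San Francisco; San Diego; DC", ";")) {
            System.out.println(token);
        }
    }
}
```

The point is only that "San Francisco" survives as one value; how (or
whether) copyField could do this is exactly what I'm asking.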

In my current setup, I have:

<field name="subject" type="string" indexed="true" stored="true"
multiValued="true"/>
<field name="subject_txt" type="text" indexed="true" stored="false"
multiValued="true"/>
<copyField source="subject" dest="subject_txt"  />

I facet on the multivalued field "subject" and search on the text
field "subject_txt" -- "subject" is typed as a "string" so that the
indexed tokens match the input exactly, and "subject_txt" is tokenized
for search.  If I have to go through the overhead of copyField to make
search and faceting work nicely together, it may as well be configured
to be as efficient as possible.  Should I ignore the problem for now,
and bank on SOLR-153?
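
For reference, this is roughly how a request against that setup is
put together: search the tokenized field, facet on the string field.
A sketch in Java, where the class name and the sample query are made
up for illustration; only the q, facet and facet.field parameters and
the two field names come from the setup above:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr: builds the query-string part
// of a select request that searches one field and facets on another.
class FacetQuery {

    // q targets the tokenized search field; facet.field names the
    // untokenized string field whose values are counted.
    static String build(String query, String facetField) {
        return "q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
             + "&facet=true"
             + "&facet.field=" + URLEncoder.encode(facetField, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Example query value is an assumption, chosen for illustration.
        System.out.println(build("subject_txt:san", "subject"));
    }
}
```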

Am I missing something?

thanks
ryan
