Chris Hostetter wrote:
: to make it clear, i agree that it doesn't make sense faceting on all
: available fields, I only want faceting on those 300 attributes that are
: stored together with the fields for full text searches. A
: product/document has typically only 5-10 attributes.
:
: I like to decide at index time which attributes of a product might be of
: interest for faceting and store those in dynamic fields with the
: attribute-name and some kind of prefix or suffix to identify them at
: query time as facet.fields. Exactly the naming convention you mentioned.

but if the facet fields are different for every document, and they use a
simple dynamicField prefix (like "facet_*" for example) how do you know at
query time which fields to facet on? ... even if wildcards work in
facet.field, using facet.field=facet_* would require solr to compute the
counts for *every* field matching that pattern to find out which ones have
positive counts for the current result set -- there may only be 5 that
actually matter, but it's got to try all 300 of them to find out which 5
that is.

I just made a quick test by building a facet query with those 300
attributes. I realized that the facets are built out of the whole index, not
the subset returned by the initial query. Therefore I have a large number of
empty facets, which I simply ignore. In my case the QueryTime is somewhat
higher (of course), but it is still only a few milliseconds. (wow!!!) :o)
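
A minimal sketch of such a test request, for illustration only. It assumes the attributes live in dynamic fields with a "facet_" prefix, that the Solr version in use supports facet.mincount, and that the response uses Solr's flat JSON layout for facet counts (term, count, term, count, ...). The function names and attribute names are made up for the example.

```python
from urllib.parse import urlencode


def build_facet_query(q, attribute_names, prefix="facet_"):
    """Build a Solr query string with one facet.field per attribute.

    The "facet_" prefix is an assumed naming convention, not a Solr default.
    """
    params = [
        ("q", q),
        ("facet", "true"),
        # Ask Solr to leave out terms with a count of zero.
        ("facet.mincount", "1"),
    ]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", prefix + name) for name in attribute_names]
    return urlencode(params)


def non_empty_facets(facet_fields):
    """Drop facet fields that have no term with a positive count.

    Expects {field: [term, count, term, count, ...]}, which is assumed
    to be the layout of facet_counts/facet_fields in the JSON response.
    """
    result = {}
    for field, flat in facet_fields.items():
        terms = flat[0::2]
        counts = flat[1::2]
        positive = {t: c for t, c in zip(terms, counts) if c > 0}
        if positive:
            result[field] = positive
    return result
```

With 300 attribute names the query string simply carries 300 facet.field parameters, and non_empty_facets does on the client side what "simply ignore" means above.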

So at this stage of my investigation, and in my use case, I don't have to
worry about performance, even if I use the system in a way that uses more
resources than necessary.

this is where custom request handlers that understand that faceting
"metadata" for your documents becomes key ... so you can say "when
querying across the entire collection, only try to facet on category and
manufacturer.  if the search is constrained by category, then lookup other
facet options to offer based on that category name from our metadata
store, etc..."

Faceting on manufacturers and categories first and then presenting the
corresponding facets might be used under some circumstances, but in my case
the category structure is quite deep, detailed and complex. So when
the user enters a query, I would like to say: "Look, here are the
manufacturers and categories with matches to your query, choose one if you
want, but maybe there is another one with products that better fit your
needs or products that you didn't even know about. So maybe you like to
filter based on the following attributes." Something like this ;o)
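
For comparison, the staged approach suggested above (facet on category and manufacturer first, then look up further facets per category) could be sketched roughly like this. The metadata store, the category names and the field names are all hypothetical; a real custom request handler would do the equivalent inside Solr rather than on the client.

```python
from urllib.parse import urlencode

# Hypothetical metadata store: category name -> facet fields worth offering.
CATEGORY_FACETS = {
    "cameras": ["facet_resolution", "facet_zoom"],
    "laptops": ["facet_ram", "facet_screen_size"],
}

# Facets offered when searching across the entire collection.
DEFAULT_FACETS = ["category", "manufacturer"]


def choose_facet_fields(category=None):
    """Return the facet fields to request for an optional category filter."""
    if category is None:
        return list(DEFAULT_FACETS)
    # Unknown categories fall back to the default facets only.
    return DEFAULT_FACETS + CATEGORY_FACETS.get(category, [])


def build_query(q, category=None):
    """Build a Solr query string, constrained by category if one is given."""
    params = [("q", q), ("facet", "true")]
    if category is not None:
        # Filter query restricting the result set to one category.
        params.append(("fq", 'category:"%s"' % category))
    params += [("facet.field", f) for f in choose_facet_fields(category)]
    return urlencode(params)
```

The trade-off is the one discussed in this thread: this needs only a handful of facet.field parameters per request, but it requires maintaining the category metadata up front.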

The point is that I currently don't want to know too much about the data,
I just want to feed it into solr, follow some conventions and get the most
out of it as quickly as possible. Optimizations can and will take place at
a later time.

I hope to find some time to dig into solr SimpleFacets this weekend.

Regards,

Tom
