Yao Ge wrote:
The facet query is considerably slower compared to other facets from
structured database fields (with highly repeated values). What I found
interesting is that even after I constrained search results to just a
few hundred hits using other facets, these text facets are still very
slow.
I understand that text fields are not good candidates for faceting as
they can contain a very large number of unique values. However, why is it
still slow after my matching documents are reduced to hundreds? Is it
because the whole filter is cached (regardless the matching docs) and
I don't have enough filter cache size to fit the whole list?
Very interesting questions! I think an answer would both require and
further an understanding of how filters work, which might even lead to
a more general guideline on when and how to use filters and facets.
Even though faceting appears to have changed in 1.4 vs 1.3, it would
still be interesting to understand the 1.3 side of things.
Lastly, what I really want is to give users a chance to visualize
and filter on the top relevant words in the free-text fields. Are there
alternatives to the facet field approach? Term vectors? I can do
client-side processing based on the top N (say 100) hits for this, but it
is my last option.
Also a very interesting data mining question! I'm sorry I don't have any
answers for you. Maybe someone else does.
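
For what it's worth, the client-side fallback you mention could look
roughly like the sketch below. It is only an illustration in plain
Python and does not touch Solr at all: it assumes you have already
fetched the free-text field of the top N hits into a list of strings.
The function name top_terms, the tiny stopword list and the tokenizing
regex are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real one would be larger.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for"}

def top_terms(texts, limit=10):
    """Return the `limit` terms that occur in the most texts.

    `texts` is assumed to hold the free-text field values of the
    top N hits, already fetched by the client.
    """
    counts = Counter()
    for text in texts:
        # Lowercase, split on non-alphanumerics, and count each term
        # once per document (document frequency, not term frequency).
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        counts.update(tokens - STOPWORDS)
    return counts.most_common(limit)
```

Counting each term once per document gives a facet-like number ("how
many of these hits contain the word"), which seemed closer to what you
describe than raw term frequency.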
Best,
Michael Ludwig