On 5/24/2010 6:30 AM, Sascha Szott wrote:
Hi folks,
is it possible to sort by field length without having to (redundantly)
save the length information in a separate index field? At first, I
thought to accomplish this using a function query, but I couldn't find
an appropriate one.
I have a slightly different need related to this, though it may turn out
that what Sascha wants is similar. I would like to understand my data
better so I can improve my schema. I need to do some data mining that
is (to my knowledge) difficult or impossible with the source database.
Performance is irrelevant, as long as it finishes eventually.
Completing in less than an hour would be nice.
I would do this on a test system with much lower performance and memory
(4GB) than my production servers, as a single index instead of multiple
shards. When it finishes building, the entire test index is likely to
be about 75GB.
What I'm after is output that looks very much like faceting, but with
document counts grouped by field length (for a simple string field) and
by number of terms (for a tokenized field) instead of by field value.
Can Solr do that, and if so, what do I need to have
enabled in the schema to get it? Would branch_3x be enough, or would
trunk be better?
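
To make the desired output concrete, here is a minimal client-side sketch in Python of the kind of counting I mean. It is not a Solr feature or API: the sample documents, the field names ("title", "body"), and the whitespace split standing in for a real analyzer are all assumptions for illustration only.

```python
from collections import Counter


def length_histogram(docs, field):
    """Count documents by the character length of a plain string field."""
    return Counter(len(doc[field]) for doc in docs if field in doc)


def term_count_histogram(docs, field):
    """Count documents by number of terms in a field.

    Whitespace splitting is only a stand-in for a real analyzer,
    so counts may differ from what a tokenized index field holds.
    """
    return Counter(len(doc[field].split()) for doc in docs if field in doc)


# Hypothetical sample documents; the field names are made up.
docs = [
    {"id": "1", "title": "abc", "body": "one two three"},
    {"id": "2", "title": "abcd", "body": "one two"},
    {"id": "3", "title": "xyz", "body": "four five six"},
]

# Facet-like output: one line per distinct length, with its document count.
for length, count in sorted(length_histogram(docs, "title").items()):
    print(f"title length {length}: {count} docs")

for terms, count in sorted(term_count_histogram(docs, "body").items()):
    print(f"body terms {terms}: {count} docs")
```

Doing this outside Solr would mean pulling every document across the wire, which is why I would rather get the counts from the index itself if that is possible.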
Thanks,
Shawn