I thought I had asked a variation of this before, but I don't see it on
the list. Apologies if this is a duplicate, but I have new questions.

I need to find the min and max values of a field across a result set,
which can be several million documents. One way to do this is the
StatsComponent. The problem is that StatsComponent performs poorly
across that many documents: adding the stats component on the field I'm
interested in adds 10s to my query response time.
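
For concreteness, the kind of request I mean looks roughly like the sketch below. The host, the query, and the field name `year` are placeholders for illustration, not my real setup; `stats=true` and `stats.field` are the standard StatsComponent request parameters.

```python
from urllib.parse import urlencode


def build_stats_query(base_url, query, field):
    """Build a Solr select URL that turns on StatsComponent for one field."""
    params = [
        ("q", query),            # the normal query (placeholder here)
        ("stats", "true"),       # enable StatsComponent
        ("stats.field", field),  # field to compute statistics over
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and field name, for illustration only.
url = build_stats_query("http://localhost:8983/solr/select", "*:*", "year")
print(url)
```

The min and max come back in the stats section of the response, alongside the other statistics for that field.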

So one question is whether there's any way to improve StatsComponent
performance. Does it use any caches, or does it operate without caches?
My Solr is running near the top of its heap size. I'm not currently
getting any OOM errors, but perhaps the lack of free memory is
somehow hurting StatsComponent performance. Are there any other ideas
for speeding up StatsComponent?

It also occurs to me that the StatsComponent is doing a lot more
than I need. I just need min/max, and the cardinality of this field is a
couple of orders of magnitude lower than the total number of documents.
StatsComponent also computes a bunch of other things, like sum, mean,
etc. Perhaps if there were a way to _just_ get min/max, it would be
faster. Is there any way to get the min/max values in a result set other
than StatsComponent?

Jonathan