I found a post
(http://lucene.472066.n3.nabble.com/Solr-4-3-Pivot-Performance-Issue-td4074617.html)
commenting that the pivot performance issue appeared after version 4.0.0. So I
ran my test on version 4.0.0 and found that pivoting did not suffer the
performance crash, and generally produced much better timings.

          |      Processing time in ms     |
Values    |  Combined|     Facet|     Pivot|
9         |       180|       300|        34|
100       |       163|       521|        30|
961       |       729|       666|        72|
10000     |       709|      1006|       659|
99856     |      1896|      2214|       719|
499849    |      2989|      4863|      1719|
999872    |      5552|      8113|      3856|
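
To put a number on the difference, here is a rough sketch that divides the pivot timings from my original run by the 4.0.0 timings above. The figures are copied from the two tables in this thread; the variable and function names are just mine for illustration, and the last row is not quite like-for-like (999867 values in the original run against 999872 here).

```python
# Pivot timings in ms, copied from the two tables in this thread.
values = [9, 100, 961, 10000, 99856, 499849, 999872]
pivot_original = [62, 52, 153, 940, 1960, 136423, 233725]  # original run
pivot_4_0_0 = [34, 30, 72, 659, 719, 1719, 3856]           # rerun on 4.0.0


def slowdown(original_ms, rerun_ms):
    """Return original/rerun timing ratios, row by row."""
    return [o / r for o, r in zip(original_ms, rerun_ms)]


ratios = slowdown(pivot_original, pivot_4_0_0)

for n, ratio in zip(values, ratios):
    print(f"{n:>7} values: pivot is {ratio:5.1f}x slower than on 4.0.0")
```

Up to about 100,000 values the ratio stays below 3, but for the two largest runs it jumps to roughly 79 and 61.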

Therefore I think something has definitely gone awry with pivots after 4.0.0.
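
In case anyone wants to reproduce this, the three approaches correspond roughly to the request parameters sketched below. This is only an illustration, not my exact test code: the field names (time_s, term_s, time_term_s) are made up, and rows=0 and facet.limit=-1 are assumptions about a sensible setup rather than something stated in my original message.

```python
from urllib.parse import urlencode

# Hypothetical field names, chosen only for this sketch.
TIME_FIELD = "time_s"
TERM_FIELD = "term_s"
COMBINED_FIELD = "time_term_s"  # time and term indexed together in one field

# Assumed common settings: return no documents, all facet values.
BASE = [("rows", "0"), ("facet", "true"), ("facet.limit", "-1")]


def combined_params():
    """1) Combined: one facet over the combined time+term field."""
    return [("q", "*:*")] + BASE + [("facet.field", COMBINED_FIELD)]


def facet_params(term):
    """2) Facet: one request per term, faceting on the time field."""
    # No escaping of special characters in `term` is attempted here.
    return [("q", f'{TERM_FIELD}:"{term}"')] + BASE + [("facet.field", TIME_FIELD)]


def pivot_params():
    """3) Pivot: a single pivot facet over time, then term."""
    return [("q", "*:*")] + BASE + [("facet.pivot", f"{TIME_FIELD},{TERM_FIELD}")]


if __name__ == "__main__":
    # Print the query strings that would be appended to the select handler URL.
    print(urlencode(combined_params()))
    print(urlencode(facet_params("example")))
    print(urlencode(pivot_params()))
```

Note that approach 2 issues one request per term, so its total time is the sum over all terms.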

N


> On 13 Nov 2014, at 13:49, Neil Ireson <n.ire...@sheffield.ac.uk> wrote:
> 
> Hi all,
> 
> I was running an experiment which involved counting terms by day, so I was 
> using pivot facets to get the counts. However, as the number of time and term 
> values increased, the performance got very poor. So I knocked up a quick 
> test, using a collection of 1 million documents with a different number of 
> random values, to compare different ways of getting the counts.
> 
> 1) Combined = combine the time and term in a single field and facet on it.
> 2) Facet = for each term, set the query to the term and then get the time 
> facet.
> 3) Pivot = get the pivot facet.
> 
> The results show that, as the number of values (i.e. number of terms * number 
> of times) increases, everything is fine until around 100,000 values and then 
> it goes pear-shaped for pivots, taking nearly 4 minutes for 1 million values. 
> The facet-based approach produces much more robust performance.
> 
>           |      Processing time in ms     |
> Values    |  Combined|     Facet|     Pivot|
> 9         |       144|       391|        62|
> 100       |       170|       501|        52|
> 961       |       789|      1014|       153|
> 10000     |       907|      1966|       940|
> 99856     |      1647|      3832|      1960|
> 499849    |      5445|      7573|    136423|
> 999867    |      9926|      8690|    233725|
> 
> 
> In the end I used the facet rather than the pivot approach, but I’d like to 
> know why pivots have such a catastrophic performance crash. Is this expected 
> behaviour for pivots or am I doing something wrong?
> 
> N
> 
