HoustonPutman opened a new issue, #13664:
URL: https://github.com/apache/lucene/issues/13664

   ### Description
   
   I ran a benchmark (admittedly using Solr, not Lucene) that compares the 
speed of various sorted queries. The fields mentioned in the benchmark are the 
fields that were sorted on.
   
   Benchmark parameters:
   - 1M documents
     - High cardinality fields had a unique value per document (an incrementing 
counter)
     - Low cardinality fields had 500 unique values
     - All fields were single-valued
   - Three field types were tested: `LongPointField`, `StrField` and 
`TrieLongField` (no longer in Lucene)
     - All sort fields were tested using both DocValues and Uninversion.
   - Queries
     - 10 results were requested, for both grouped and non-grouped
     - Each combination of field type and docValues/Uninversion was run in 8 
configurations, a matrix of:
       - Grouping and Non-grouping
       - Sort `asc` and `desc`
       - High Cardinality values and Low Cardinality values
     - The grouping:
       - All queries were grouped by the same string field (250,000 unique 
values, docValues enabled)
       - The sort field was used for both the group sorting and the 
document sorting within the groups
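
For readers who want to reproduce the matrix above, here is a minimal sketch of how the request parameters could be enumerated. The field names (`group_s` and the `<type>_<source>_<cardinality>` naming scheme) are hypothetical, since the actual schema is not given here; the parameter names `sort`, `rows`, `group`, `group.field` and `group.sort` are standard Solr request parameters.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical name for the string field used for grouping.
GROUP_FIELD = "group_s"

FIELD_TYPES = ["LongPointField", "StrField", "TrieLongField"]
SOURCES = ["docValues", "uninverted"]
CARDINALITIES = ["high", "low"]
DIRECTIONS = ["asc", "desc"]


def sort_field(field_type, source, cardinality):
    """Build a hypothetical sort-field name for one combination."""
    return f"{field_type}_{source}_{cardinality}"


def build_params(field, direction, grouped):
    """Return Solr request parameters for one benchmark query."""
    sort = f"{field} {direction}"
    params = {"q": "*:*", "rows": 10, "sort": sort}
    if grouped:
        # The same sort is used to order the groups ("sort") and to
        # order the documents within each group ("group.sort").
        params.update(
            {"group": "true", "group.field": GROUP_FIELD, "group.sort": sort}
        )
    return params


# 3 field types x 2 sources x (2 cardinalities x 2 directions x 2 grouping modes)
REQUESTS = [
    build_params(sort_field(ft, src, card), direction, grouped)
    for ft, src, card, direction, grouped in product(
        FIELD_TYPES, SOURCES, CARDINALITIES, DIRECTIONS, [False, True]
    )
]

if __name__ == "__main__":
    for params in REQUESTS:
        print(urlencode(params))
```

Each field type and docValues/Uninversion combination yields 8 requests, so the full matrix is 48 queries.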
   
   <img width="1113" alt="Screenshot 2024-08-16 at 12 12 18 PM" 
src="https://github.com/user-attachments/assets/75868683-3fe8-4745-b0a9-1501e7af8f28">
   
   Overall there are some interesting findings:
   
   - **For un-grouped queries (sorting by a high cardinality field), ascending 
sorts were 5x faster than descending sorts. For grouped queries, they were 40x 
faster.** That is ultimately what this issue is meant to address, so I 
highlighted it in the chart above.
     - Note: Low cardinality fields showed no difference in speed between ascending and descending sorts
   - DocValues and Uninverted fields have similar sorting performance for 
grouped queries. (This is likely related to 
https://github.com/apache/lucene/issues/10368)
   
   I know that since this benchmark uses Solr, it is of limited use here. I can 
try to recreate it with lucenebench as well, if that would help.
   
   I also have flame graphs for these benchmarks. They aren't as useful as 
a screenshot, but I will provide one anyway (LongPointField, docValues, 
highCardinality, grouped, sorted descending):
   <img width="2221" alt="Screenshot 2024-08-16 at 12 05 02 PM" 
src="https://github.com/user-attachments/assets/0bba8cff-1a47-436d-9a53-3e52d31376ed">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

