iverase commented on PR #979: URL: https://github.com/apache/lucene/pull/979#issuecomment-1168539694
I did a quick check of this patch by indexing 50 million documents into a sorted index. Each document contains only a SortedDocValues field with a 10-byte term. I measured the index size and the time to retrieve the first document per term at different cardinalities. The results look like this:

Cardinality ~1000
```
                        | without patch       | with patch
Index Size (MB)         | 2.800084114074707   | 2.8039379119873047
average advanceOrd (ms) | 0.39255053534999995 | 0.0011012437999999999
```

Cardinality ~10000
```
                        | without patch       | with patch
Index Size (MB)         | 16.125946044921875  | 16.164132118225098
average advanceOrd (ms) | 0.52939177705       | 0.01008831655
```

Cardinality ~10000
```
                        | without patch       | with patch
Index Size (MB)         | 49.320682525634766  | 49.57721138000488
average advanceOrd (ms) | 0.5479114709999999  | 0.03804306865
```

Cardinality ~50000
```
                        | without patch       | with patch
Index Size (MB)         | 52.81498718261719   | 53.66002082824707
average advanceOrd (ms) | 0.6515335270999999  | 0.06898821255000001
```

The new jump table is tiny compared to the size of the doc values (the index grows by less than 2% in every run), while the new way of navigating is roughly an order of magnitude faster or more (from about 9x at cardinality ~50000 to over 350x at ~1000).
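
For reference, the speedup and the size overhead can be derived directly from the figures reported above. The following is only a small Python sketch of that arithmetic; the names `results`, `speedup`, and `size_overhead_pct` are illustrative and not part of the patch, and the cardinality labels are copied as they appear in the report.

```python
# Reported measurements, copied from the tables above:
# (cardinality label, size without patch (MB), size with patch (MB),
#  average advanceOrd without patch (ms), average advanceOrd with patch (ms))
results = [
    ("~1000", 2.800084114074707, 2.8039379119873047,
     0.39255053534999995, 0.0011012437999999999),
    ("~10000", 16.125946044921875, 16.164132118225098,
     0.52939177705, 0.01008831655),
    ("~10000", 49.320682525634766, 49.57721138000488,
     0.5479114709999999, 0.03804306865),
    ("~50000", 52.81498718261719, 53.66002082824707,
     0.6515335270999999, 0.06898821255000001),
]


def speedup(ms_without, ms_with):
    """How many times faster advanceOrd is with the patch."""
    return ms_without / ms_with


def size_overhead_pct(mb_without, mb_with):
    """Index size growth with the patch, as a percentage."""
    return 100.0 * (mb_with - mb_without) / mb_without


for label, mb0, mb1, ms0, ms1 in results:
    print(f"cardinality {label}: {speedup(ms0, ms1):.1f}x faster, "
          f"{size_overhead_pct(mb0, mb1):.2f}% larger index")
```

Running it prints one line per run, which makes it easy to see how the speedup shrinks and the overhead grows as cardinality increases.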