neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824164245
> +1 to always use the stable version of the algorithm. This would only use transient memory and in reasonable amounts, so I'm not concerned with the memory usage. Per comment, I make stable reorder in MSB Radix Sort as default. Things become a little tricky. The benchmark of this PR branch goes like below. Starting from bytePerDim=4, non-stable MSB Radix Sort runs faster than stable version. ``` ------------------------------------------------- | bytesPerDim | isDocIdIncremental | avg time(us) | ------------------------------------------------- | 1 | N | 981732.7 | | 1 | Y | 56766.1 | | 2 | N | 941666.1 | | 2 | Y | 304395.9 | | 3 | N | 1306374.9 | | 3 | Y | 853744.8 | | 4 | N | 1262810.3 | | 4 | Y | 1345605.8 | | 8 | N | 1282193.9 | | 8 | Y | 1364078.3 | | 16 | N | 1269967.4 | | 16 | Y | 1433378.3 | | 32 | N | 1293686.0 | | 32 | Y | 1468472.1 | ------------------------------------------------- ``` To make it more clear, I also add the new result into the graph I made above.  I try to analysis, according to the flame graph I generated (see above picture). Sorting includes the following stages. - build histogram - fallback intro sort (quick sort for non-stable, in place merge sort for stable) - reorder Let's illustrate the total time partitioned by the cost per stage. For main branch `MSBRadixSort`, it shows as below. ``` +--------------------------------------------------------------------------------------------------------+ | build histogram | fallback sort (quick sort) | non stable reorder | +--------------------------------------------------------------------------------------------------------+ ``` For PR branch `StableMSBRadixSort`, it shows as below ``` +-------------------------------------------------------------------------------------------------+ | build histogram | fallback sort (in place merge sort) | stable reorder | +-------------------------------------------------------------------------------------------------+ ``` For PR branch `MSBRadixSort` making stable reorder as default, it shows as below. ``` +----------------------------------------------------------------------------------+ | build histogram | fallback sort (quick sort) | stable reorder | +----------------------------------------------------------------------------------+ ``` I think that stable reorder contributes most part of the speedup, but in place merge sort is slower than 3-pivot quick sort. After all, regardless of the order of doc IDs, stable reorder is definitely a good choice. If data are spread evenly, not many duplicates, `StableMSBRadixSort` is a little slower (10%) than `MSBRadixSorter`. If not, `StableMSBRadixSort` will out-perform far better than `MSBRadixSorter`. This makes me reluctant on the way we should go, @jpountz could you give me some advice? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org