[ https://issues.apache.org/jira/browse/LUCENE-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443195#comment-17443195 ]
Bruno Roustant commented on LUCENE-10225: ----------------------------------------- With the addition of the adaptive top-k algorithm triggered when k is close to from or last, we get significant perf gain when (k - from <= 20) or (last - k <= 20): {noformat} RANDOM IntroSelector ... 485 317 371 449 299 454 333 345 190 311 IntroSelector2 ... 85 86 86 97 90 87 87 86 87 91 RANDOM_LOW_CARDINALITY IntroSelector ... 475 483 239 445 234 236 455 460 365 508 IntroSelector2 ... 113 114 115 114 112 113 114 114 113 114 RANDOM_MEDIUM_CARDINALITY IntroSelector ... 529 287 448 374 347 392 356 412 182 408 IntroSelector2 ... 88 85 86 87 87 87 86 86 88 85 ASCENDING IntroSelector ... 157 150 148 146 146 147 146 147 146 146 IntroSelector2 ... 80 80 82 82 82 83 81 82 81 82 DESCENDING IntroSelector ... 250 250 246 246 254 251 255 247 247 249 IntroSelector2 ... 84 84 84 85 83 82 80 81 81 82 STRICTLY_DESCENDING IntroSelector ... 239 237 239 238 238 250 240 239 240 240 IntroSelector2 ... 82 83 82 81 82 80 82 85 84 83 ASCENDING_SEQUENCES IntroSelector ... 209 217 145 125 185 142 131 172 240 146 IntroSelector2 ... 85 86 86 84 84 82 83 83 83 81 MOSTLY_ASCENDING IntroSelector ... 151 155 150 150 154 147 150 154 154 154 IntroSelector2 ... 82 82 81 81 81 81 82 82 104 85 {noformat} > Improve IntroSelector with 3-way partitioning > --------------------------------------------- > > Key: LUCENE-10225 > URL: https://issues.apache.org/jira/browse/LUCENE-10225 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Bruno Roustant > Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > The same way we improved IntroSorter, we can improve IntroSelector with > Bentley-McIlroy 3-way partitioning. The gain in performance is roughly the > same. > With this new approach, we always use medians-of-medians (Tukey's Ninther), > so there is no real gain to fallback to a slower medians-of-medians technique > as an introspective protection (like the existing implementation does). > Instead we can simply shuffle the sub-range if we exceed the recursive max > depth (this does not change the speed as this intro-protective mechanism > almost never happens - maybe only in adversarial cases). -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org