Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2024-01-23 Thread via GitHub
github-actions[bot] commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1907129767 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2024-01-08 Thread via GitHub
github-actions[bot] commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1880899961 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-21 Thread via GitHub
gf2121 commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820687175 Thanks for the feedback, @mikemccand! > Hmm it looks like random got a bit slower in candidate? Flush time ~550 ish ms in baseline and maybe ~650 ish ms in candidate? Ohhh! I rec

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-21 Thread via GitHub
mikemccand commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820626033 > I also ran the index script to measure flush time with this new approach: ~15% faster for random data and no regression on asc/desc :) Hmm it looks like random got a bit

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-21 Thread via GitHub
gf2121 commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820564110 I also ran the index script to measure flush time with this new approach: ~15% faster for random data and no regression on asc/desc :) Benchmark Detail **Baseline**

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-21 Thread via GitHub
gf2121 commented on code in PR #12800: URL: https://github.com/apache/lucene/pull/12800#discussion_r1400210595 ## lucene/core/src/java/org/apache/lucene/util/BaseLSBRadixSorter.java: ## @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-21 Thread via GitHub
gf2121 commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1820461507 I did some more work to find the balance between memory and performance across various data distributions. My current thinking is that we keep the TimSorter here, but make the run lengt

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-20 Thread via GitHub
gf2121 commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1818846591 Thanks for the feedback, @jpountz! > but this seems to come with greater heap requirements as well? Yes, +1 for the concern. The original approach requires at most `ArrayUtil.ove

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-16 Thread via GitHub
jpountz commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1814761757 I like the idea, but this seems to come with greater heap requirements as well?

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-14 Thread via GitHub
gf2121 commented on code in PR #12800: URL: https://github.com/apache/lucene/pull/12800#discussion_r1392571640 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/DocSorterBenchmark.java: ## @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-13 Thread via GitHub
gf2121 commented on code in PR #12800: URL: https://github.com/apache/lucene/pull/12800#discussion_r1391150384 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/DocSorterBenchmark.java: ## @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-13 Thread via GitHub
mikemccand commented on PR #12800: URL: https://github.com/apache/lucene/pull/12800#issuecomment-1808206830 I wonder whether `Arrays.sort` might be a good choice instead of making our own powerful sorting classes? [OpenJDK is (gradually?) taking advantage of fast SIMD sorting](https://gith

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-13 Thread via GitHub
mikemccand commented on code in PR #12800: URL: https://github.com/apache/lucene/pull/12800#discussion_r1391138011 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/DocSorterBenchmark.java: ## @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-13 Thread via GitHub
mikemccand commented on code in PR #12800: URL: https://github.com/apache/lucene/pull/12800#discussion_r1391137486 ## lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/DocSorterBenchmark.java: ## @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software Foundation (A

[PR] Generalize LSBRadixSorter and use it in SortingPostingsEnum [lucene]

2023-11-13 Thread via GitHub
gf2121 opened a new pull request, #12800: URL: https://github.com/apache/lucene/pull/12800 **Description** In https://github.com/apache/lucene/pull/12114, we had great numbers for LSB radix sorter when sorting random docs in `SortingDocsEnum`. But we cannot take advantage of the LS
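
For readers unfamiliar with the technique under discussion, here is a minimal sketch of a least-significant-byte (LSB) radix sort over non-negative ints such as doc IDs. It is an illustration only and is not the code from this PR or Lucene's `LSBRadixSorter`; the class name `LsbRadixSortSketch`, the 8-bit pass width, and the full-size scratch array are all assumptions made for the example.

```java
/** Hypothetical illustration of LSB radix sort; not Lucene's implementation. */
final class LsbRadixSortSketch {
  private static final int BITS_PER_PASS = 8; // assumed pass width: one byte per pass
  private static final int BUCKETS = 1 << BITS_PER_PASS;

  /** Sorts non-negative ints ascending, in place from the caller's point of view. */
  static void sort(int[] values) {
    int max = 0;
    for (int v : values) {
      if (v < 0) {
        throw new IllegalArgumentException("negative value: " + v);
      }
      max = Math.max(max, v);
    }

    int[] src = values;
    int[] dst = new int[values.length]; // scratch buffer: the extra heap this sketch needs

    // One stable counting-sort pass per byte, lowest byte first.
    // Passes stop once the remaining high bits of the maximum are all zero.
    for (int shift = 0; shift < 32 && (max >>> shift) != 0; shift += BITS_PER_PASS) {
      int[] counts = new int[BUCKETS + 1];
      for (int v : src) {
        counts[((v >>> shift) & (BUCKETS - 1)) + 1]++; // histogram, offset by one
      }
      for (int i = 0; i < BUCKETS; i++) {
        counts[i + 1] += counts[i]; // prefix sums: counts[b] is the start offset of bucket b
      }
      for (int v : src) {
        dst[counts[(v >>> shift) & (BUCKETS - 1)]++] = v; // stable scatter
      }
      int[] tmp = src;
      src = dst;
      dst = tmp;
    }

    // After an odd number of passes the sorted data lives in the scratch buffer.
    if (src != values) {
      System.arraycopy(src, 0, values, 0, values.length);
    }
  }
}
```

In this sketch the scratch buffer is as large as the input, which is one way a radix sort can end up needing more heap than an in-place comparison sort, the kind of trade-off raised in the replies above.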