[GitHub] [lucene-site] zacharymorn commented on pull request #56: Add Zach Chen to committer list

2021-04-21 Thread GitBox
zacharymorn commented on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-824588404 The change looks good at staging https://lucene.staged.apache.org/whoweare.html . Hi @janhoy, I have a quick question. I see you had a few recent PRs to merge from `

[jira] [Commented] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning

2021-04-21 Thread Zach Chen (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327129#comment-17327129 ] Zach Chen commented on LUCENE-9335: --- Makes sense. I guess the general strategy then wo

[jira] [Commented] (LUCENE-9934) MuseDev on Lucene?

2021-04-21 Thread Thomas DuBuisson (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327083#comment-17327083 ] Thomas DuBuisson commented on LUCENE-9934: -- Previously Muse was installed on ma

[jira] [Commented] (LUCENE-9934) MuseDev on Lucene?

2021-04-21 Thread David Smiley (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327079#comment-17327079 ] David Smiley commented on LUCENE-9934: -- +1 Yes definitely.  The "INFRA" Jira projec

[jira] [Commented] (LUCENE-9936) update gradle build to support gpg signing of tgz/zip distributions

2021-04-21 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326884#comment-17326884 ] Dawid Weiss commented on LUCENE-9936: - Nope, you're correct. I just stated the fact.

[jira] [Commented] (LUCENE-9936) update gradle build to support gpg signing of tgz/zip distributions

2021-04-21 Thread Chris M. Hostetter (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326842#comment-17326842 ] Chris M. Hostetter commented on LUCENE-9936: but that's just for the maven

[jira] [Commented] (LUCENE-9936) update gradle build to support gpg signing of tgz/zip distributions

2021-04-21 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326837#comment-17326837 ] Dawid Weiss commented on LUCENE-9936: - For the record - the artifacts published to A

[GitHub] [lucene] dweiss commented on pull request #100: Update gradle to 6.8.3

2021-04-21 Thread GitBox
dweiss commented on pull request #100: URL: https://github.com/apache/lucene/pull/100#issuecomment-824288978 Thanks @Jawnnypoo -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [lucene] dweiss merged pull request #100: Update gradle to 6.8.3

2021-04-21 Thread GitBox
dweiss merged pull request #100: URL: https://github.com/apache/lucene/pull/100

[jira] [Updated] (LUCENE-9936) update gradle build to support gpg signing of tgz/zip distributions

2021-04-21 Thread Chris M. Hostetter (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated LUCENE-9936: --- Attachment: LUCENE-9936.patch Assignee: Chris M. Hostetter Status: Open

[jira] [Created] (LUCENE-9936) update gradle build to support gpg signing of tgz/zip distributions

2021-04-21 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created LUCENE-9936: -- Summary: update gradle build to support gpg signing of tgz/zip distributions Key: LUCENE-9936 URL: https://issues.apache.org/jira/browse/LUCENE-9936 Proje

[GitHub] [lucene] jpountz commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
jpountz commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824192328 Very cool, thanks for testing what happens with small integers! Do the numbers get better if you use `TimSorter` as a fallback sort? `InPlaceMergeSorter` is convenient because

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617698010 ## File path: lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointsReaderUtils.java ## @@ -39,13 +41,23 @@ public static void sort( BKDCon

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617697446 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [lucene] neoremind edited a comment on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind edited a comment on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824164245 > +1 to always use the stable version of the algorithm. This would only use transient memory and in reasonable amounts, so I'm not concerned with the memory usage. Pe

[GitHub] [lucene] neoremind edited a comment on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind edited a comment on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824172567

[GitHub] [lucene] neoremind commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824172567 > For instance I'd expect users who index integers (4 bytes) between 0 and 2^24 to notice speedups that are closer to the one that you computed for bytesPerDim=3 than for bytesPerD

[GitHub] [lucene] Jawnnypoo commented on pull request #100: Update gradle to 6.8.3

2021-04-21 Thread GitBox
Jawnnypoo commented on pull request #100: URL: https://github.com/apache/lucene/pull/100#issuecomment-824170487 Yep, palantir's consistency check is on the latest version. Hopefully we will get the 7.0 check soon🤞

[GitHub] [lucene] neoremind commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-824164245 > +1 to always use the stable version of the algorithm. This would only use transient memory and in reasonable amounts, so I'm not concerned with the memory usage. Per comme

[GitHub] [lucene] jpountz commented on pull request #92: Expunge big segment with oversize deletePct caused by continuously updating a batch of data

2021-04-21 Thread GitBox
jpountz commented on pull request #92: URL: https://github.com/apache/lucene/pull/92#issuecomment-824151870 This PR is mixing a force-merge-specific setting with natural merges. Can you give more context about the problem that you are trying to solve? Is setting `deletesPctAllowed` t

[jira] [Updated] (LUCENE-9932) Performance improvement for BKD index building

2021-04-21 Thread neoremind (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated LUCENE-9932: -- Attachment: refined-code-benchmark2.png > Performance improvement for BKD index building > ---

[jira] [Created] (LUCENE-9935) Bulk merges for stored fields when index sorting is enabled

2021-04-21 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-9935: Summary: Bulk merges for stored fields when index sorting is enabled Key: LUCENE-9935 URL: https://issues.apache.org/jira/browse/LUCENE-9935 Project: Lucene - Core

[GitHub] [lucene-site] mikemccand edited a comment on pull request #56: Add Zach Chen to committer list

2021-04-21 Thread GitBox
mikemccand edited a comment on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-824056348 Woot! First commit for @zacharymorn, congrats! Er, rather, first *push*!

[GitHub] [lucene-site] mikemccand commented on pull request #56: Add Zach Chen to committer list

2021-04-21 Thread GitBox
mikemccand commented on pull request #56: URL: https://github.com/apache/lucene-site/pull/56#issuecomment-824056348 Woot! First commit for @zacharymorn, congrats!

[jira] [Commented] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning

2021-04-21 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326487#comment-17326487 ] Adrien Grand commented on LUCENE-9335: -- I'd be interested in seeing how the results

[GitHub] [lucene] jpountz commented on a change in pull request #84: LUCENE-9929 NorwegianNormalizationFilter

2021-04-21 Thread GitBox
jpountz commented on a change in pull request #84: URL: https://github.com/apache/lucene/pull/84#discussion_r615920337 ## File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizer.java ## @@ -0,0 +1,134 @@ +/* + * Licensed to th

[GitHub] [lucene] jpountz commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
jpountz commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823962146 > If we are ok to take the memory penalty of using this extra array, then I think it makes sense to always use the stable version of the algorithm? +1 to always use the stable v

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617418482 ## File path: lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointsReaderUtils.java ## @@ -39,13 +41,23 @@ public static void sort( BKDConfi

[GitHub] [lucene] neoremind edited a comment on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind edited a comment on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823960282 > For instance I'd expect users who index integers (4 bytes) between 0 and 2^24 to notice speedups that are closer to the one that you computed for bytesPerDim=3 than for by

[GitHub] [lucene] neoremind commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823960282 > For instance I'd expect users who index integers (4 bytes) between 0 and 2^24 to notice speedups that are closer to the one that you computed for bytesPerDim=3 than for bytesPerD

[GitHub] [lucene] iverase commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
iverase commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823957855 What I read from these results is that the stable sorting is faster than the non-stable. I think it makes sense because re-ordering using a helper array is probably faster than in-plac

[GitHub] [lucene] jpountz commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
jpountz commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823954946 I left a few comments before your previous comment, with some suggestions to make the code simpler, but I think that it's getting close. Thanks for the great performance analy

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617409172 ## File path: lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointsReaderUtils.java ## @@ -39,13 +41,23 @@ public static void sort( BKDCon

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617406830 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617406029 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [lucene] neoremind commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617403651 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [lucene] neoremind commented on pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
neoremind commented on pull request #91: URL: https://github.com/apache/lucene/pull/91#issuecomment-823949164 @jpountz Per your advice, I have updated the code. In terms of performance, I refined `TestBKDDisableSortDocId` to make it re-runnable as a benchmark. I have made the follow

[jira] [Updated] (LUCENE-9932) Performance improvement for BKD index building

2021-04-21 Thread neoremind (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated LUCENE-9932: -- Attachment: flame-graph.png > Performance improvement for BKD index building > ---

[GitHub] [lucene] jpountz commented on a change in pull request #91: LUCENE-9932: Performance improvement for BKD index building

2021-04-21 Thread GitBox
jpountz commented on a change in pull request #91: URL: https://github.com/apache/lucene/pull/91#discussion_r617367359 ## File path: lucene/core/src/java/org/apache/lucene/util/StableMSBRadixSorter.java ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[jira] [Updated] (LUCENE-9932) Performance improvement for BKD index building

2021-04-21 Thread neoremind (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated LUCENE-9932: -- Attachment: refined-code-benchmark.png > Performance improvement for BKD index building >

[jira] [Updated] (LUCENE-9334) Require consistency between data-structures on a per-field basis

2021-04-21 Thread neoremind (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated LUCENE-9334: -- Attachment: (was: refined-code-benchmark.png) > Require consistency between data-structures on a p

[jira] [Updated] (LUCENE-9334) Require consistency between data-structures on a per-field basis

2021-04-21 Thread neoremind (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] neoremind updated LUCENE-9334: -- Attachment: refined-code-benchmark.png > Require consistency between data-structures on a per-field ba

[GitHub] [lucene] dweiss commented on pull request #100: Update gradle to 6.8.3

2021-04-21 Thread GitBox
dweiss commented on pull request #100: URL: https://github.com/apache/lucene/pull/100#issuecomment-823851178 Please upgrade palantir's consistency check to the latest version too (if it's not there) as it's known to cause incompatibility problems. The build is not compatible with Gr

[jira] [Commented] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning

2021-04-21 Thread Zach Chen (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326323#comment-17326323 ] Zach Chen commented on LUCENE-9335: --- Hi [~jpountz], I took a stab at implementing BMM

[GitHub] [lucene] zacharymorn commented on pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-04-21 Thread GitBox
zacharymorn commented on pull request #101: URL: https://github.com/apache/lucene/pull/101#issuecomment-823838918 luceneutil benchmark result with wikimedium5m ``` Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-value

[GitHub] [lucene] zacharymorn opened a new pull request #101: LUCENE-9335: [Discussion Only] Add BMM scorer and use it for pure disjunction term query

2021-04-21 Thread GitBox
zacharymorn opened a new pull request #101: URL: https://github.com/apache/lucene/pull/101 Implement BMM algorithm from "Optimizing Top-k Document Retrieval Strategies for Block-Max Indexes" by Dimopoulos, Nepomnyachiy and Suel.