[jira] [Created] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?
Adrien Grand created LUCENE-10180: - Summary: Remove usage of lambdas in SegmentMerger? Key: LUCENE-10180 URL: https://issues.apache.org/jira/browse/LUCENE-10180 Project: Lucene - Core Issue Type: Wish Reporter: Adrien Grand SegmentMerger now uses lambdas to share the logic around logging merging times for all file formats. One problem is that these lambdas get auto-generated names, and it makes it harder to work with profilers since things that should logically end up in the same sub tree end up in different sub trees because two instances of the same lambda get different names. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?
[ https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429196#comment-17429196 ] Adrien Grand edited comment on LUCENE-10180 at 10/15/21, 9:12 AM: -- !profile.png! Here is a profile generated by async-profiler to show what I'm talking about. There are two separate sub trees for points merges under SegmentMerger#merge because we get two lambdas that have different auto-generated names. was (Author: jpountz): I attached a profile generated by async-profiler to show what I'm talking about. There are two separate sub trees for points merges under SegmentMerger#merge because we get two lambdas that have different auto-generated names. > Remove usage of lambdas in SegmentMerger? > - > > Key: LUCENE-10180 > URL: https://issues.apache.org/jira/browse/LUCENE-10180 > Project: Lucene - Core > Issue Type: Wish >Reporter: Adrien Grand >Priority: Minor > Attachments: profile.png > > > SegmentMerger now uses lambdas to share the logic around logging merging > times for all file formats. > One problem is that these lambdas get auto-generated names, and it makes it > harder to work with profilers since things that should logically end up in > the same sub tree end up in different sub trees because two instances of the > same lambda get different names. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?
[ https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-10180: -- Attachment: profile.png Status: Open (was: Open) I attached a profile generated by async-profiler to show what I'm talking about. There are two separate sub trees for points merges under SegmentMerger#merge because we get two lambdas that have different auto-generated names. > Remove usage of lambdas in SegmentMerger? > - > > Key: LUCENE-10180 > URL: https://issues.apache.org/jira/browse/LUCENE-10180 > Project: Lucene - Core > Issue Type: Wish >Reporter: Adrien Grand >Priority: Minor > Attachments: profile.png > > > SegmentMerger now uses lambdas to share the logic around logging merging > times for all file formats. > One problem is that these lambdas get auto-generated names, and it makes it > harder to work with profilers since things that should logically end up in > the same sub tree end up in different sub trees because two instances of the > same lambda get different names. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
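The profiler problem described in this issue can be illustrated with a small sketch. This is not SegmentMerger code: `MergeStep`, `timed`, `PointsMergeStep` and `lambdaPointsMergeStep` are hypothetical names chosen for illustration. The point is only that a named class gives a stable frame name, while the names of a lambda's class and method are picked by the compiler and JVM.

```java
final class MergeTimingSketch {
  /** Hypothetical stand-in for one per-format merge step. */
  interface MergeStep {
    int merge();
  }

  /** Shared timing/logging logic that can wrap any merge step. */
  static int timed(String formatName, MergeStep step) {
    long startNanos = System.nanoTime();
    int numMerged = step.merge();
    long tookMillis = (System.nanoTime() - startNanos) / 1_000_000;
    System.out.println(formatName + ": merged " + numMerged + " in " + tookMillis + " ms");
    return numMerged;
  }

  /** Named implementation: profiler frames carry this class name. */
  static final class PointsMergeStep implements MergeStep {
    @Override
    public int merge() {
      return 42; // placeholder for real merge work
    }
  }

  /** Lambda implementation: class and method names are auto-generated. */
  static MergeStep lambdaPointsMergeStep() {
    return () -> 42; // placeholder for real merge work
  }
}
```

Printing `new MergeTimingSketch.PointsMergeStep().getClass().getName()` next to `MergeTimingSketch.lambdaPointsMergeStep().getClass().getName()` shows the difference. The exact form of the lambda name is JVM-specific, so a profiler has no stable name to group on.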
[jira] [Assigned] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl reassigned LUCENE-10179: Assignee: Jan Høydahl > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429244#comment-17429244 ] Jan Høydahl commented on LUCENE-10179: -- See PR [https://github.com/apache/lucene/pull/384] We should probably also update the website Download page to use dlcdn.apache.org directly instead of closer.lua? There are redirects, so not urgent... Do you know if this changes anything related to how we trust downloads from CDN vs archive? I.e. could we now retrieve .asc and KEYS files from CDN? We could not trust those from the mirrors.. I suppose the CDN infrastructure is potentially vulnerable to attacks, so the same precaution is valid? > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-10129) Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls sizeOf(long[])?
[ https://issues.apache.org/jira/browse/LUCENE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429252#comment-17429252 ] Uwe Schindler edited comment on LUCENE-10129 at 10/15/21, 12:04 PM: bq. I am wondering though if TestRamUsageEstimator is missing an import static org.apache.lucene.util.RamUsageEstimator.sizeOf;, so that in lines like assertEquals(sizeOf(array), sizeOf((Object) array)); the first sizeOf() calls RamUsageEstimator.sizeOf, and the second calls RamUsageTester.sizeOf. Apologies if I misunderstood the purpose of the test. I was also stumbling on this. Maybe we should remove the static import and be explicit and compare all three versions? was (Author: thetaphi): bq. I am wondering though if TestRamUsageEstimator is missing an import static org.apache.lucene.util.RamUsageEstimator.sizeOf;, so that in lines like assertEquals(sizeOf(array), sizeOf((Object) array)); the first sizeOf() calls RamUsageEstimator.sizeOf, and the second calls RamUsageTester.sizeOf. Apologies if I misunderstood the purpose of the test. I was also stumbling on this. Maybe we should remove the static import and be explicit and compare all three versions? > Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls > sizeOf(long[])? > > > Key: LUCENE-10129 > URL: https://issues.apache.org/jira/browse/LUCENE-10129 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > See LUCENE-10128 for an example. The problem is there is only a > {{sizeOf(long[])}}, so if the programmer uses {{shallowSizeOf}} instead of > {{sizeOf}} then it falls back to {{shallowSizeOf(Object)}} which does a bunch > of reflection. > This is pretty crazy because it can create performance traps. Should we just > add a {{shallowSizeOf(long[])}} that calls {{sizeOf(long[])}}, so that things > are fast? (same for other primitive arrays). It would solve the problem > easily I think. 
[jira] [Commented] (LUCENE-10129) Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls sizeOf(long[])?
[ https://issues.apache.org/jira/browse/LUCENE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429252#comment-17429252 ] Uwe Schindler commented on LUCENE-10129: bq. I am wondering though if TestRamUsageEstimator is missing an import static org.apache.lucene.util.RamUsageEstimator.sizeOf;, so that in lines like assertEquals(sizeOf(array), sizeOf((Object) array)); the first sizeOf() calls RamUsageEstimator.sizeOf, and the second calls RamUsageTester.sizeOf. Apologies if I misunderstood the purpose of the test. I was also stumbling on this. Maybe we should remove the static import and be explicit and compare all three versions? > Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls > sizeOf(long[])? > > > Key: LUCENE-10129 > URL: https://issues.apache.org/jira/browse/LUCENE-10129 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > See LUCENE-10128 for an example. The problem is there is only a > {{sizeOf(long[])}}, so if the programmer uses {{shallowSizeOf}} instead of > {{sizeOf}} then it falls back to {{shallowSizeOf(Object)}} which does a bunch > of reflection. > This is pretty crazy because it can create performance traps. Should we just > add a {{shallowSizeOf(long[])}} that calls {{sizeOf(long[])}}, so that things > are fast? (same for other primitive arrays). It would solve the problem > easily I think. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10178) Add toString for inspecting Lucene90HnswVectorsFormat
[ https://issues.apache.org/jira/browse/LUCENE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429266#comment-17429266 ] ASF subversion and git services commented on LUCENE-10178: -- Commit c9e56d27a3b88db2d9ba99477a778b298d8ff08c in lucene's branch refs/heads/main from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=c9e56d2 ] LUCENE-10178 Add toString methond for Lucene90HnswVectorsFormat (#383) All toString method to Lucene90HnswVectorsFormat for testing and debugging. > Add toString for inspecting Lucene90HnswVectorsFormat > - > > Key: LUCENE-10178 > URL: https://issues.apache.org/jira/browse/LUCENE-10178 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Trivial > > Since `Lucene90HnswVectorsFormat` has a number of parameters, it is useful > for testing and debugging to add > `toString()` method that will output `maxConn` and `beamWidth` . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10178) Add toString for inspecting Lucene90HnswVectorsFormat
[ https://issues.apache.org/jira/browse/LUCENE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429267#comment-17429267 ] Mayya Sharipova commented on LUCENE-10178: -- PR: https://github.com/apache/lucene/pull/383 > Add toString for inspecting Lucene90HnswVectorsFormat > - > > Key: LUCENE-10178 > URL: https://issues.apache.org/jira/browse/LUCENE-10178 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Trivial > > Since `Lucene90HnswVectorsFormat` has a number of parameters, it is useful > for testing and debugging to add > `toString()` method that will output `maxConn` and `beamWidth` .
[jira] [Resolved] (LUCENE-10178) Add toString for inspecting Lucene90HnswVectorsFormat
[ https://issues.apache.org/jira/browse/LUCENE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova resolved LUCENE-10178. -- Fix Version/s: main (9.0) Resolution: Fixed > Add toString for inspecting Lucene90HnswVectorsFormat > - > > Key: LUCENE-10178 > URL: https://issues.apache.org/jira/browse/LUCENE-10178 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Trivial > Fix For: main (9.0) > > > Since `Lucene90HnswVectorsFormat` has a number of parameters, it is useful > for testing and debugging to add > `toString()` method that will output `maxConn` and `beamWidth` .
[jira] [Closed] (LUCENE-10178) Add toString for inspecting Lucene90HnswVectorsFormat
[ https://issues.apache.org/jira/browse/LUCENE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova closed LUCENE-10178. > Add toString for inspecting Lucene90HnswVectorsFormat > - > > Key: LUCENE-10178 > URL: https://issues.apache.org/jira/browse/LUCENE-10178 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Trivial > Fix For: main (9.0) > > > Since `Lucene90HnswVectorsFormat` has a number of parameters, it is useful > for testing and debugging to add > `toString()` method that will output `maxConn` and `beamWidth` .
[jira] [Created] (LUCENE-10181) GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows
Adrien Grand created LUCENE-10181: - Summary: GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows Key: LUCENE-10181 URL: https://issues.apache.org/jira/browse/LUCENE-10181 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand If provided with a very long query string, GraphTokenStreamFiniteStrings#articulationPointsRecurse may run into a StackOverflowError.
[jira] [Commented] (LUCENE-10181) GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows
[ https://issues.apache.org/jira/browse/LUCENE-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429287#comment-17429287 ] Adrien Grand commented on LUCENE-10181: --- I'm not familiar with this code but it looks like it's already tracking the recursion depth for other purposes, so maybe we could fail when the recursion depth goes beyond an arbitrary threshold? > GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack > overflows > > > Key: LUCENE-10181 > URL: https://issues.apache.org/jira/browse/LUCENE-10181 > Project: Lucene - Core > Issue Type: Bug >Reporter: Adrien Grand >Priority: Minor > > If provided with a very long query string, > GraphTokenStreamFiniteStrings#articulationPointsRecurse may run into a > StackOverflowError.
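The idea of failing once the recursion depth passes a threshold could look roughly like the sketch below. It is not the GraphTokenStreamFiniteStrings code. The class name, the `MAX_RECURSION_DEPTH` value, the exception type and the simple chain graph are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DepthLimitedDfs {
  /** Arbitrary, hypothetical threshold; a real limit would need tuning. */
  static final int MAX_RECURSION_DEPTH = 1_000;

  /** Recursive DFS that counts reachable nodes and fails fast when too deep. */
  static int visit(List<List<Integer>> adjacency, int node, int depth, boolean[] seen) {
    if (depth > MAX_RECURSION_DEPTH) {
      // A controlled exception instead of an eventual StackOverflowError.
      throw new IllegalArgumentException(
          "Exceeded maximum recursion depth of " + MAX_RECURSION_DEPTH);
    }
    seen[node] = true;
    int count = 1;
    for (int next : adjacency.get(node)) {
      if (!seen[next]) {
        count += visit(adjacency, next, depth + 1, seen);
      }
    }
    return count;
  }

  /** Builds a chain 0 -> 1 -> ... -> n-1, the worst case for recursion depth. */
  static List<List<Integer>> chain(int n) {
    List<List<Integer>> adjacency = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      List<Integer> out = new ArrayList<>();
      if (i + 1 < n) {
        out.add(i + 1);
      }
      adjacency.add(out);
    }
    return adjacency;
  }
}
```

With this sketch, a short chain is traversed normally, while a chain longer than the threshold is rejected with an `IllegalArgumentException` that the caller can handle.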
[jira] [Commented] (LUCENE-10129) Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls sizeOf(long[])?
[ https://issues.apache.org/jira/browse/LUCENE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429293#comment-17429293 ] ASF subversion and git services commented on LUCENE-10129: -- Commit 560f71b47d3969a37b59a39cce85f6868f3c92be in lucene's branch refs/heads/main from Stefan Vodita [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=560f71b ] LUCENE-10129: Add RamUsageEstimator.shallowSizeOf() for primitive arrays (#367) Co-authored-by: Stefan Vodita > Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls > sizeOf(long[])? > > > Key: LUCENE-10129 > URL: https://issues.apache.org/jira/browse/LUCENE-10129 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > See LUCENE-10128 for an example. The problem is there is only a > {{sizeOf(long[])}}, so if the programmer uses {{shallowSizeOf}} instead of > {{sizeOf}} then it falls back to {{shallowSizeOf(Object)}} which does a bunch > of reflection. > This is pretty crazy because it can create performance traps. Should we just > add a {{shallowSizeOf(long[])}} that calls {{sizeOf(long[])}}, so that things > are fast? (same for other primitive arrays). It would solve the problem > easily I think. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-10129) Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls sizeOf(long[])?
[ https://issues.apache.org/jira/browse/LUCENE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-10129. Fix Version/s: main (9.0) Assignee: Uwe Schindler Resolution: Fixed > Add RamUsageEstimator shallowSizeOf(long[]) overload that just calls > sizeOf(long[])? > > > Key: LUCENE-10129 > URL: https://issues.apache.org/jira/browse/LUCENE-10129 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Uwe Schindler >Priority: Major > Fix For: main (9.0) > > Time Spent: 10m > Remaining Estimate: 0h > > See LUCENE-10128 for an example. The problem is there is only a > {{sizeOf(long[])}}, so if the programmer uses {{shallowSizeOf}} instead of > {{sizeOf}} then it falls back to {{shallowSizeOf(Object)}} which does a bunch > of reflection. > This is pretty crazy because it can create performance traps. Should we just > add a {{shallowSizeOf(long[])}} that calls {{sizeOf(long[])}}, so that things > are fast? (same for other primitive arrays). It would solve the problem > easily I think. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
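The performance trap in this issue comes from Java's compile-time overload resolution. The sketch below is a simplified model, not the real RamUsageEstimator. The nested classes `Before` and `After` are hypothetical, and the methods return a label naming the chosen code path instead of a size.

```java
final class OverloadFallbackSketch {
  /** Only sizeOf has a long[] overload; shallowSizeOf accepts just Object. */
  static final class Before {
    static String sizeOf(long[] array) {
      return "fast long[] path";
    }

    static String shallowSizeOf(Object obj) {
      return "generic Object path"; // stands in for the reflection-based path
    }
  }

  /** Adds the overload proposed in the issue, delegating to sizeOf(long[]). */
  static final class After {
    static String sizeOf(long[] array) {
      return "fast long[] path";
    }

    static String shallowSizeOf(Object obj) {
      return "generic Object path"; // stands in for the reflection-based path
    }

    static String shallowSizeOf(long[] array) {
      return sizeOf(array); // most specific overload wins for long[] arguments
    }
  }
}
```

In this model, `Before.shallowSizeOf(new long[4])` silently binds to the `Object` overload, while `After.shallowSizeOf(new long[4])` binds to the `long[]` overload. An explicit `(Object)` cast still selects the generic path in both cases.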
[jira] [Created] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality
Stefan Vodita created LUCENE-10182:

Summary: TestRamUsageEstimator asserts trivial equality
Key: LUCENE-10182
URL: https://issues.apache.org/jira/browse/LUCENE-10182
Project: Lucene - Core
Issue Type: Improvement
Reporter: Stefan Vodita

{{TestRamUsageEstimator.testStaticOverloads}} has several lines like:

{code:java}
assertEquals(sizeOf(array), sizeOf((Object) array));
{code}

Both calls to {{sizeOf()}} fall back on {{RamUsageTester.sizeOf}}, making the 2 calls identical. Instead, we would want one of the calls to go to {{RamUsageEstimator.sizeOf}}.

This issue came up while working on LUCENE-10129. A possible solution, as per [~uschindler]'s suggestion, would be to remove the static import

{code:java}
import static org.apache.lucene.util.RamUsageTester.sizeOf;
{code}

Instead, we could be explicit about which method we are calling, like:

{code:java}
assertEquals(RamUsageEstimator.sizeOf(array), RamUsageTester.sizeOf(array));
{code}

This could be replicated for other potentially confusing cases in the test class.
[jira] [Commented] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429337#comment-17429337 ] Mike Drob commented on LUCENE-10179: {quote}We should probably also update the website Download page to use dlcdn.apache.org directly instead of closer.lua? There are redirects, so not urgent... {quote} I asked infra about this, they would prefer that we continue to use closer.lua just in case the CDN is ever temporarily down. I don't know what specifically will happen, but it sounds like they have thought about it and have a plan. However, there's a redirect feature we can use and it will sort itself out whether it needs to go to dlcdn, downloads, or archive. [https://www.apache.org/dyn/closer.lua?action=download&filename=/lucene/java/8.10.0/lucene-8.10.0-src.tgz] {quote}Do you know if this changes anything related to how we trust downloads from CDN vs archive? I.e. could we now retrieve .asc and KEYS files from CDN? We could not trust those from the mirrors.. I suppose the CDN infrastructure is potentially vulnerable to attacks, so the same precaution is valid? {quote} Policy hasn't been updated yet, but word from INFRA is that it should be safe for keys/sigs. > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-10183) KnnVectorsWriter#writeField should take a KnnVectorsReader, not a VectorValues instance
Adrien Grand created LUCENE-10183: - Summary: KnnVectorsWriter#writeField should take a KnnVectorsReader, not a VectorValues instance Key: LUCENE-10183 URL: https://issues.apache.org/jira/browse/LUCENE-10183 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand By taking a VectorValues instance, KnnVectorsWriter#writeField doesn't let implementations iterate over vectors multiple times if needed. It should take a KnnVectorsReader, similarly to doc values, where the writer takes a DocValuesProducer.
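To illustrate why a writer may want a source of iterators and not a single iterator, here is a minimal sketch. It does not use the Lucene API. `TwoPassWriterSketch` and its `writeField` method are hypothetical, and a plain `Supplier<Iterator<float[]>>` stands in for a reader that can hand out a fresh iterator on demand.

```java
import java.util.Iterator;
import java.util.function.Supplier;

final class TwoPassWriterSketch {
  /**
   * Makes two passes over the vectors: the first counts them, the second sums
   * their first components. Returns {count, sum}. A single one-shot iterator
   * would already be exhausted before the second pass.
   */
  static double[] writeField(Supplier<Iterator<float[]>> reader) {
    int count = 0;
    for (Iterator<float[]> it = reader.get(); it.hasNext(); it.next()) {
      count++;
    }
    double sum = 0;
    for (Iterator<float[]> it = reader.get(); it.hasNext(); ) {
      sum += it.next()[0];
    }
    return new double[] {count, sum};
  }
}
```

For a `java.util.List<float[]>` named `vectors`, the call `TwoPassWriterSketch.writeField(vectors::iterator)` works because every call to `get()` returns a new iterator positioned at the start.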
[jira] [Commented] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429443#comment-17429443 ] ASF subversion and git services commented on LUCENE-10179: -- Commit f38c401283c43b17fa5c63b4da0a9ffc13660bc4 in lucene's branch refs/heads/main from Jan Høydahl [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=f38c401 ] LUCENE-10179 No longer check for release status on mirrors (#384) > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved LUCENE-10179. -- Fix Version/s: main (9.0) Resolution: Fixed > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > Fix For: main (9.0) > > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429456#comment-17429456 ] Mike Drob commented on LUCENE-10179: We should back port this to 8x as well > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > Fix For: main (9.0) > > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl reopened LUCENE-10179: -- Reopen for 8x. PR is here https://github.com/apache/lucene-solr/pull/2592 > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > Fix For: main (9.0) > > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10179) [release wizard] No longer check for release status on mirrors
[ https://issues.apache.org/jira/browse/LUCENE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429476#comment-17429476 ] Mike Drob commented on LUCENE-10179: FYI [~mayya] since you are in the middle of an 8.10.1 release, when you get to the "check mirrors" step, it will likely fail and you should perform the manual steps of checking for the release artifacts on the CDN and in maven central. > [release wizard] No longer check for release status on mirrors > -- > > Key: LUCENE-10179 > URL: https://issues.apache.org/jira/browse/LUCENE-10179 > Project: Lucene - Core > Issue Type: Task >Reporter: Mike Drob >Assignee: Jan Høydahl >Priority: Major > Fix For: main (9.0) > > > The ASF has moved to a CDN instead of the mirror system. We don't need to > update our upload steps, but checking mirrors is no longer necessary I > believe. > > https://blogs.apache.org/foundation/entry/apache-software-foundation-moves-to -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality
[ https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429494#comment-17429494 ]

Stefan Vodita commented on LUCENE-10182:

[This PR|https://github.com/apache/lucene/pull/386] implements the solution described above.

> TestRamUsageEstimator asserts trivial equality
>
> Key: LUCENE-10182
> URL: https://issues.apache.org/jira/browse/LUCENE-10182
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Stefan Vodita
> Priority: Major
>
> {{TestRamUsageEstimator.testStaticOverloads}} has several lines like:
> {code:java}
> assertEquals(sizeOf(array), sizeOf((Object) array));
> {code}
> Both calls to {{sizeOf()}} fall back on {{RamUsageTester.sizeOf}}, making the 2 calls identical. Instead, we would want one of the calls to go to {{RamUsageEstimator.sizeOf}}.
>
> This issue came up while working on LUCENE-10129. A possible solution, as per [~uschindler]'s suggestion, would be to remove the static import
> {code:java}
> import static org.apache.lucene.util.RamUsageTester.sizeOf;
> {code}
> Instead, we could be explicit about which method we are calling, like:
> {code:java}
> assertEquals(RamUsageEstimator.sizeOf(array), RamUsageTester.sizeOf(array));
> {code}
> This could be replicated for other potentially confusing cases in the test class.
[jira] [Commented] (LUCENE-10061) CombinedFieldsQuery needs dynamic pruning support
[ https://issues.apache.org/jira/browse/LUCENE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429522#comment-17429522 ]

Zach Chen commented on LUCENE-10061:

Hi [~jpountz], I'm interested in working on this one, but I have a question about its potential implementation and would like to get some advice on it.

I found https://issues.apache.org/jira/browse/LUCENE-8312 while researching this, and thought the solution should be very similar here (using merged impacts to prune docs that are not competitive), except maybe for how impacts get merged. However, while I understand that for SynonymQuery impacts can be merged effectively by summing term frequencies for each unique norm value, since the impacts all come from the same field, I'm not sure how that could be done efficiently in the case of CombinedFieldsQuery.

If I understand it correctly, in order to merge impacts from multiple fields for CombinedFieldsQuery, we may need to compute all the possible summation combinations of competitive {freq, norm} pairs across all fields, and then find the competitive ones among them. So for the case of 4 fields with a list of 4 competitive impacts each, in the worst case the impacts merge may need to compute 4 * 4 * 4 * 4 = 256 combinations of merged impacts ({field1FreqA + field2FreqB + field3FreqC + field4FreqD, field1NormA + field2NormB + field3NormC + field4NormD}), and then filter out the ones that are not competitive. This seems inefficient.

I'm wondering if you have any suggestions on this, or whether using impacts for CombinedFieldsQuery pruning support is the right approach to begin with?
> CombinedFieldsQuery needs dynamic pruning support > - > > Key: LUCENE-10061 > URL: https://issues.apache.org/jira/browse/LUCENE-10061 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > > CombinedFieldQuery's Scorer doesn't implement advanceShallow/getMaxScore, > forcing Lucene to collect all matches in order to figure out the top-k hits.
[jira] [Comment Edited] (LUCENE-10159) Index corruption: IndexOutOfBoundsException for doc values
[ https://issues.apache.org/jira/browse/LUCENE-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429524#comment-17429524 ] Nhat Nguyen edited comment on LUCENE-10159 at 10/16/21, 5:33 AM: - I think I have found the issue. I am working on a test for this. was (Author: dnhatn): I think I have found the issue. I am working a test for this. > Index corruption: IndexOutOfBoundsException for doc values > -- > > Key: LUCENE-10159 > URL: https://issues.apache.org/jira/browse/LUCENE-10159 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Blocker > > Since we upgraded Elasticsearch to a Lucene 9 snapshot, we have seen test > failures with the following stack trace. This looks like an issue with the > Lucene90 DocValuesFormat. > {noformat} > org.apache.lucene.index.MergePolicy$MergeException: > java.lang.IndexOutOfBoundsException > at > org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > [?:?] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > [?:?] > at java.lang.Thread.run(Thread.java:833) [?:?] > Caused by: java.lang.IndexOutOfBoundsException > at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?] > at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?] 
> at > org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > 
cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > or
[jira] [Commented] (LUCENE-10159) Index corruption: IndexOutOfBoundsException for doc values
[ https://issues.apache.org/jira/browse/LUCENE-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429524#comment-17429524 ] Nhat Nguyen commented on LUCENE-10159: -- I think I have found the issue. I am working on a test for this. > Index corruption: IndexOutOfBoundsException for doc values > -- > > Key: LUCENE-10159 > URL: https://issues.apache.org/jira/browse/LUCENE-10159 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Blocker > > Since we upgraded Elasticsearch to a Lucene 9 snapshot, we have seen test > failures with the following stack trace. This looks like an issue with the > Lucene90 DocValuesFormat. > {noformat} > org.apache.lucene.index.MergePolicy$MergeException: > java.lang.IndexOutOfBoundsException > at > org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) > ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > [?:?] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > [?:?] > at java.lang.Thread.run(Thread.java:833) [?:?] > Caused by: java.lang.IndexOutOfBoundsException > at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?] > at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?] 
> at > org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > 
cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a > cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47] > at > org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.java:139) > ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb36