[GitHub] [lucene] gsmiller merged pull request #915: LUCENE-10585: Scrub copy/paste code in the facets module and attempt to simplify a bit
gsmiller merged PR #915: URL: https://github.com/apache/lucene/pull/915 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10585) Cleanup copy/paste code in facets, particularly in SSDV
[ https://issues.apache.org/jira/browse/LUCENE-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543587#comment-17543587 ] ASF subversion and git services commented on LUCENE-10585: -- Commit 8db1e41fc072920dc1c6554f5859d4dd623dace6 in lucene's branch refs/heads/main from Greg Miller [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=8db1e41fc07 ] LUCENE-10585: Scrub copy/paste code in the facets module and attempt to simplify a bit (#915) > Cleanup copy/paste code in facets, particularly in SSDV > --- > > Key: LUCENE-10585 > URL: https://issues.apache.org/jira/browse/LUCENE-10585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Reporter: Greg Miller >Assignee: Greg Miller >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > We've accumulated some copy/paste code in the facets modules, especially in > SSDV-related classes. I'm going to take a pass at cleaning this up to help > make the code more readable and easier to maintain. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10585) Cleanup copy/paste code in facets, particularly in SSDV
[ https://issues.apache.org/jira/browse/LUCENE-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543598#comment-17543598 ] ASF subversion and git services commented on LUCENE-10585: -- Commit 075f5103e9ac7f344d11e4b3ad3699a1f2be9f66 in lucene's branch refs/heads/branch_9x from Greg Miller [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=075f5103e9a ] LUCENE-10585: Scrub copy/paste code in the facets module and attempt to simplify a bit > Cleanup copy/paste code in facets, particularly in SSDV > --- > > Key: LUCENE-10585 > URL: https://issues.apache.org/jira/browse/LUCENE-10585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Reporter: Greg Miller >Assignee: Greg Miller >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > We've accumulated some copy/paste code in the facets modules, especially in > SSDV-related classes. I'm going to take a pass at cleaning this up to help > make the code more readable and easier to maintain. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gsmiller commented on pull request #915: LUCENE-10585: Scrub copy/paste code in the facets module and attempt to simplify a bit
gsmiller commented on PR #915: URL: https://github.com/apache/lucene/pull/915#issuecomment-1140406510 @Yuti-G this is merged onto `main` and backported to `branch_9x`. Please feel free to rebase your PR on top of it, and let me know if you have any questions. Also, thanks for mentioning your PR. I'd missed that but would like to have a look. I'll wait for you to rebase then I'll review. Thanks again (and hopefully rebasing isn't too big of a pain!)! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-10585) Cleanup copy/paste code in facets, particularly in SSDV
[ https://issues.apache.org/jira/browse/LUCENE-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller resolved LUCENE-10585. -- Fix Version/s: 10.0 (main) 9.3 Resolution: Fixed > Cleanup copy/paste code in facets, particularly in SSDV > --- > > Key: LUCENE-10585 > URL: https://issues.apache.org/jira/browse/LUCENE-10585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Reporter: Greg Miller >Assignee: Greg Miller >Priority: Minor > Fix For: 10.0 (main), 9.3 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > We've accumulated some copy/paste code in the facets modules, especially in > SSDV-related classes. I'm going to take a pass at cleaning this up to help > make the code more readable and easier to maintain. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-10595) TestGroupFacetCollector#testRandom has caught IndexOutOfBoundsException a couple times
Greg Miller created LUCENE-10595: Summary: TestGroupFacetCollector#testRandom has caught IndexOutOfBoundsException a couple times Key: LUCENE-10595 URL: https://issues.apache.org/jira/browse/LUCENE-10595 Project: Lucene - Core Issue Type: Bug Components: modules/grouping Reporter: Greg Miller Random testing has caught an {{IndexOutOfBoundsException}} a couple times now in {{org.apache.lucene.search.grouping.TestGroupFacetCollector.testRandom}}. I was able to reproduce locally with {{./gradlew :lucene:grouping:test --tests "org.apache.lucene.search.grouping.TestGroupFacetCollector.testRandom" -Ptests.jvms=4 -Ptests.haltonfailure=false -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=91EC8BE9DE2A5BAB -Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCII}}. >From what I can tell, the exception is coming from way down in >{{ByteBuffersDataInput#readBytes}} on [this >line|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataInput.java#L155]. > Popping the stack a bit, it seems like the issue is maybe in >{{TermGroupFacetCollector$MV$SegmentResult#nextTerm}}. I'm not totally sure if this is an actual bug or a bug in testing methodology. Haven't had time to dig in further and likely won't in the near future, so opening this Jira to track. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gsmiller opened a new pull request, #929: LUCENE-10584: Properly support #getSpecificValue for hierarchical dims in SSDV faceting
gsmiller opened a new pull request, #929: URL: https://github.com/apache/lucene/pull/929 ### Description (or a Jira issue link if you have one) Please see: LUCENE-10584 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gsmiller commented on pull request #929: LUCENE-10584: Properly support #getSpecificValue for hierarchical dims in SSDV faceting
gsmiller commented on PR #929: URL: https://github.com/apache/lucene/pull/929#issuecomment-1140414699 @mdmarshmallow - might be a good one for you to have a look at when you get a minute. Pretty simple change but you've got a lot of context since you added hierarchical dim support to SSDV facets (thanks again!). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1140447638 It is now possible to compile and run tests with different JVMs: If `RUNTIME_JAVA_HOME` does not point to *exactly* Java 19, it won't compile and give error message. This is needed until we have toolchain support. To compile, you may pass `JAVA19_HOME` (this is what Jenkins does). You can then still randomize and run the tests against Java 17 (default, no `RUNTIME_JAVA_HOME`) or any other Java version. This *must* be a Java 19 version (exact, otherwise compilation will fail, too). By this Policeman Jenkins is now doing a full tets of all combinations and makes sure the MR-JAR and the provider approach used works on all platforms. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-10596) Remove unused parameter in #getOrAddPerField
tangdh created LUCENE-10596: --- Summary: Remove unused parameter in #getOrAddPerField Key: LUCENE-10596 URL: https://issues.apache.org/jira/browse/LUCENE-10596 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 9.2 Reporter: tangdh I noticed that the parameter fieldType is no longer used in the method getOrAddPerField(indexingChain.java:773), do we need to remove it? If so, I'd be happy to raise a PR -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-10596) Remove unused parameter in #getOrAddPerField
[ https://issues.apache.org/jira/browse/LUCENE-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543691#comment-17543691 ] Michael McCandless edited comment on LUCENE-10596 at 5/29/22 5:44 PM: -- Thanks for noticing this! +1 to remove it if indeed it is a pointless method)! was (Author: mikemccand): Thanks for noticing this!! +1 to remove it if indeed it is a pointless method!! > Remove unused parameter in #getOrAddPerField > > > Key: LUCENE-10596 > URL: https://issues.apache.org/jira/browse/LUCENE-10596 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Affects Versions: 9.2 >Reporter: tangdh >Priority: Minor > > I noticed that the parameter fieldType is no longer used in the method > getOrAddPerField(indexingChain.java:773), do we need to remove it? If so, I'd > be happy to raise a PR -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10596) Remove unused parameter in #getOrAddPerField
[ https://issues.apache.org/jira/browse/LUCENE-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543691#comment-17543691 ] Michael McCandless commented on LUCENE-10596: - Thanks for noticing this!! +1 to remove it if indeed it is a pointless method!! > Remove unused parameter in #getOrAddPerField > > > Key: LUCENE-10596 > URL: https://issues.apache.org/jira/browse/LUCENE-10596 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Affects Versions: 9.2 >Reporter: tangdh >Priority: Minor > > I noticed that the parameter fieldType is no longer used in the method > getOrAddPerField(indexingChain.java:773), do we need to remove it? If so, I'd > be happy to raise a PR -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dweiss commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)
dweiss commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1140517332 Hi Uwe. I recall I had problems with toolchains - some things just refused to work (or we couldn't customize them? Can't remember precisely what the problem was). I looked at the code you wrote and it seems all right at a glance. Some things will be awkward, like "rewinding" the javaHome set in other places... perhaps it'd be good to not set it there at all (move the toolchain declaration and runtime java selection into a single script) - can't say what will turn out to be better. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)
uschindler commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1140527102 Hi Dawid, Thanks for feedback. Indeed the Tool chain API is very limited and in my honest opinion far from useful for Lucenes usages. There's no way to configure anything inside the Gradle files. Things like Autodownload or provider urls or locations of preinstalled jdks like on Jenkins cannot be put into the build files. It is also not possible to tell the system to download EA releases. It is all a desaster. I will tomorrow start a Twitter thread about this. The tool chain API should be flexible and allow plugins to configure own repository types. In general the tool chains should be standard configurations and dependencies in Gradle. The wax how it is now is a dead end: inflexible, buggy and looks like written by a student as bachelor thesis in isolation. I would wifmsh you change a configuration "toolkit" like any other and add the dependency to the java version there. Compile tasks would then use it automatically. Plugins can implement repositories in same way like Maven or ivy. EMG. One plugin for adoptopenjdk/ eclipse and another one for EA releases. About your comments: o wanted to keep the tool chain JVM hacks at one place because they currently only affect Lucene Core. Once we have a release of Java 19 we can simplify that (see code commented out). Other tasks than javac and jar are not affected. Javadics are not built and ecj/forbiddenapis/errorprone do not work at all, so I disabled them. It is just 2 classes to compile which are package private. If we get more MR-JAR stuff we may think of better ways. The current approach is a separate "singleton- in whole build. Maybe you can make it generic, but the combination with EA releases makes this impossible at moment. On top of that, preview APIs are another big problem. The Class files generated have a special minor version in class file format marking them as preview. This makes class loader picky. The code here won't run with Java 20, we will need a separate MR-JAR section for 20 and then for final release of panama in LTS 21 with unmarked class files. If the APIs will not change we may simplify this by removing the minor class version by patching class file version and only have the java 19 class files. You may read more about preview APIs in the jep, it's a mess to maintain: https://openjdk.java.net/jeps/12 At moment I also tried this PR with Luke: - if you start Luke with Java 17 all works as before, it won't see new classes and loads the byte buffer impl - if you start with Java 19 it will find the new classes, but class loader will reject to load them. Lucene will print a warning (you see it in Luke's log window and console). It will use byte buffer impl then - if you start Luke with Java 19 and `--enable-preview`, it will use new impl. So this is still opt in, which is good with preview APIs. - if you start with a possible Java 20, with or without preview enabled, it will reject the class files and falls back to our old byte buffer impl. In that case it logs warning to update Lucene. 😉 So this looks like working good and people can optionally test new mmap with Java 19 if they opt in (Solr or Elasticsearch). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] tang-hi opened a new pull request, #930: (LUCENE-10596)Remove unused parameter in #getOrAddPerField
tang-hi opened a new pull request, #930: URL: https://github.com/apache/lucene/pull/930 ### Description (or a Jira issue link if you have one) Please see [LUCENE-10596](https://issues.apache.org/jira/browse/LUCENE-10596?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17543691#comment-17543691) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10596) Remove unused parameter in #getOrAddPerField
[ https://issues.apache.org/jira/browse/LUCENE-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543754#comment-17543754 ] tangdh commented on LUCENE-10596: - I have raised a PR,[PR|https://github.com/apache/lucene/pull/930],:D > Remove unused parameter in #getOrAddPerField > > > Key: LUCENE-10596 > URL: https://issues.apache.org/jira/browse/LUCENE-10596 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Affects Versions: 9.2 >Reporter: tangdh >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I noticed that the parameter fieldType is no longer used in the method > getOrAddPerField(indexingChain.java:773), do we need to remove it? If so, I'd > be happy to raise a PR -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dweiss commented on pull request #912: MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23)
dweiss commented on PR #912: URL: https://github.com/apache/lucene/pull/912#issuecomment-1140717409 Thanks for the explanation, Uwe. In the meantime I recalled what it was about toolchains that was a showstopper - indeed it was the possibility to specify a toolchain pointing at a particular (local) installation - such as a custom-compiled JDK. I couldn't find a way to do it. If you take a look at alternative-jdk-support, it even has a comment about it: ``` // I failed to set it up leveraging Gradle's toolchains because // a toolchain spec is not flexible enough to provide an exact location of the JVM to be used; // if you have two identical JVM lang. versions in auto-discovered JVMs, an arbitrary one is used (?). // This situation is not uncommon when debugging low-level stuff (hand-compiled JVM binaries). ``` Eh. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-10596) Remove unused parameter in #getOrAddPerField
[ https://issues.apache.org/jira/browse/LUCENE-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543754#comment-17543754 ] tangdh edited comment on LUCENE-10596 at 5/30/22 5:51 AM: -- I have raised a [PR|https://github.com/apache/lucene/pull/930],:D was (Author: JIRAUSER290177): I have raised a PR,[PR|https://github.com/apache/lucene/pull/930],:D > Remove unused parameter in #getOrAddPerField > > > Key: LUCENE-10596 > URL: https://issues.apache.org/jira/browse/LUCENE-10596 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Affects Versions: 9.2 >Reporter: tangdh >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I noticed that the parameter fieldType is no longer used in the method > getOrAddPerField(indexingChain.java:773), do we need to remove it? If so, I'd > be happy to raise a PR -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] vigyasharma commented on pull request #633: LUCENE-10216: Use MergeScheduler and MergePolicy to run addIndexes(CodecReader[]) merges.
vigyasharma commented on PR #633: URL: https://github.com/apache/lucene/pull/633#issuecomment-1140767642 > I could either wrap the runningMerges update with a synchronized (IndexWriter.this) {}, or make runningMerges a synchronizedSet. I like the second approach as it automatically fixes this at all other places. I decided to go with the first approach of wrapping critical sections with a `synchronized (IndexWriter.this) {}`. This feels simpler to reason about as we're synchronizing on a single object - `IndexWriter.this`. A `runningMerges` synchronizedSet, would've created three different objects that were getting locked at different places - the IndexWriter object, the synchronizedSet lock, and the AddIndexesMergeSource, which has sync functions so that running and pending merge queues can be transactionally updated. Reasoning about, and avoiding deadlocks is simpler with a single IW lock. And given the nature of these critical sections, I don't think this affects concurrency by much. Also rebased on the latest main. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org