[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12246: Set word2vec getSynonyms method synchronized

2023-05-16 Thread via GitHub
alessandrobenedetti commented on code in PR #12246: URL: https://github.com/apache/lucene/pull/12246#discussion_r1194809968 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/word2vec/Word2VecSynonymProvider.java: ## @@ -85,7 +86,7 @@ public List getSynonyms(

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-05-16 Thread via GitHub
alessandrobenedetti commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1194958670 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/FloatVectorFieldSource.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache S

[GitHub] [lucene] rmuir merged pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java

2023-05-16 Thread via GitHub
rmuir merged PR #12298: URL: https://github.com/apache/lucene/pull/12298 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apach

[GitHub] [lucene] rmuir commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java

2023-05-16 Thread via GitHub
rmuir commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549465353 The limit can only be removed in `main` branch as there are still some recursive algorithms in `branch_9x`. More invasive/heavier automaton changes have happened in 10.x, I think it is ok,

[GitHub] [lucene] tang-hi commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java

2023-05-16 Thread via GitHub
tang-hi commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549551695 thanks @rmuir -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

[GitHub] [lucene] tang-hi commented on issue #12284: input automaton is too large: 1001 in Operations.topoSortStatesRecurse(Operations.java:1357)

2023-05-16 Thread via GitHub
tang-hi commented on issue #12284: URL: https://github.com/apache/lucene/issues/12284#issuecomment-1549552433 it should be solved in PR #12286 😆 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549737196 Hi this is a good idea for Lucene 10.0 (minimum is Java 17). This can't be backported to Java 11, so won't be on Lucene 9.x. -- This is an automated message from the Apache Gi

[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195218209 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship

[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195220727 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549747674 Please run `gradlew tidy`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549756344 I am fine with this, unless it makes backports harder. So basically I use this for *new* code in new 10.x features only, but I'd like to prevent this from happing in code that is poss

[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549769045 The precommit checks already answer most of your questions: Only use that feature for now if it is easy to apply. :-) -- This is an automated message from the Apache Git Service. To

[GitHub] [lucene] uschindler commented on a diff in pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12290: URL: https://github.com/apache/lucene/pull/12290#discussion_r1195248137 ## lucene/core/src/java/org/apache/lucene/store/ByteBufferGuard.java: ## @@ -39,7 +40,7 @@ final class ByteBufferGuard { * this to allow unmapping of bytebuffer

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors

2023-05-16 Thread via GitHub
alessandrobenedetti commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1195251817 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ByteVectorFieldSource.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache So

[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549785157 I like it, but actually this class should go away at some point! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [lucene] mikemccand commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java

2023-05-16 Thread via GitHub
mikemccand commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549796778 Thank you @tang-hi! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [lucene] JarvisCraft commented on a diff in pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
JarvisCraft commented on code in PR #12290: URL: https://github.com/apache/lucene/pull/12290#discussion_r1195269692 ## lucene/core/src/java/org/apache/lucene/store/ByteBufferGuard.java: ## @@ -39,7 +40,7 @@ final class ByteBufferGuard { * this to allow unmapping of bytebuffe

[GitHub] [lucene] JarvisCraft commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549806649 > but actually this class should go away at some point! Yup, but as for now it seems more safe to rely on well-defined behaviour (until this class is gone). -- This is an au

[GitHub] [lucene] JarvisCraft commented on a diff in pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
JarvisCraft commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195281538 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship

[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549819407 > Please run gradlew tidy. All done :+1: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [lucene] JarvisCraft commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549821494 > The precommit checks already answer most of your questions: Only use that feature for now if it is easy to apply. :-) Fair point! As for now, I will run them locally (my bad

[GitHub] [lucene] JarvisCraft commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549837524 I've fixed the issues and applied `./gradlew tidy`. As for the backporting burden, I, unfortunately, lack the expertise in providing Lucene backports, so I guess that I'd bette

[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549844375 The whole `ByteBufferGuard` does not help too much to prevent code from failing and crushing the VM! But yes, it's a good idea to enforce more. I played with that in the past: `

[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549857185 Sorry, I'd suggest to not touch `ByteBufferGuard` at all, it did its job to throw AlreadyClosedException instead of segmentation fault consistently until memory access is compiled

[GitHub] [lucene] JarvisCraft commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549864252 > Sorry, > I'd suggest to not touch ByteBufferGuard at all, No problems! Was glad to hear your review on this change) -- This is an automated message from the Apache Git S

[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549864245 I already assigned the Lucene 10 milestone, not lucene 9.x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549871892 Could you apply the same change to `IndexReader`; it should be sealed, too? https://github.com/apache/lucene/blob/f53eb28af053d7612f7e4d1b2de05d33dc410645/lucene/core/src/java/org/apac

[GitHub] [lucene] dweiss commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
dweiss commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549878245 Had to go back and re-read what all this was about in the beginning. I like the patch (nice API) but sort of agree with Uwe - the current solution is flawed (but working 99% of the time).

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549878949 Please remove the protected from consructor. We don't want to show it in javadocs. -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195330910 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,26 @@ * A struct like class that represents a hierarchical relationship

[GitHub] [lucene] sherman commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit

2023-05-16 Thread via GitHub
sherman commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549883140 I think it's a good idea to replace this tricky way to flush any CPU caches, such as using lazy set, with an explicit full fence. The purpose of a full fence is to provide a reliable mem

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549883736 Please don't change public javadocs, so just make the classes sealed and keep documentation the same. Only remove runtime check. -- This is an automated message from the Apache Git

[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReaderContext`

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549897649 Ready: * brought back package-private visibility; * applied the same change to `IndexReader. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReader` and `IndexReaderContext`

2023-05-16 Thread via GitHub
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549969315 > Do you want to add a CHANGES.txt entry Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [lucene] JerryChin opened a new pull request, #12299: GITHUB-12291: Skip blank lines from stopwords list.

2023-05-16 Thread via GitHub
JerryChin opened a new pull request, #12299: URL: https://github.com/apache/lucene/pull/12299 ### Description Hi team, This PR fixes #12291, it will skip any blank lines when loading stopwords with `WordlistLoader#getWordSet`. -- This is an automated message from the Apa

[GitHub] [lucene] gus-asf opened a new issue, #12300: Clearer error message if Codec loading fails under a Security Manager

2023-05-16 Thread via GitHub
gus-asf opened a new issue, #12300: URL: https://github.com/apache/lucene/issues/12300 ### Description I just spent quite a while tracking down a problem wherein a SecurityManager was silently ignoring the META-INF/services/org.apache.lucene.codecs.Codec file. We should enhance the m

[GitHub] [lucene] gus-asf opened a new pull request, #12301: Improve error message if codec not found Fixes #12300

2023-05-16 Thread via GitHub
gus-asf opened a new pull request, #12301: URL: https://github.com/apache/lucene/pull/12301 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

[GitHub] [lucene] rmuir opened a new issue, #12302: vector API integration, plan B

2023-05-16 Thread via GitHub
rmuir opened a new issue, #12302: URL: https://github.com/apache/lucene/issues/12302 ### Description For years we have explored using the vector api to actually take advantage of SIMD units on the hardware. A couple of approaches have been attempted so far: 1. try to coerce the ho

[GitHub] [lucene] rmuir commented on issue #12302: vector API integration, plan B

2023-05-16 Thread via GitHub
rmuir commented on issue #12302: URL: https://github.com/apache/lucene/issues/12302#issuecomment-1550523577 it goes without saying, but i think there are a couple obvious hotspots where we'd want to integrate. I realize containing the complexity may be difficult: 1. the `VectorUtil`

[GitHub] [lucene] benwtrent commented on pull request #12255: allocate one NeighborQueue per search for results

2023-05-16 Thread via GitHub
benwtrent commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550531087 @jbellis would this change effect query per second? There is a 20% drop since this commit in QPS in Lucene bench nightlies: https://home.apache.org/~mikemccand/lucenebench/2023.0

[GitHub] [lucene] jbellis commented on pull request #12255: allocate one NeighborQueue per search for results

2023-05-16 Thread via GitHub
jbellis commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550610764 The testing I performed pre-merge showed an improvement, but I'm happy to look closer. Is there a way to get a graph of performance over time out of that tool or is it manually pasting

[GitHub] [lucene] jbellis commented on pull request #12255: allocate one NeighborQueue per search for results

2023-05-16 Thread via GitHub
jbellis commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550623733 Also, is it actually *this* commit, or is it the HashMap commit 3c163745bb07aed51b52878de0666f1405696147 ? -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [lucene] uschindler commented on issue #12302: vector API integration, plan B

2023-05-16 Thread via GitHub
uschindler commented on issue #12302: URL: https://github.com/apache/lucene/issues/12302#issuecomment-1550830310 Hi @rmuir, I agree with your proposals. #12219 looks like a first start: Making the implementations "pluggable" it allows users to have a custom implementation as separate jar

[GitHub] [lucene] uschindler commented on a diff in pull request #12301: Improve error message if codec not found Fixes #12300

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12301: URL: https://github.com/apache/lucene/pull/12301#discussion_r1196017521 ## lucene/core/src/java/org/apache/lucene/util/NamedSPILoader.java: ## @@ -110,7 +112,14 @@ public S lookup(String name) { + "' does not exist."

[GitHub] [lucene] uschindler commented on a diff in pull request #12299: GITHUB-12291: Skip blank lines from stopwords list.

2023-05-16 Thread via GitHub
uschindler commented on code in PR #12299: URL: https://github.com/apache/lucene/pull/12299#discussion_r1196018225 ## lucene/core/src/java/org/apache/lucene/analysis/WordlistLoader.java: ## @@ -117,7 +120,10 @@ public static CharArraySet getWordSet(Reader reader, String comment

[GitHub] [lucene] uschindler merged pull request #12296: Seal `IndexReader` and `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler merged PR #12296: URL: https://github.com/apache/lucene/pull/12296 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.

[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReader` and `IndexReaderContext`

2023-05-16 Thread via GitHub
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1550842694 Thanks @JarvisCraft, I merged it to main branch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible

2023-05-16 Thread via GitHub
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1550852181 I tend to leave out the `equals()` methods. Many people in Lucene prefer to have a "step-by-step equals", as it is easier to read. The general problem is that we add "modern" eq