[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12246: Set word2vec getSynonyms method synchronized
alessandrobenedetti commented on code in PR #12246: URL: https://github.com/apache/lucene/pull/12246#discussion_r1194809968 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/word2vec/Word2VecSynonymProvider.java: ## @@ -85,7 +86,7 @@ public List getSynonyms( SIMILARITY_FUNCTION, hnswGraph, null, - word2VecModel.size()); + NO_LIMIT_ON_VISITED_NODES); Review Comment: Thanks @zhaih ! I would suggest you close this PR as soon as the other two are merged. The one about concurrent problems, to be honest I would not wait further and proceed merging, for https://github.com/apache/lucene/pull/12235 I am not sure if you need some additional review, let me know! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
alessandrobenedetti commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1194958670 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/FloatVectorFieldSource.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import org.apache.lucene.index.FloatVectorValues; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.search.DocIdSetIterator; + +public class FloatVectorFieldSource extends ValueSource { + private final String fieldName; + + public FloatVectorFieldSource(String fieldName) { +this.fieldName = fieldName; + } + + @Override + public FunctionValues getValues(Map context, LeafReaderContext readerContext) + throws IOException { + +final FloatVectorValues vectorValues = readerContext.reader().getFloatVectorValues(fieldName); +return new VectorFieldFunction(this) { + float[] defaultVector = null; Review Comment: to discuss ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ByteVectorFieldSource.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import org.apache.lucene.index.ByteVectorValues; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.search.DocIdSetIterator; + +public class ByteVectorFieldSource extends ValueSource { + private final String fieldName; + + public ByteVectorFieldSource(String fieldName) { +this.fieldName = fieldName; + } + + @Override + public FunctionValues getValues(Map context, LeafReaderContext readerContext) + throws IOException { + +final ByteVectorValues vectorValues = readerContext.reader().getByteVectorValues(fieldName); +return new VectorFieldFunction(this) { + byte[] defaultVector = null; Review Comment: to discuss ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/VectorFieldFunction.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import org.apache.lucene.queries.function.Funct
[GitHub] [lucene] rmuir merged pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java
rmuir merged PR #12298: URL: https://github.com/apache/lucene/pull/12298 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java
rmuir commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549465353 The limit can only be removed in `main` branch as there are still some recursive algorithms in `branch_9x`. More invasive/heavier automaton changes have happened in 10.x, I think it is ok, we are still moving forwards. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] tang-hi commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java
tang-hi commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549551695 thanks @rmuir -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] tang-hi commented on issue #12284: input automaton is too large: 1001 in Operations.topoSortStatesRecurse(Operations.java:1357)
tang-hi commented on issue #12284: URL: https://github.com/apache/lucene/issues/12284#issuecomment-1549552433 it should be solved in PR #12286 😆 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549737196 Hi this is a good idea for Lucene 10.0 (minimum is Java 17). This can't be backported to Java 11, so won't be on Lucene 9.x. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195218209 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship between {@link IndexReader} * instances. */ -public abstract class IndexReaderContext { +public abstract sealed class IndexReaderContext permits CompositeReaderContext, LeafReaderContext { /** The reader context for this reader's immediate parent, or null if none */ public final CompositeReaderContext parent; /** - * true if this context struct represents the top level reader within the + * {@code true} if this context struct represents the top level reader within the * hierarchical context */ public final boolean isTopLevel; - /** the doc base for this reader in the parent, 0 if parent is null */ + /** the doc base for this reader in the parent, {@code 0} if parent is null */ public final int docBaseInParent; - /** the ord for this reader in the parent, 0 if parent is null */ + /** the ord for this reader in the parent, {@code 0} if parent is null */ public final int ordInParent; // An object that uniquely identifies this context without referencing // segments. The goal is to make it fine to have references to this // identity object, even after the index reader has been closed - final Object identity = new Object(); + protected final Object identity = new Object(); Review Comment: I think this is an unrelated change. Do we need to make it protected? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195220727 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship between {@link IndexReader} * instances. */ -public abstract class IndexReaderContext { +public abstract sealed class IndexReaderContext permits CompositeReaderContext, LeafReaderContext { /** The reader context for this reader's immediate parent, or null if none */ public final CompositeReaderContext parent; /** - * true if this context struct represents the top level reader within the + * {@code true} if this context struct represents the top level reader within the * hierarchical context */ public final boolean isTopLevel; - /** the doc base for this reader in the parent, 0 if parent is null */ + /** the doc base for this reader in the parent, {@code 0} if parent is null */ public final int docBaseInParent; - /** the ord for this reader in the parent, 0 if parent is null */ + /** the ord for this reader in the parent, {@code 0} if parent is null */ public final int ordInParent; // An object that uniquely identifies this context without referencing // segments. The goal is to make it fine to have references to this // identity object, even after the index reader has been closed - final Object identity = new Object(); + protected final Object identity = new Object(); Review Comment: Ah you mentioned it. I don't remember why this `identity` is there. Is it really needed to be public. I'd keep this as is. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549747674 Please run `gradlew tidy`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549756344 I am fine with this, unless it makes backports harder. So basically I use this for *new* code in new 10.x features only, but I'd like to prevent this from happing in code that is possibly backported. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549769045 The precommit checks already answer most of your questions: Only use that feature for now if it is easy to apply. :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12290: Make memory fence in `ByteBufferGuard` explicit
uschindler commented on code in PR #12290: URL: https://github.com/apache/lucene/pull/12290#discussion_r1195248137 ## lucene/core/src/java/org/apache/lucene/store/ByteBufferGuard.java: ## @@ -39,7 +40,7 @@ final class ByteBufferGuard { * this to allow unmapping of bytebuffers with private Java APIs. */ @FunctionalInterface - static interface BufferCleaner { + interface BufferCleaner { Review Comment: be explicit here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
alessandrobenedetti commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1195251817 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ByteVectorFieldSource.java: ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import org.apache.lucene.index.ByteVectorValues; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.search.DocIdSetIterator; + +public class ByteVectorFieldSource extends ValueSource { + private final String fieldName; + + public ByteVectorFieldSource(String fieldName) { +this.fieldName = fieldName; + } + + @Override + public FunctionValues getValues(Map context, LeafReaderContext readerContext) + throws IOException { + +final ByteVectorValues vectorValues = readerContext.reader().getByteVectorValues(fieldName); +return new VectorFieldFunction(this) { + byte[] defaultVector = null; Review Comment: after the discussion, possibly NaN is a better option -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549785157 I like it, but actually this class should go away at some point! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mikemccand commented on pull request #12298: move max recursion from Operations.java to AutomatonTestUtil.java
mikemccand commented on PR #12298: URL: https://github.com/apache/lucene/pull/12298#issuecomment-1549796778 Thank you @tang-hi! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on a diff in pull request #12290: Make memory fence in `ByteBufferGuard` explicit
JarvisCraft commented on code in PR #12290: URL: https://github.com/apache/lucene/pull/12290#discussion_r1195269692 ## lucene/core/src/java/org/apache/lucene/store/ByteBufferGuard.java: ## @@ -39,7 +40,7 @@ final class ByteBufferGuard { * this to allow unmapping of bytebuffers with private Java APIs. */ @FunctionalInterface - static interface BufferCleaner { + interface BufferCleaner { Review Comment: Rolled back, although I feel like `static` does not add to expliicitness here considering that interfaces never can be non-static; also, most code seems to omit `static` on inner `interface`s: 23 without it VS 4 with it (including this one). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
JarvisCraft commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549806649 > but actually this class should go away at some point! Yup, but as for now it seems more safe to rely on well-defined behaviour (until this class is gone). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on a diff in pull request #12296: Seal `IndexReaderContext`
JarvisCraft commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195281538 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,25 @@ * A struct like class that represents a hierarchical relationship between {@link IndexReader} * instances. */ -public abstract class IndexReaderContext { +public abstract sealed class IndexReaderContext permits CompositeReaderContext, LeafReaderContext { /** The reader context for this reader's immediate parent, or null if none */ public final CompositeReaderContext parent; /** - * true if this context struct represents the top level reader within the + * {@code true} if this context struct represents the top level reader within the * hierarchical context */ public final boolean isTopLevel; - /** the doc base for this reader in the parent, 0 if parent is null */ + /** the doc base for this reader in the parent, {@code 0} if parent is null */ public final int docBaseInParent; - /** the ord for this reader in the parent, 0 if parent is null */ + /** the ord for this reader in the parent, {@code 0} if parent is null */ public final int ordInParent; // An object that uniquely identifies this context without referencing // segments. The goal is to make it fine to have references to this // identity object, even after the index reader has been closed - final Object identity = new Object(); + protected final Object identity = new Object(); Review Comment: Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReaderContext`
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549819407 > Please run gradlew tidy. All done :+1: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12295: Use `instanceof` pattern-matching where possible
JarvisCraft commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549821494 > The precommit checks already answer most of your questions: Only use that feature for now if it is easy to apply. :-) Fair point! As for now, I will run them locally (my bad here) and update the code accordingly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12295: Use `instanceof` pattern-matching where possible
JarvisCraft commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549837524 I've fixed the issues and applied `./gradlew tidy`. As for the backporting burden, I, unfortunately, lack the expertise in providing Lucene backports, so I guess that I'd better leave it up to you to decide which of these are okay to be applied and which ones are in "hot code" (although I possitively expect this specific lines to be rarely visited). Considering that this changes also require Java version higher than 11, this seems to be a Lucene 10 change, so I guess there is plenty of room for discussion here whether specific files should be changed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549844375 The whole `ByteBufferGuard` does not help too much to prevent code from failing and crushing the VM! But yes, it's a good idea to enforce more. I played with that in the past: `ByteBufferGuard` never helped much, a `full fence` neither. This class was bullshit from the beginning as it only delayed the crushing of JVM a bit more on misuse. So it should never have existed... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
uschindler commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549857185 Sorry, I'd suggest to not touch `ByteBufferGuard` at all, it did its job to throw AlreadyClosedException instead of segmentation fault consistently until memory access is compiled by Hotspot. Don't touch it. W have to live with it only until Project Panama is in mainline JDKs. Unfortunatley as of today, not even 21 will ship with non-preview Panama, so tricky compilation to prevent preview warnings in #12294 is needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
JarvisCraft commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549864252 > Sorry, > I'd suggest to not touch ByteBufferGuard at all, No problems! Was glad to hear your review on this change) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1549864245 I already assigned the Lucene 10 milestone, not lucene 9.x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549871892 Could you apply the same change to `IndexReader`; it should be sealed, too? https://github.com/apache/lucene/blob/f53eb28af053d7612f7e4d1b2de05d33dc410645/lucene/core/src/java/org/apache/lucene/index/IndexReader.java#L72-L76 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dweiss commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
dweiss commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549878245 Had to go back and re-read what all this was about in the beginning. I like the patch (nice API) but sort of agree with Uwe - the current solution is flawed (but working 99% of the time). The patch wouldn't change it to work correctly but it has the potential of having unknown side effects... Let's just keep the existing solution until a truly sound one is available. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549878949 Please remove the protected from consructor. We don't want to show it in javadocs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12296: Seal `IndexReaderContext`
uschindler commented on code in PR #12296: URL: https://github.com/apache/lucene/pull/12296#discussion_r1195330910 ## lucene/core/src/java/org/apache/lucene/index/IndexReaderContext.java: ## @@ -22,27 +22,26 @@ * A struct like class that represents a hierarchical relationship between {@link IndexReader} * instances. */ -public abstract class IndexReaderContext { +public abstract sealed class IndexReaderContext permits CompositeReaderContext, LeafReaderContext { /** The reader context for this reader's immediate parent, or null if none */ public final CompositeReaderContext parent; /** - * true if this context struct represents the top level reader within the - * hierarchical context + * {@code true} if this context struct represents the top level reader within the hierarchical + * context */ public final boolean isTopLevel; - /** the doc base for this reader in the parent, 0 if parent is null */ + /** the doc base for this reader in the parent, {@code 0} if parent is null */ public final int docBaseInParent; - /** the ord for this reader in the parent, 0 if parent is null */ + /** the ord for this reader in the parent, {@code 0} if parent is null */ public final int ordInParent; // An object that uniquely identifies this context without referencing // segments. The goal is to make it fine to have references to this // identity object, even after the index reader has been closed final Object identity = new Object(); - IndexReaderContext(CompositeReaderContext parent, int ordInParent, int docBaseInParent) { -if (!(this instanceof CompositeReaderContext || this instanceof LeafReaderContext)) - throw new Error("This class should never be extended by custom code!"); + protected IndexReaderContext( Review Comment: Please make it package protected again, as we don't want it to appear in Javadocs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] sherman commented on pull request #12290: Make memory fence in `ByteBufferGuard` explicit
sherman commented on PR #12290: URL: https://github.com/apache/lucene/pull/12290#issuecomment-1549883140 I think it's a good idea to replace this tricky way to flush any CPU caches, such as using lazy set, with an explicit full fence. The purpose of a full fence is to provide a reliable memory barrier that works consistently across different architectures. But if you're quite certain that the days of this class are numbered, then it's fine. (By the way, we are still planning to test it in our implementation of the directory interface). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549883736 Please don't change public javadocs, so just make the classes sealed and keep documentation the same. Only remove runtime check. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReaderContext`
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549897649 Ready: * brought back package-private visibility; * applied the same change to `IndexReader. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JarvisCraft commented on pull request #12296: Seal `IndexReader` and `IndexReaderContext`
JarvisCraft commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1549969315 > Do you want to add a CHANGES.txt entry Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] JerryChin opened a new pull request, #12299: GITHUB-12291: Skip blank lines from stopwords list.
JerryChin opened a new pull request, #12299: URL: https://github.com/apache/lucene/pull/12299 ### Description Hi team, This PR fixes #12291, it will skip any blank lines when loading stopwords with `WordlistLoader#getWordSet`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gus-asf opened a new issue, #12300: Clearer error message if Codec loading fails under a Security Manager
gus-asf opened a new issue, #12300: URL: https://github.com/apache/lucene/issues/12300 ### Description I just spent quite a while tracking down a problem wherein a SecurityManager was silently ignoring the META-INF/services/org.apache.lucene.codecs.Codec file. We should enhance the message in the event that a security manager is detected to alert the user to this possibility and the potentially surprising detail that no SecurityException is thrown by java when it fails to load a file via the classpath. Even more confusingly, java can load classes and then silently fail to load files from the same jar file that the class came from, so alerting the user to this complex situation seems worthwhile. (PR soon) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gus-asf opened a new pull request, #12301: Improve error message if codec not found Fixes #12300
gus-asf opened a new pull request, #12301: URL: https://github.com/apache/lucene/pull/12301 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir opened a new issue, #12302: vector API integration, plan B
rmuir opened a new issue, #12302: URL: https://github.com/apache/lucene/issues/12302 ### Description For years we have explored using the vector api to actually take advantage of SIMD units on the hardware. A couple of approaches have been attempted so far: 1. try to coerce the hotspot superword autovectorization into giving better results: I think @jpountz may know details of this and it currently is limited to postings list decode, attempting to use 64-bit long. It is better than nothing, but not good enough. especially for stuff like vectors encoding and other bottlenecks. 2. vectorize code and hope that the openjdk feature will "graduate" soon. This is not happening. Java is dying and becoming the next COBOL, in my opinion, as a result of ignoring the problem and delaying until "perfection". For evidence, simply look at vector api being in "6th incubation" with really no api changes happening, just waiting on other features (some wet dream project valhalla or whatever) which will prolly never land before we retire. 3. hackedy-häcks: this is stuff to bypass the problem, such as prototype i wrote in frustration *two years ago* here: https://github.com/apache/lucene/pull/18 . It is dirty and insecure but it demonstrates we can potentially make stuff easier on the user and take advantage of the hardware. Currently, unless the user is extremely technical, they can't make use of the vector support their hardware has, which is terribly sad. If they have immense resources/funding/etc, they can fork lucene and patch the source code, and maintain a fork, hooking in incubator openjdk stuff, but that's too hard on users. I think we have to draw a line in the sand, basically we can not rely upon openjdk to be managed as a performant project, their decisions make no sense, we have to play a little less nicer and apply some hacks! otherwise give up and switch to a different programming language with better perf! So I'd suggest to look at the work @uschindler has done with mmap and the preview apis, and let's carve out a path where we use the vector api *IFF* the user opts in via the command-line. Proposal (depends entirely upon user's jdk version and supported flags): 1. user does nothing and runs lucene without special flags: they get a warning message logged to the console (once!) telling them they need to add some stuff to the commandline for best performance. something such as "vector falling back to scalar implementation: please add "--add-modules .x.y.z" 2. user supplies that command-line argument and lucene is faster and uses correct incubating vector api associated with their jdk version. Actually the system @uschindler developed I think is the correct design for this, the only trick is that the incubating api is more difficult than the preview api. So we need more build system support, it could require more stuff to be downloaded or build to be slower. But I think its the right decision? We don't want to have base64-encoded hackiness that is hard to maintain, at the same time, we need to give the users option to opt-in to actually making use their hardware. I think we should suffer the complexity to make this easy on them. It fucking sucks that openjdk makes this almost impossible, but we need to do it for our users. That's what being a library is all about. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on issue #12302: vector API integration, plan B
rmuir commented on issue #12302: URL: https://github.com/apache/lucene/issues/12302#issuecomment-1550523577 it goes without saying, but i think there are a couple obvious hotspots where we'd want to integrate. I realize containing the complexity may be difficult: 1. the `VectorUtil` methods (dotProduct etc). These are kinda simplest and contained and the obvious way to prove the approach. 2. the postings list FOR/PFOR decode. 3. possibly the docvalues decode. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] benwtrent commented on pull request #12255: allocate one NeighborQueue per search for results
benwtrent commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550531087 @jbellis would this change effect query per second? There is a 20% drop since this commit in QPS in Lucene bench nightlies: https://home.apache.org/~mikemccand/lucenebench/2023.05.08.18.03.38.html Looking at the other searches, none slow down this significantly, so it doesn't seem to be the lucene util change. Did your testing for recall changes show any performance differences on single threaded QPS? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jbellis commented on pull request #12255: allocate one NeighborQueue per search for results
jbellis commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550610764 The testing I performed pre-merge showed an improvement, but I'm happy to look closer. Is there a way to get a graph of performance over time out of that tool or is it manually pasting things into a spreadsheet? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jbellis commented on pull request #12255: allocate one NeighborQueue per search for results
jbellis commented on PR #12255: URL: https://github.com/apache/lucene/pull/12255#issuecomment-1550623733 Also, is it actually *this* commit, or is it the HashMap commit 3c163745bb07aed51b52878de0666f1405696147 ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on issue #12302: vector API integration, plan B
uschindler commented on issue #12302: URL: https://github.com/apache/lucene/issues/12302#issuecomment-1550830310 Hi @rmuir, I agree with your proposals. #12219 looks like a first start: Making the implementations "pluggable" it allows users to have a custom implementation as separate jar file / module. This could also be a solution to add an implementation jar for each java version as a separate Lucene module. For compiling them, we can use exactly the same aproach like for MMapDirectory. Of course we can also include the implementation classes into the main JAR file. About the compilation: - yes, we can extract APIJAR stubs for the verctor API. I would use separate APIJAR stubs for this, especially as it is different java modules. The reason is how the compilation with module pathcing works in my code. - actually compiling might be simpler with the APIJAR stubs for incubator modules: They live in a private module hidden by default in the jdk. So the signatures are not visible unless you explicitely add the module to compiler. So here we can use a very simple approach: Just compile the vectorclasses against incubator classes without any module-patching, just add the APIJAR as a normal dependency. As the package names are unique there's no problem! This is the main reason why the APIJAR files need to be separate from the foreign ones. - We can use the same approach like for MMAP at runtime: Use the MR-JAR mechanism to separate the classes and hide them from stupid IDEs (actually the current approach would not need a MR-JAR at all, it is just used to "hide" the Java 19, 20, 21 classes from stupid IDEs). Of course user has to pass the `--add-module` explicitely. I'd not add the module to module-info as this would cause warning for ALL endusers. - at a later stage when the vector API goes into preview phase, we can merge all this with the current MMAP code. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12301: Improve error message if codec not found Fixes #12300
uschindler commented on code in PR #12301: URL: https://github.com/apache/lucene/pull/12301#discussion_r1196017521 ## lucene/core/src/java/org/apache/lucene/util/NamedSPILoader.java: ## @@ -110,7 +112,14 @@ public S lookup(String name) { + "' does not exist." + " You need to add the corresponding JAR file supporting this SPI to your classpath." + " The current classpath supports the following names: " -+ availableServices()); ++ availableServices() ++ ((System.getSecurityManager() == null) +? "" +: " We have detected that a security manager is installed so it is also possible " ++ "that the services file in the jar containing the codec is inaccessible under the current " ++ "security policy. A FilePermission implying 'read' access to the " ++ "jar containing the META-INF/services/org.apache.lucene.codec.Codec file is necessary. " Review Comment: Please don't mention "Codec" here, as the NamedSPILoader is used for many more than codecs. Please be generic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on a diff in pull request #12299: GITHUB-12291: Skip blank lines from stopwords list.
uschindler commented on code in PR #12299: URL: https://github.com/apache/lucene/pull/12299#discussion_r1196018225 ## lucene/core/src/java/org/apache/lucene/analysis/WordlistLoader.java: ## @@ -117,7 +120,10 @@ public static CharArraySet getWordSet(Reader reader, String comment, CharArraySe String word = null; while ((word = br.readLine()) != null) { if (word.startsWith(comment) == false) { - result.add(word.trim()); + word = word.trim(); + // skip blank lines + if (word.length() == 0) continue; Review Comment: Use `word.isEmpty()`. ## lucene/core/src/java/org/apache/lucene/analysis/WordlistLoader.java: ## @@ -53,7 +53,10 @@ public static CharArraySet getWordSet(Reader reader, CharArraySet result) throws try (BufferedReader br = getBufferedReader(reader)) { String word = null; while ((word = br.readLine()) != null) { -result.add(word.trim()); +word = word.trim(); +// skip blank lines +if (word.length() == 0) continue; Review Comment: Use `word.isEmpty()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler merged pull request #12296: Seal `IndexReader` and `IndexReaderContext`
uschindler merged PR #12296: URL: https://github.com/apache/lucene/pull/12296 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12296: Seal `IndexReader` and `IndexReaderContext`
uschindler commented on PR #12296: URL: https://github.com/apache/lucene/pull/12296#issuecomment-1550842694 Thanks @JarvisCraft, I merged it to main branch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12295: Use `instanceof` pattern-matching where possible
uschindler commented on PR #12295: URL: https://github.com/apache/lucene/pull/12295#issuecomment-1550852181 I tend to leave out the `equals()` methods. Many people in Lucene prefer to have a "step-by-step equals", as it is easier to read. The general problem is that we add "modern" equals and hashCode methods that do not use instanceof but compare classes directly. The examples you changed are old-style instanceof versions, which are buggy in class hiererachies (unless it is a final class). So I tend to change the `equals` methods with `instanceof` to follow the more modern approach consistently. This would be a separate PR. The problem of all those methods is that they were created by some IDE without human intervention (and all IDEs use different patterns, although modern ones use class comparisons instead of instanceof). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org