[GitHub] [lucene] jpountz commented on a diff in pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
jpountz commented on code in PR #11998: URL: https://github.com/apache/lucene/pull/11998#discussion_r1043042489 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseStoredFieldsFormatTestCase.java: ## @@ -813,6 +837,7 @@ public void testBulkMergeWithDeletes() throws IOException { } /** mix up field numbers, merge, and check that data is correct */ + @AwaitsFix(bugUrl = "WTF with this test") Review Comment: What is the problem? I'm not too familiar with it but it seems to test that merging correctly de-optimizes bulk merges when field numbers are not aligned, which makes sense? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] romseygeek merged pull request #11999: Add support for stored fields to MemoryIndex
romseygeek merged PR #11999: URL: https://github.com/apache/lucene/pull/11999
[GitHub] [lucene] jpountz commented on a diff in pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
jpountz commented on code in PR #11998: URL: https://github.com/apache/lucene/pull/11998#discussion_r1043183580 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseStoredFieldsFormatTestCase.java: ## @@ -813,6 +837,7 @@ public void testBulkMergeWithDeletes() throws IOException { } /** mix up field numbers, merge, and check that data is correct */ + @AwaitsFix(bugUrl = "WTF with this test") Review Comment: I pushed a fix for this test.
[GitHub] [lucene] rmuir commented on a diff in pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
rmuir commented on code in PR #11998: URL: https://github.com/apache/lucene/pull/11998#discussion_r1043343548 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseStoredFieldsFormatTestCase.java: ## @@ -813,6 +837,7 @@ public void testBulkMergeWithDeletes() throws IOException { } /** mix up field numbers, merge, and check that data is correct */ + @AwaitsFix(bugUrl = "WTF with this test") Review Comment: thank you!
[GitHub] [lucene] gf2121 commented on pull request #11880: Use ByteArrayComparator to replace Arrays#compareUnsigned in some other places
gf2121 commented on PR #11880: URL: https://github.com/apache/lucene/pull/11880#issuecomment-1342933516 @jpountz Sorry for being late, and thank you for the review! I'll merge this.
[GitHub] [lucene] gf2121 merged pull request #11880: Use ByteArrayComparator to replace Arrays#compareUnsigned in some other places
gf2121 merged PR #11880: URL: https://github.com/apache/lucene/pull/11880
[GitHub] [lucene] gf2121 opened a new pull request, #12001: Use ByteArrayComparator to replace Arrays#compareUnsigned in some other places (Backport 9x)
gf2121 opened a new pull request, #12001: URL: https://github.com/apache/lucene/pull/12001 Backport of https://github.com/apache/lucene/pull/11880
[GitHub] [lucene] gf2121 merged pull request #12001: Use ByteArrayComparator to replace Arrays#compareUnsigned in some other places (Backport 9x)
gf2121 merged PR #12001: URL: https://github.com/apache/lucene/pull/12001
[GitHub] [lucene] jpountz commented on pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
jpountz commented on PR #11998: URL: https://github.com/apache/lucene/pull/11998#issuecomment-1343043440 I think I have pushed changes for most call sites. The main remaining ones are in lucene/highlighter, which requires a few more changes.
[GitHub] [lucene] rmuir commented on a diff in pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
rmuir commented on code in PR #11998: URL: https://github.com/apache/lucene/pull/11998#discussion_r1043614087 ## lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java: ## @@ -268,6 +270,12 @@ public final class MoreLikeThis { /** IndexReader to use */ private final IndexReader ir; + /** Stored fields for {@code ir}. */ + private final StoredFields storedFields; Review Comment: Hmm, should we keep these as instance members? `MoreLikeThis` was thread-safe before, I think, and this would now make it unsafe. Maybe we should just create these in the method that actually needs to pull from stored fields?
[GitHub] [lucene] jpountz commented on a diff in pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader
jpountz commented on code in PR #11998: URL: https://github.com/apache/lucene/pull/11998#discussion_r1043616431 ## lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java: ## @@ -268,6 +270,12 @@ public final class MoreLikeThis { /** IndexReader to use */ private final IndexReader ir; + /** Stored fields for {@code ir}. */ + private final StoredFields storedFields; Review Comment: Good call, thanks for catching this.
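The thread-safety concern raised in this review thread can be illustrated with a minimal sketch. It does not use Lucene's classes: `SketchReader`, `StoredFieldsView`, and `MoreLikeThisSketch` are hypothetical stand-ins, and the assumption that the accessor carries mutable per-caller state is made only for illustration. The shared object keeps just the reader and obtains a fresh accessor inside the method that needs it, instead of caching one in an instance field.

```java
import java.util.Map;

/** Hypothetical stand-in for a reader that hands out per-caller accessors. */
final class SketchReader {
  private final Map<Integer, String> docs;

  SketchReader(Map<Integer, String> docs) {
    this.docs = docs;
  }

  /** Returns a new accessor; each instance is assumed to be used by one thread only. */
  StoredFieldsView storedFields() {
    return new StoredFieldsView(docs);
  }
}

/** Hypothetical accessor with mutable internal state, so it must not be shared. */
final class StoredFieldsView {
  private final Map<Integer, String> docs;
  private int lastDocId = -1; // mutable scratch state: the reason sharing is unsafe

  StoredFieldsView(Map<Integer, String> docs) {
    this.docs = docs;
  }

  String document(int docId) {
    lastDocId = docId;
    return docs.get(lastDocId);
  }
}

/** Holds only the reader; the accessor is created per call and never escapes. */
final class MoreLikeThisSketch {
  private final SketchReader reader;

  MoreLikeThisSketch(SketchReader reader) {
    this.reader = reader;
  }

  String textOf(int docId) {
    StoredFieldsView storedFields = reader.storedFields(); // local to this call
    return storedFields.document(docId);
  }
}
```

Because `textOf` never stores the accessor in a field, two threads calling it on the same `MoreLikeThisSketch` instance each work with their own `StoredFieldsView`.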
[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11958: GITHUB-11868: Add FilterIndexInput and FilterIndexOutput wrapper classes
mdmarshmallow commented on code in PR #11958: URL: https://github.com/apache/lucene/pull/11958#discussion_r1043767490 ## lucene/core/src/test/org/apache/lucene/index/TestFilterIndexInput.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.lucene.index; + +import java.io.IOException; +import java.lang.reflect.Method; +import java.util.HashSet; +import java.util.Random; +import java.util.Set; +import org.apache.lucene.store.DataInput; +import org.apache.lucene.store.Directory; +import org.apache.lucene.store.FSDirectory; +import org.apache.lucene.store.FilterIndexInput; +import org.apache.lucene.store.IOContext; +import org.apache.lucene.store.IndexInput; +import org.apache.lucene.store.IndexOutput; +import org.junit.Test; + +public class TestFilterIndexInput extends TestIndexInput { + + @Override + public IndexInput getIndexInput(long len) { +return new FilterIndexInput("wrapped foo", new InterceptingIndexInput("foo", len)); + } + + public void testRawFilterIndexInputRead() throws IOException { +for (int i = 0; i < 10; i++) { + Random random = random(); + final Directory dir = newDirectory(); + IndexOutput os = dir.createOutput("foo", newIOContext(random)); + os.writeBytes(READ_TEST_BYTES, READ_TEST_BYTES.length); + os.close(); + IndexInput is = + new FilterIndexInput("wrapped foo", dir.openInput("foo", newIOContext(random))); + checkReads(is, IOException.class); + checkSeeksAndSkips(is, random); + is.close(); + + os = dir.createOutput("bar", newIOContext(random)); + os.writeBytes(RANDOM_TEST_BYTES, RANDOM_TEST_BYTES.length); + os.close(); + is = new FilterIndexInput("wrapped bar", dir.openInput("bar", newIOContext(random))); + checkRandomReads(is); + checkSeeksAndSkips(is, random); + is.close(); + dir.close(); +} + } + + @Test + public void testOverrides() throws Exception { +// verify that all abstract methods of IndexInput/DataInput are overridden by FilterDirectory, +// except those under the 'exclude' list +Set exclude = new HashSet<>(); + +exclude.add(IndexInput.class.getMethod("toString")); +exclude.add(IndexInput.class.getMethod("skipBytes", long.class)); +exclude.add(IndexInput.class.getDeclaredMethod("getFullSliceDescription", String.class)); 
+exclude.add(IndexInput.class.getMethod("randomAccessSlice", long.class, long.class)); + +exclude.add( +DataInput.class.getMethod("readBytes", byte[].class, int.class, int.class, boolean.class)); +exclude.add(DataInput.class.getMethod("readShort")); +exclude.add(DataInput.class.getMethod("readInt")); +exclude.add(DataInput.class.getMethod("readVInt")); +exclude.add(DataInput.class.getMethod("readZInt")); +exclude.add(DataInput.class.getMethod("readLong")); +exclude.add(DataInput.class.getMethod("readLongs", long[].class, int.class, int.class)); +exclude.add(DataInput.class.getMethod("readInts", int[].class, int.class, int.class)); +exclude.add(DataInput.class.getMethod("readFloats", float[].class, int.class, int.class)); +exclude.add(DataInput.class.getMethod("readVLong")); +exclude.add(DataInput.class.getMethod("readZLong")); +exclude.add(DataInput.class.getMethod("readString")); +exclude.add(DataInput.class.getMethod("readMapOfStrings")); +exclude.add(DataInput.class.getMethod("readSetOfStrings")); Review Comment: Yeah, going back to your other comment, you said that we should only override abstract methods, so I changed this test to just check that all methods overridden are not abstract (the compiler would force all abstract methods to be overridden so there is no point in testing that).
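The reflection-based check with an exclude list that the quoted test describes can be sketched roughly as follows. This is not the code from the PR: `BaseInput`, `FilterInput`, and `OverrideCheck` are hypothetical names, and the exclude list is keyed by method name only to keep the example short. The helper reports which overridable methods of a base class a wrapper class does not redeclare.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical base class standing in for an input abstraction. */
abstract class BaseInput {
  abstract int readByte();

  int readShort() { // concrete helper built on the abstract primitive
    return (readByte() << 8) | readByte();
  }

  void close() {}
}

/** Hypothetical wrapper that forwards calls to a delegate. */
class FilterInput extends BaseInput {
  protected final BaseInput in;

  FilterInput(BaseInput in) {
    this.in = in;
  }

  @Override
  int readByte() {
    return in.readByte();
  }

  @Override
  void close() {
    in.close();
  }
}

final class OverrideCheck {
  /** Names of overridable base methods that the wrapper does not redeclare, minus exclusions. */
  static List<String> missingOverrides(Class<?> base, Class<?> wrapper, Set<String> excluded) {
    List<String> missing = new ArrayList<>();
    for (Method m : base.getDeclaredMethods()) {
      int mods = m.getModifiers();
      // Skip methods that cannot be overridden or were generated by tooling.
      if (Modifier.isStatic(mods) || Modifier.isPrivate(mods) || Modifier.isFinal(mods)
          || m.isSynthetic()) {
        continue;
      }
      if (excluded.contains(m.getName())) {
        continue;
      }
      try {
        wrapper.getDeclaredMethod(m.getName(), m.getParameterTypes());
      } catch (NoSuchMethodException e) {
        missing.add(m.getName());
      }
    }
    return missing;
  }
}
```

In this sketch `FilterInput` inherits `readShort` unchanged, so it is reported unless it is listed in `excluded`, which mirrors the role of the exclude set in the quoted test.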
[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11958: GITHUB-11868: Add FilterIndexInput and FilterIndexOutput wrapper classes
mdmarshmallow commented on code in PR #11958: URL: https://github.com/apache/lucene/pull/11958#discussion_r1043768983 ## lucene/core/src/java/org/apache/lucene/store/FilterIndexInput.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.store; + +import java.io.IOException; + +/** + * IndexInput implementation that delegates calls to another directory. This class can be used to + * add limitations on top of an existing {@link IndexInput} implementation or to add additional + * sanity checks for tests. However, if you plan to write your own {@link IndexInput} + * implementation, you should consider extending directly {@link IndexInput} or {@link DataInput} + * rather than try to reuse functionality of existing {@link IndexInput}s by extending this class. + * + * @lucene.internal + */ +public class FilterIndexInput extends IndexInput { + + public static IndexInput unwrap(IndexInput in) { Review Comment: Speaking as a user, I have seen some uses of the `unwrap` method in our own code base. I think it makes sense to keep them as other users might use it at one point or another?
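For readers unfamiliar with the `unwrap` idiom discussed here, a minimal sketch of how such a static helper typically behaves is given below. It uses hypothetical stand-in classes (`SketchInput`, `RawInput`, `FilterSketchInput`) rather than the classes from the PR, whose method body is not shown in the quoted diff: the helper peels off nested filter wrappers until it reaches the innermost, non-filter instance.

```java
/** Hypothetical stand-in for an input abstraction. */
abstract class SketchInput {
  abstract String describe();
}

/** Hypothetical concrete input with no delegate. */
final class RawInput extends SketchInput {
  private final String name;

  RawInput(String name) {
    this.name = name;
  }

  @Override
  String describe() {
    return name;
  }
}

/** Hypothetical wrapper that forwards to a delegate and can be unwrapped. */
class FilterSketchInput extends SketchInput {
  protected final SketchInput in;

  FilterSketchInput(SketchInput in) {
    this.in = in;
  }

  @Override
  String describe() {
    return "filter(" + in.describe() + ")";
  }

  /** Strips every filter layer and returns the innermost non-filter input. */
  static SketchInput unwrap(SketchInput input) {
    while (input instanceof FilterSketchInput) {
      input = ((FilterSketchInput) input).in;
    }
    return input;
  }
}
```

Callers that need the underlying implementation, for example to test its concrete type, can call `FilterSketchInput.unwrap(input)` without knowing how many wrapper layers were applied.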
[GitHub] [lucene] jmazanec15 opened a new pull request, #12002: Set algorithm params during force merge in KnnGraphTester
jmazanec15 opened a new pull request, #12002: URL: https://github.com/apache/lucene/pull/12002 ### Description Sets index writer config codec for force merge operation in KnnGraphTester. Fixes issue where merged segments are built with different algorithm parameters than segments created from createIndex.
[GitHub] [lucene] gsmiller opened a new pull request, #12003: Some minor code cleanup in IndexSortSortedNumericDocValuesRangeQuery
gsmiller opened a new pull request, #12003: URL: https://github.com/apache/lucene/pull/12003 * Leverage DISI static factory methods more over custom DISI impl where possible * Assert points field is a single-dim in a couple places * Bound cost estimate by the cost of the doc values field (for sparse fields) ### Description This PR contains some minor cleanup I thought might be useful after recently spending a little time looking at this code.
[GitHub] [lucene] benwtrent opened a new pull request, #12004: Move byte vector queries into new KnnByteVectorQuery
benwtrent opened a new pull request, #12004: URL: https://github.com/apache/lucene/pull/12004 This is the first commit of a much larger refactor. The overall goal is to separate the concerns of byte vectors and float vectors, making their usage and APIs clearer for users. This first step adds a new `KnnByteVectorQuery` and only allows it to be used against fields that have the `BYTE` encoding. Additionally, the original `KnnVectorQuery` can only be used against fields that have the `FLOAT32` encoding. This partially addresses: https://github.com/apache/lucene/issues/11963
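The restriction described in this PR, where each query type accepts only fields with the matching encoding, might look roughly like the following check. This is a sketch under stated assumptions, not the PR's code: `SketchEncoding` and `EncodingCheck` are hypothetical names, and where and how the real queries validate the field encoding is not shown in this message.

```java
/** Hypothetical encoding enum with the two values named in the PR description. */
enum SketchEncoding {
  BYTE,
  FLOAT32
}

final class EncodingCheck {
  /** Rejects a query whose expected encoding differs from the field's actual encoding. */
  static void requireEncoding(String field, SketchEncoding actual, SketchEncoding expected) {
    if (actual != expected) {
      throw new IllegalArgumentException(
          "field \"" + field + "\" is indexed with encoding " + actual
              + " but the query expects " + expected);
    }
  }
}
```

A byte-vector query would call `requireEncoding(field, actual, SketchEncoding.BYTE)` before searching, and a float-vector query would pass `SketchEncoding.FLOAT32`, so a mismatch fails fast instead of silently comparing incompatible data.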
[GitHub] [lucene] benwtrent commented on pull request #12004: Move byte vector queries into new KnnByteVectorQuery
benwtrent commented on PR #12004: URL: https://github.com/apache/lucene/pull/12004#issuecomment-1343454137 @rmuir the first of the multiple refactors. I attempted first to do it all in one change, but this proved to be an absolutely huge change (6k+ LOC churn) that spread across many parts of the code. Hopefully splitting it up like this makes reviewing simpler and allows us to iterate to the correct APIs for Byte and Float vectors.
[GitHub] [lucene-jira-archive] dependabot[bot] opened a new pull request, #149: Bump certifi from 2022.6.15 to 2022.12.7 in /migration
dependabot[bot] opened a new pull request, #149: URL: https://github.com/apache/lucene-jira-archive/pull/149 Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.6.15 to 2022.12.7.
Commits
- [9e9e840](https://github.com/certifi/python-certifi/commit/9e9e840925d7b8e76c76fdac1fab7e6e88c1c3b8) 2022.12.07
- [b81bdb2](https://github.com/certifi/python-certifi/commit/b81bdb269f1edb791bcd4ec8a9d0c053758f961a) 2022.09.24
- [939a28f](https://github.com/certifi/python-certifi/commit/939a28ffc57b1613770f572b584745c7b6d43e7d) 2022.09.14
- [aca828a](https://github.com/certifi/python-certifi/commit/aca828a78e73235a513dff9ebc181a47ef7dbf7b) 2022.06.15.2
- [de0eae1](https://github.com/certifi/python-certifi/commit/de0eae12a6d5794a4c1e33052af6717707ce1fcc) Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...
- [b8eb5e9](https://github.com/certifi/python-certifi/commit/b8eb5e9af9143b22b7f651942b393e369ed4c52a) 2022.06.15.1
- [47fb7ab](https://github.com/certifi/python-certifi/commit/47fb7ab715965684e035292d2ad3386aabdc4d25) Fix deprecation warning on Python 3.11 ([#199](https://github-redirect.dependabot.com/certifi/python-certifi/issues/199))
- [b0b48e0](https://github.com/certifi/python-certifi/commit/b0b48e059995f455ac1e79b3ad373ad4ef355516) fixes [#198](https://github-redirect.dependabot.com/certifi/python-certifi/issues/198) -- update link in license
- See full diff in [compare view](https://github.com/certifi/python-certifi/compare/2022.06.15...2022.12.07)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.