[PR] Move synonym map off-heap for SynonymGraphFilter [lucene]
msfroh opened a new pull request, #13054: URL: https://github.com/apache/lucene/pull/13054 ### Description This stores the synonym map's FST and word lookup off-heap in a separate, configurable directory. The initial implementation is rough, but the unit tests pass with this change randomly enabled. Obvious things that need work are: 1. I tried to do something like a codec, but not really a codec for the synonym map files. For a solution that could evolve over time, we should probably at least write something to the metadata file saying what format was used. 2. Right now it makes no effort to detect changes to the synonym files. I would suggest that SynonymGraphFilterFactory rebuild the directory if a checksum of the input files doesn't match a value recorded in the metadata file. 3. I don't think I like the random seeks in `OffHeapBytesRefHashLike`, but I don't see an alternative (besides moving it on-heap). Given that the original issue was only about moving the FST off-heap, maybe we can keep the word dictionary on-heap. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] `SynonymGraphFilter` should read FSTs off-heap? [lucene]
msfroh commented on issue #13005: URL: https://github.com/apache/lucene/issues/13005#issuecomment-1916284724 I have a (rough) PR to address this: https://github.com/apache/lucene/pull/13054. I also moved the output word lookup off-heap, but it requires a random seek (within a hopefully MMapped file) before every lookup. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
sabi0 commented on PR #13052: URL: https://github.com/apache/lucene/pull/13052#issuecomment-1916306380 I've added a comment explaining the `forEach`. [Opening a bug](https://bugs.openjdk.org/) in OpenJDK requires "OpenJDK Author" status which I do not have. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Align instanceof check with type cast [lucene]
sabi0 commented on code in PR #13039: URL: https://github.com/apache/lucene/pull/13039#discussion_r1470768641 ## lucene/core/src/java/org/apache/lucene/analysis/tokenattributes/PayloadAttributeImpl.java: ## @@ -62,8 +62,7 @@ public boolean equals(Object other) { return true; } -if (other instanceof PayloadAttribute) { - PayloadAttributeImpl o = (PayloadAttributeImpl) other; +if (other instanceof PayloadAttributeImpl o) { Review Comment: Thank you for linking to the pattern-matching PR. I've removed the pattern variable part of this change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
uschindler commented on PR #13052: URL: https://github.com/apache/lucene/pull/13052#issuecomment-1916340120 > I've added a comment explaining the `forEach`. > > [Opening a bug](https://bugs.openjdk.org/) in OpenJDK requires "OpenJDK Author" status which I do not have. I can do that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
jpountz commented on code in PR #13052: URL: https://github.com/apache/lucene/pull/13052#discussion_r1470784089 ## lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java: ## @@ -106,7 +106,11 @@ public MergeSpecification findForcedMerges( // the resulting set contains all segments that are left over // and will be merged to one additional segment: for (final OneMerge om : spec.merges) { -oldSegments.keySet().removeAll(om.segments); +// om.segments.forEach(::remove) is used here instead of oldSegments.keySet().removeAll() +// for performance reasons; om.segments size is always greater than oldSegments size, and Review Comment: This sounds incorrect? `om.segments` is a subset of `oldSegments.keySet()` since the merge policy is only allowed to merge segments that are in `oldSegments.keySet()`. So the problem you are describing only occurs when `oldSegments.size() == om.segments.size()`, ie. the merge policy merges all mergeable segments in one go, which is probably a common case. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Make static final Set immutable [lucene]
sabi0 opened a new pull request, #13055: URL: https://github.com/apache/lucene/pull/13055 EnumSet.of() returns a mutable Set that should not be used for static final constants. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Replace `new HashSet<>(Arrays.asList())` with `EnumSet.of()` [lucene]
sabi0 commented on PR #13051: URL: https://github.com/apache/lucene/pull/13051#issuecomment-1916363126 > static final constants should be unmodifiable sets > Could you open an issue about this? #13055 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
sabi0 commented on code in PR #13052: URL: https://github.com/apache/lucene/pull/13052#discussion_r1470850110 ## lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java: ## @@ -106,7 +106,11 @@ public MergeSpecification findForcedMerges( // the resulting set contains all segments that are left over // and will be merged to one additional segment: for (final OneMerge om : spec.merges) { -oldSegments.keySet().removeAll(om.segments); +// om.segments.forEach(::remove) is used here instead of oldSegments.keySet().removeAll() +// for performance reasons; om.segments size is always greater than oldSegments size, and Review Comment: Fixed. Thank you for explaining this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
sabi0 commented on code in PR #13052: URL: https://github.com/apache/lucene/pull/13052#discussion_r1470850110 ## lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java: ## @@ -106,7 +106,11 @@ public MergeSpecification findForcedMerges( // the resulting set contains all segments that are left over // and will be merged to one additional segment: for (final OneMerge om : spec.merges) { -oldSegments.keySet().removeAll(om.segments); +// om.segments.forEach(::remove) is used here instead of oldSegments.keySet().removeAll() +// for performance reasons; om.segments size is always greater than oldSegments size, and Review Comment: Fixed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
s1monw commented on PR #13046: URL: https://github.com/apache/lucene/pull/13046#issuecomment-1916447395 > We should PnP this! What on earth means PnP? Mike, check out this search: https://www.google.com/search?q=pnp+acronym wikipedia FTW -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Optimize counts on two clause term disjunctions [lucene]
jfreden commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1916440041 I added code to only apply the optimization `if count(term-with-less-docs)/count(term-with-more-docs) < 0.1` and it yielded a way better result. Will investigate the term cache idea too since there is still a slowdown of `CountHigHigh`. ``` TaskQPS baseline StdDevQPS my_modified_version StdDevPct diff p-value CountOrHighHigh 43.48 (3.1%) 42.94 (3.1%) -1.2% ( -7% -5%) 0.201 CountAndHighMed 117.33 (1.6%) 116.68 (2.2%) -0.6% ( -4% -3%) 0.357 CountAndHighHigh 13.40 (1.4%) 13.34 (1.6%) -0.5% ( -3% -2%) 0.305 CountOrHighMed 44.02 (2.4%) 85.10 (4.3%) 93.3% ( 84% - 102%) 0.000 ``` ``` TaskQPS baseline StdDevQPS my_modified_version StdDevPct diff p-value HighIntervalsOrdered2.45 (15.8%)2.37 (16.8%) -3.3% ( -30% - 34%) 0.526 BrowseDateSSDVFacets1.77 (9.7%)1.71 (13.2%) -3.1% ( -23% - 21%) 0.394 IntNRQ 26.64 (9.5%) 25.91 (13.2%) -2.8% ( -23% - 22%) 0.447 HighTermMonthSort 4247.44 (5.4%) 4174.43 (6.8%) -1.7% ( -13% - 11%) 0.379 MedIntervalsOrdered4.40 (9.0%)4.34 (9.3%) -1.5% ( -18% - 18%) 0.597 OrNotHighLow 1013.74 (2.8%) 1000.22 (4.0%) -1.3% ( -7% -5%) 0.223 CountOrHighHigh 43.48 (3.1%) 42.94 (3.1%) -1.2% ( -7% -5%) 0.201 OrHighHigh 45.23 (7.6%) 44.71 (7.1%) -1.1% ( -14% - 14%) 0.624 LowIntervalsOrdered 75.80 (6.3%) 74.94 (7.0%) -1.1% ( -13% - 12%) 0.591 OrHighLow 490.67 (4.2%) 485.46 (3.7%) -1.1% ( -8% -7%) 0.394 LowTerm 782.27 (3.7%) 774.48 (3.3%) -1.0% ( -7% -6%) 0.366 OrNotHighMed 327.78 (2.4%) 325.10 (2.6%) -0.8% ( -5% -4%) 0.295 OrHighMedDayTaxoFacets7.49 (4.0%)7.44 (4.3%) -0.7% ( -8% -7%) 0.574 AndHighLow 1052.61 (4.4%) 1044.91 (4.7%) -0.7% ( -9% -8%) 0.610 AndHighMedDayTaxoFacets 31.62 (1.4%) 31.39 (2.3%) -0.7% ( -4% -3%) 0.245 MedSloppyPhrase4.55 (2.2%)4.52 (3.4%) -0.6% ( -6% -5%) 0.494 HighSloppyPhrase7.84 (2.6%)7.80 (3.2%) -0.6% ( -6% -5%) 0.535 CountAndHighMed 117.33 (1.6%) 116.68 (2.2%) -0.6% ( -4% -3%) 0.357 HighTermDayOfYearSort 365.06 (3.2%) 363.03 (3.3%) -0.6% ( -6% -6%) 0.590 Wildcard 35.27 (2.2%) 35.08 (2.6%) -0.6% ( -5% -4%) 0.470 AndHighMed 149.03 (5.1%) 148.24 (4.7%) -0.5% ( -9% -9%) 0.736 MedSpanNear4.61 (2.7%)4.58 (3.3%) -0.5% ( -6% -5%) 0.588 MedPhrase 28.98 (4.8%) 28.83 (5.1%) -0.5% ( -9% -9%) 0.743 CountAndHighHigh 13.40 (1.4%) 13.34 (1.6%) -0.5% ( -3% -2%) 0.305 HighPhrase8.65 (4.5%)8.61 (5.4%) -0.5% ( -9% -9%) 0.765 LowSloppyPhrase 29.81 (2.4%) 29.70 (2.8%) -0.4% ( -5% -4%) 0.639 OrNotHighHigh 294.82 (3.6%) 293.71 (2.9%) -0.4% ( -6% -6%) 0.719 MedTerm 719.95 (6.3%) 717.27 (5.2%) -0.4% ( -11% - 11%) 0.839 Respell 70.19 (2.4%) 69.94 (2.1%) -0.4% ( -4% -4%) 0.607 PKLookup 288.07 (2.7%) 287.18 (3.4%) -0.3% ( -6% -5%) 0.748 Prefix3 411.30 (1.9%) 410.09 (2.5%) -0.3% ( -4% -4%) 0.675 OrHighMed 226.84 (5.1%) 226.17 (4.0%) -0.3% ( -8% -9%) 0.842 LowPhrase 48.24 (4.1%) 48.15 (4.3%) -0.2% ( -8% -8%) 0.889 OrHighNotHigh 464.84 (3.1%) 464.02 (3.5%) -0.2% ( -6% -6%) 0.866
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
uschindler commented on code in PR #13052: URL: https://github.com/apache/lucene/pull/13052#discussion_r1470981973 ## lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java: ## @@ -106,7 +106,11 @@ public MergeSpecification findForcedMerges( // the resulting set contains all segments that are left over // and will be merged to one additional segment: for (final OneMerge om : spec.merges) { -oldSegments.keySet().removeAll(om.segments); +// om.segments.forEach(::remove) is used here instead of oldSegments.keySet().removeAll() +// for performance reasons; om.segments size is always greater than oldSegments size, and Review Comment: New comment looks much better. To conclude, because I was also on wrong trip yesterday late evening: The code `oldSegments.keySet().removeAll(om.segments);` behaves like the following according to the docs: > This implementation determines which is the smaller of this set and the specified collection, by invoking the size method on each. If this set has fewer elements, then the implementation iterates over this set, checking each element returned by the iterator in turn to see if it is contained in the specified collection. If it is so contained, it is removed from this set with the iterator's remove method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set each element returned by the iterator, using this set's remove method. Translated to the example: If `oldSegments.keySet()` set has fewer elements, then the implementation iterates over `oldSegments.keySet()` set, checking each element returned by the iterator in turn to see if it is contained in `om.segments`. If it is so contained, it is removed from this set with the iterator's remove method. **If `om.segments` has fewer elements, then the implementation iterates over `om.segments`, removing from this set each element returned by the iterator, using this set's remove method.** So we would like to see the second (bold) part to be executed. Unfortunetely the spec only talks about "fewer", so we have no idea what happens on the boundary! Let's check here: https://github.com/openjdk/jdk/blob/fd8adf308357355bd33916ad80e2328c35434e5a/src/java.base/share/classes/java/util/AbstractSet.java#L166-L182 On the boundary, unfortunately it executes the slow code (contains in list, lower in the above code part). So the problem only exists if both collections have equal size. Nevertheless I will open a bug report. It should only prefer the shorter collection if both are sets. Changing this would be fine in later JDK releases, because the spec of `Set` does not specify how the method should behave. The code has more problems, because the symmetry of `contains()` for both sides is not always guaranteed (e.g. for TreeSets with case insensitive order). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change `set.removeAll(list)` to `list.forEach(set::remove)` [lucene]
uschindler merged PR #13052: URL: https://github.com/apache/lucene/pull/13052 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make static final Set immutable [lucene]
uschindler commented on PR #13055: URL: https://github.com/apache/lucene/pull/13055#issuecomment-1916562601 Please add a changes.txt, as this is theoretically a backwards incompatible change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Align instanceof check with type cast [lucene]
uschindler merged PR #13039: URL: https://github.com/apache/lucene/pull/13039 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Align instanceof check with type cast [lucene]
uschindler commented on PR #13039: URL: https://github.com/apache/lucene/pull/13039#issuecomment-1916576302 I added the changes entry to the existing one for you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Align instanceof check with type cast [lucene]
sabi0 commented on PR #13039: URL: https://github.com/apache/lucene/pull/13039#issuecomment-1916580962 Thank you. I will make sure the new PRs have them out of the box. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make static final Set immutable [lucene]
uschindler merged PR #13055: URL: https://github.com/apache/lucene/pull/13055 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make static final Set immutable [lucene]
uschindler commented on PR #13055: URL: https://github.com/apache/lucene/pull/13055#issuecomment-1916591556 I backported it. Like the previous PR this caused a conflict due to different stop tags, but this was easy to solve. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on code in PR #13046: URL: https://github.com/apache/lucene/pull/13046#discussion_r1471018363 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java: ## @@ -0,0 +1,252 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.backward_index; + +import com.carrotsearch.randomizedtesting.annotations.Name; +import java.io.IOException; +import java.io.InputStream; +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.text.ParseException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import org.apache.lucene.codecs.Codec; +import org.apache.lucene.index.DirectoryReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.SegmentReader; +import org.apache.lucene.store.Directory; +import org.apache.lucene.tests.util.LuceneTestCase; +import org.apache.lucene.tests.util.TestUtil; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.IOUtils; +import org.apache.lucene.util.Version; +import org.junit.After; +import org.junit.Before; + +public abstract class BackwardsCompatibilityTestBase extends LuceneTestCase { + + protected final Version version; + private static final Version LATEST_PREVIOUS_MAJOR = getLatestPreviousMajorVersion(); + protected final String indexPattern; + protected static final Set BINARY_SUPPORTED_VERSIONS; + + static { +String[] oldVersions = +new String[] { + "8.0.0", "8.0.0", "8.1.0", "8.1.0", "8.1.1", "8.1.1", "8.2.0", "8.2.0", "8.3.0", "8.3.0", + "8.3.1", "8.3.1", "8.4.0", "8.4.0", "8.4.1", "8.4.1", "8.5.0", "8.5.0", "8.5.1", "8.5.1", + "8.5.2", "8.5.2", "8.6.0", "8.6.0", "8.6.1", "8.6.1", "8.6.2", "8.6.2", "8.6.3", "8.6.3", + "8.7.0", "8.7.0", "8.8.0", "8.8.0", "8.8.1", "8.8.1", "8.8.2", "8.8.2", "8.9.0", "8.9.0", + "8.10.0", "8.10.0", "8.10.1", "8.10.1", "8.11.0", "8.11.0", "8.11.1", "8.11.1", "8.11.2", + "8.11.2", "9.0.0", "9.1.0", "9.2.0", "9.3.0", "9.4.0", "9.4.1", "9.4.2", "9.5.0", "9.6.0", + "9.7.0", "9.8.0", "9.9.0", "9.9.1", "9.9.2", "9.10.0", "10.0.0", +}; + +Set binaryVersions = new HashSet<>(); +for (String version : oldVersions) { + try { +Version v = Version.parse(version); +assertTrue("Unsupported binary version: " + v, v.major >= Version.MIN_SUPPORTED_MAJOR - 1); +binaryVersions.add(v); + } catch (ParseException ex) { +throw new RuntimeException(ex); + } +} +List allCurrentVersions = getAllCurrentVersions(); +for (Version version : allCurrentVersions) { + // make sure we never miss a version. + assertTrue("Version: " + version + " missing", binaryVersions.remove(version)); +} +BINARY_SUPPORTED_VERSIONS = binaryVersions; + } + + /** + * This is a base constructor for parameterized BWC tests. The constructor arguments are provided + * by {@link com.carrotsearch.randomizedtesting.RandomizedRunner} during test execution. A {@link + * com.carrotsearch.randomizedtesting.annotations.ParametersFactory} specified in a subclass + * provides a list lists of arguments for the tests and RandomizedRunner will execute the test for + * each of the argument list. + * + * @param version the version this test should run for + * @param indexPattern an index pattern in order to open an index of see {@link + * #createPattern(String, String)} + */ + protected BackwardsCompatibilityTestBase( + @Name("version") Version version, @Name("pattern") String indexPattern) { +this.version = version; +this.indexPattern = indexPattern; + } + + Directory directory; + + @Override + @Before + public void setUp() throws Exception { +super.setUp(); +assertNull( +"Index name " + version + " should not exist found", +TestAncientIndicesCompatibility.class.getResourceAsStream( +indexName(LATEST_PREVIOUS_MAJOR)))
Re: [PR] Update int array growth calls [lucene]
stefanvodita commented on code in PR #12947: URL: https://github.com/apache/lucene/pull/12947#discussion_r1471158692 ## lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/TaskSequence.java: ## Review Comment: My reasoning must have been way off the day I wrote this. Thank you for noticing the mistake! I've reverted the changes to this file. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[I] Reproducible failure in TestGeo3DPoint.testRandomBig [lucene]
easyice opened a new issue, #13056: URL: https://github.com/apache/lucene/issues/13056 ### Description ``` org.apache.lucene.spatial3d.TestGeo3DPoint > testRandomBig FAILED java.lang.AssertionError: FAIL: id=23785 should not have matched but did shape=GeoStandardPath: {planetmodel=PlanetModel.WGS84, width=2.5638182999069305E-5(0.0014689596802307306), points={[[lat=2.4457272005608357E-47, lon=0.0([X=1.0011188539901221, Y=0.0, Z=2.4484636121979335E-47])], [lat=-1.0718146506637272, lon=0.017453291479645996([X=0.47775442364187093, Y=0.008339233987253964, Z=-0.876439379159])], [lat=1.2798499685240037, lon=-0.4737449978411123([X=0.2547649811064312, Y=-0.13061449414092574, Z=0.9560925461700265])]]}} bounds=XYZBounds: [xmin=0.25474324433710577 xmax=1.0011188549901222 ymin=-0.1802138209379591 ymax=0.008364853863677697 zmin=-0.8767899152769958 zmax=0.9561195469506941] world bounds=( minX=-1.0011188539901221 maxX=1.0011188539901221 minY=-1.0011188539901221 maxY=1.0011188539901221 minZ=-0.997762292019756 maxZ=0.997762292019756 quantized point=[X=1.001118853604479, Y=-1.9836761499747446E-5, Z=2.3309121299774915E-10] within shape? false within bounds? true unquantized point=[lat=2.4457272005608357E-47, lon=-1.9814467377547217E-5([X=1.001118853793596, Y=-1.983663687213671E-5, Z=2.4484636121979335E-47])] within shape? true within bounds? true docID=23785 deleted?=false query=PointInGeo3DShapeQuery: field=point: Shape: GeoStandardPath: {planetmodel=PlanetModel.WGS84, width=2.5638182999069305E-5(0.0014689596802307306), points={[[lat=2.4457272005608357E-47, lon=0.0([X=1.0011188539901221, Y=0.0, Z=2.4484636121979335E-47])], [lat=-1.0718146506637272, lon=0.017453291479645996([X=0.47775442364187093, Y=0.008339233987253964, Z=-0.876439379159])], [lat=1.2798499685240037, lon=-0.4737449978411123([X=0.2547649811064312, Y=-0.13061449414092574, Z=0.9560925461700265])]]}} explanation: target is in leaf _0(10.0.0):C145628/1423:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=17.0.7+8-LTS-224, timestamp=1706584548568, source=flush, lucene.version=10.0.0, os.version=13.6, os.arch=x86_64}]:[attributes={Lucene90StoredFieldsFormat.mode=BEST_SPEED}]:delGen=1 :id=df3rpl1vhr0vzenewjvq2zj8r of full reader StandardDirectoryReader(segments:3:nrt _0(10.0.0):C145628/1423:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=17.0.7+8-LTS-224, timestamp=1706584548568, source=flush, lucene.version=10.0.0, os.version=13.6, os.arch=x86_64}]:[attributes={Lucene90StoredFieldsFormat.mode=BEST_SPEED}]:delGen=1 :id=df3rpl1vhr0vzenewjvq2zj8r) full BKD path to target doc: Cell(x=-1.0011188539901221 TO 1.0011188539901221 y=-1.0011188539901221 TO 1.0011188539901221 z=-0.9977622923536127 TO 0.9977622923536126); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=0.047844370442552187 TO 1.0011188539901221 y=-1.0011188539901221 TO 1.0011188539901221 z=-0.9977622923536127 TO 0.9977622923536126); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=0.047844370442552187 TO 1.0011188539901221 y=-1.0011188539901221 TO 4.661824259954982E-10 z=-0.9977622923536127 TO 0.9977622923536126); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=0.047844370442552187 TO 1.0011188539901221 y=-1.0011188539901221 TO 4.661824259954982E-10 z=-0.9977622923536127 TO 4.661824259954982E-10); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=0.047844370442552187 TO 1.0011188539901221 y=-0.08227179995049738 TO 4.661824259954982E-10 z=-0.9977622923536127 TO 4.661824259954982E-10); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=0.047844370442552187 TO 1.0011188539901221 y=-0.08227179995049738 TO 4.661824259954982E-10 z=-0.5035355968293294 TO 4.661824259954982E-10); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=1.0011187391566934 TO 1.0011188539901221 y=-0.08227179995049738 TO 4.661824259954982E-10 z=-0.5035355968293294 TO 4.661824259954982E-10); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=1.0011187391566934 TO 1.0011188539901221 y=-0.08227179995049738 TO 4.661824259954982E-10 z=-0.5035355968293294 TO 4.661824259954982E-10); Shape relationship = OVERLAPS; Quantized point within cell = true; Unquantized point within cell = true Cell(x=1.0011187391566934 TO 1.0011188539901221 y=-0.08227179995049738 TO 4.661824259954982
Re: [PR] Fix too many open files Exception for TestConcurrentMergeScheduler [lucene]
easyice commented on PR #13035: URL: https://github.com/apache/lucene/pull/13035#issuecomment-1917236900 Pushed a new fix for reproducible test failure `TestIndexWriterThreadsToSegments.testManyThreadsClose`: ``` ./gradlew test --tests TestIndexWriterThreadsToSegments.testManyThreadsClose -Dtests.seed=DFAE4EC5F20E1CD4 -Dtests.nightly=true -Dtests.locale=zgh-MA -Dtests.timezone=SystemV/EST5EDT -Dtests.asserts=true -Dtests.file.encoding=UTF-8 ``` ``` java.nio.file.FileSystemException: /Users//tests-tmp/lucene.index.TestIndexWriterThreadsToSegments_DFAE4EC5F20E1CD4-001/index-NIOFSDirectory-001/_hk_Lucene90FieldsIndexfile_pointers_ua.tmp: Too many open files > at org.apache.lucene.tests.mockfile.HandleLimitFS.onOpen(HandleLimitFS.java:67) > at org.apache.lucene.tests.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:82) > at org.apache.lucene.tests.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:163) > at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:200) > at java.base/java.nio.file.Files.newOutputStream(Files.java:228) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[I] Reproducible failure in TestParentBlockJoinFloatKnnVectorQuery.testScoringWithMultipleChildren [lucene]
easyice opened a new issue, #13057: URL: https://github.com/apache/lucene/issues/13057 ### Description ``` org.apache.lucene.search.join.TestParentBlockJoinFloatKnnVectorQuery > testScoringWithMultipleChildren FAILED java.lang.AssertionError: expected:<1.0> but was:<0.019607843831181526> at __randomizedtesting.SeedInfo.seed([8F867B6F7689D69C:A19AB97A658D176E]:0) at junit@4.13.1/org.junit.Assert.fail(Assert.java:89) at junit@4.13.1/org.junit.Assert.failNotEquals(Assert.java:835) at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:555) at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:685) at org.apache.lucene.search.join.ParentBlockJoinKnnVectorQueryTestCase.assertScorerResults(ParentBlockJoinKnnVectorQueryTestCase.java:341) at org.apache.lucene.search.join.ParentBlockJoinKnnVectorQueryTestCase.testScoringWithMultipleChildren(ParentBlockJoinKnnVectorQueryTestCase.java:210) at org.apache.lucene.search.join.TestParentBlockJoinFloatKnnVectorQuery.testScoringWithMultipleChildren(TestParentBlockJoinFloatKnnVectorQuery.java:37) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ``` ### Gradle command to reproduce ./gradlew test --tests TestParentBlockJoinFloatKnnVectorQuery.testScoringWithMultipleChildren -Dtests.seed=8F867B6F7689D69C -Dtests.nightly=true -Dtests.locale=en-GI -Dtests.timezone=Africa/Porto-Novo -Dtests.asserts=true -Dtests.file.encoding=UTF-8 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
mikemccand commented on PR #13046: URL: https://github.com/apache/lucene/pull/13046#issuecomment-1917387757 > > We should PnP this! > > What on earth means PnP? Mike, check out this search: https://www.google.com/search?q=pnp+acronym wikipedia FTW PnP = progress not perfection! Not plug and play, not transistors, heh. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on PR #13046: URL: https://github.com/apache/lucene/pull/13046#issuecomment-1917488764 > > > We should PnP this! > > > > > > What on earth means PnP? Mike, check out this search: https://www.google.com/search?q=pnp+acronym wikipedia FTW > > PnP = progress not perfection! Not plug and play, not transistors, heh. We all need some drugs sometimes, especially the policeman. 💉💊💊💊 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
s1monw commented on PR #13046: URL: https://github.com/apache/lucene/pull/13046#issuecomment-1917508359 @uschindler I need to hear you speaking german to tell, you know that ;) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on PR #13046: URL: https://github.com/apache/lucene/pull/13046#issuecomment-1917572859 There is no formatted() with locale. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on code in PR #13046: URL: https://github.com/apache/lucene/pull/13046#discussion_r1471695464 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.backward_index; + +import com.carrotsearch.randomizedtesting.annotations.Name; +import java.io.IOException; +import java.io.InputStream; +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.text.ParseException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Locale; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import org.apache.lucene.codecs.Codec; +import org.apache.lucene.index.DirectoryReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.SegmentReader; +import org.apache.lucene.store.Directory; +import org.apache.lucene.tests.util.LuceneTestCase; +import org.apache.lucene.tests.util.TestUtil; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.IOUtils; +import org.apache.lucene.util.Version; +import org.junit.After; +import org.junit.Before; + +public abstract class BackwardsCompatibilityTestBase extends LuceneTestCase { + + protected final Version version; + private static final Version LATEST_PREVIOUS_MAJOR = getLatestPreviousMajorVersion(); + protected final String indexPattern; + protected static final Set BINARY_SUPPORTED_VERSIONS; + + static { +String[] oldVersions = +new String[] { + "8.0.0", "8.0.0", "8.1.0", "8.1.0", "8.1.1", "8.1.1", "8.2.0", "8.2.0", "8.3.0", "8.3.0", + "8.3.1", "8.3.1", "8.4.0", "8.4.0", "8.4.1", "8.4.1", "8.5.0", "8.5.0", "8.5.1", "8.5.1", + "8.5.2", "8.5.2", "8.6.0", "8.6.0", "8.6.1", "8.6.1", "8.6.2", "8.6.2", "8.6.3", "8.6.3", + "8.7.0", "8.7.0", "8.8.0", "8.8.0", "8.8.1", "8.8.1", "8.8.2", "8.8.2", "8.9.0", "8.9.0", + "8.10.0", "8.10.0", "8.10.1", "8.10.1", "8.11.0", "8.11.0", "8.11.1", "8.11.1", "8.11.2", + "8.11.2", "9.0.0", "9.1.0", "9.2.0", "9.3.0", "9.4.0", "9.4.1", "9.4.2", "9.5.0", "9.6.0", + "9.7.0", "9.8.0", "9.9.0", "9.9.1", "9.9.2", "9.10.0", "10.0.0", +}; + +Set binaryVersions = new HashSet<>(); +for (String version : oldVersions) { + try { +Version v = Version.parse(version); +assertTrue("Unsupported binary version: " + v, v.major >= Version.MIN_SUPPORTED_MAJOR - 1); +binaryVersions.add(v); + } catch (ParseException ex) { +throw new RuntimeException(ex); + } +} +List allCurrentVersions = getAllCurrentVersions(); +for (Version version : allCurrentVersions) { + // make sure we never miss a version. + assertTrue("Version: " + version + " missing", binaryVersions.remove(version)); +} +BINARY_SUPPORTED_VERSIONS = binaryVersions; + } + + /** + * This is a base constructor for parameterized BWC tests. The constructor arguments are provided + * by {@link com.carrotsearch.randomizedtesting.RandomizedRunner} during test execution. A {@link + * com.carrotsearch.randomizedtesting.annotations.ParametersFactory} specified in a subclass + * provides a list lists of arguments for the tests and RandomizedRunner will execute the test for + * each of the argument list. + * + * @param version the version this test should run for + * @param indexPattern an index pattern in order to open an index of see {@link + * #createPattern(String, String)} + */ + protected BackwardsCompatibilityTestBase( + @Name("version") Version version, @Name("pattern") String indexPattern) { +this.version = version; +this.indexPattern = indexPattern; + } + + Directory directory; + + @Override + @Before + public void setUp() throws Exception { +super.setUp(); +assertNull( +"Index name " + version + " should not exist found", +TestAncientIndicesCompatibility.class.getResourceAsStream( +indexNam
Re: [PR] Optimize counts on two clause term disjunctions [lucene]
jpountz commented on PR #13036: URL: https://github.com/apache/lucene/pull/13036#issuecomment-1917595110 Thanks @jfreden, the heuristic looks sensible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on code in PR #13046: URL: https://github.com/apache/lucene/pull/13046#discussion_r1471716431 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.backward_index; + +import com.carrotsearch.randomizedtesting.annotations.Name; +import java.io.IOException; +import java.io.InputStream; +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.text.ParseException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Locale; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import org.apache.lucene.codecs.Codec; +import org.apache.lucene.index.DirectoryReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.SegmentReader; +import org.apache.lucene.store.Directory; +import org.apache.lucene.tests.util.LuceneTestCase; +import org.apache.lucene.tests.util.TestUtil; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.IOUtils; +import org.apache.lucene.util.Version; +import org.junit.After; +import org.junit.Before; + +public abstract class BackwardsCompatibilityTestBase extends LuceneTestCase { + + protected final Version version; + private static final Version LATEST_PREVIOUS_MAJOR = getLatestPreviousMajorVersion(); + protected final String indexPattern; + protected static final Set BINARY_SUPPORTED_VERSIONS; + + static { +String[] oldVersions = +new String[] { + "8.0.0", "8.0.0", "8.1.0", "8.1.0", "8.1.1", "8.1.1", "8.2.0", "8.2.0", "8.3.0", "8.3.0", + "8.3.1", "8.3.1", "8.4.0", "8.4.0", "8.4.1", "8.4.1", "8.5.0", "8.5.0", "8.5.1", "8.5.1", + "8.5.2", "8.5.2", "8.6.0", "8.6.0", "8.6.1", "8.6.1", "8.6.2", "8.6.2", "8.6.3", "8.6.3", + "8.7.0", "8.7.0", "8.8.0", "8.8.0", "8.8.1", "8.8.1", "8.8.2", "8.8.2", "8.9.0", "8.9.0", + "8.10.0", "8.10.0", "8.10.1", "8.10.1", "8.11.0", "8.11.0", "8.11.1", "8.11.1", "8.11.2", + "8.11.2", "9.0.0", "9.1.0", "9.2.0", "9.3.0", "9.4.0", "9.4.1", "9.4.2", "9.5.0", "9.6.0", + "9.7.0", "9.8.0", "9.9.0", "9.9.1", "9.9.2", "9.10.0", "10.0.0", +}; + +Set binaryVersions = new HashSet<>(); +for (String version : oldVersions) { + try { +Version v = Version.parse(version); +assertTrue("Unsupported binary version: " + v, v.major >= Version.MIN_SUPPORTED_MAJOR - 1); +binaryVersions.add(v); + } catch (ParseException ex) { +throw new RuntimeException(ex); + } +} +List allCurrentVersions = getAllCurrentVersions(); +for (Version version : allCurrentVersions) { + // make sure we never miss a version. + assertTrue("Version: " + version + " missing", binaryVersions.remove(version)); +} +BINARY_SUPPORTED_VERSIONS = binaryVersions; + } + + /** + * This is a base constructor for parameterized BWC tests. The constructor arguments are provided + * by {@link com.carrotsearch.randomizedtesting.RandomizedRunner} during test execution. A {@link + * com.carrotsearch.randomizedtesting.annotations.ParametersFactory} specified in a subclass + * provides a list lists of arguments for the tests and RandomizedRunner will execute the test for + * each of the argument list. + * + * @param version the version this test should run for + * @param indexPattern an index pattern in order to open an index of see {@link + * #createPattern(String, String)} + */ + protected BackwardsCompatibilityTestBase( + @Name("version") Version version, @Name("pattern") String indexPattern) { +this.version = version; +this.indexPattern = indexPattern; + } + + Directory directory; + + @Override + @Before + public void setUp() throws Exception { +super.setUp(); +assertNull( +"Index name " + version + " should not exist found", +TestAncientIndicesCompatibility.class.getResourceAsStream( +indexNam
Re: [I] Advance to first position of 1 in BitSet before iterating the lead in BitSetConjunctionDISI? [lucene]
jpountz commented on issue #13024: URL: https://github.com/apache/lucene/issues/13024#issuecomment-1917610302 I'm not sure I like this idea, which feels quite arbitrary: why would there be a big gap of matching doc IDs towards the start of the doc ID space, and not in the middle or towards the end? This `BitSetConjunctionDISI` was mostly added to mimic the way `FilteredQuery` used to work (before we replaced filters with queries and `FilteredQuery` with `FILTER` clauses on `BooleanQuery`) to avoid performance regressions. Maybe we can improve the heuristic regarding when to use it, which could in-turn help the case you're looking into? For instance we currently evaluate filters via `Bits#get` rather than `BitSet#nextSetBit` when their cost is greater than the minimum cost across all required clauses. In a similar context, `IndexOrDocValuesQuery` only uses doc values to run queries when their cost is 8x greater than the lead cost, maybe it would be a better threshold for evaluating filters via `Bits#get` as well? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Modernize BWC testing with parameterized tests [lucene]
uschindler commented on code in PR #13046: URL: https://github.com/apache/lucene/pull/13046#discussion_r1471723409 ## lucene/backward-codecs/src/test/org/apache/lucene/backward_index/BackwardsCompatibilityTestBase.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.backward_index; + +import com.carrotsearch.randomizedtesting.annotations.Name; +import java.io.IOException; +import java.io.InputStream; +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.text.ParseException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Locale; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import org.apache.lucene.codecs.Codec; +import org.apache.lucene.index.DirectoryReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.SegmentReader; +import org.apache.lucene.store.Directory; +import org.apache.lucene.tests.util.LuceneTestCase; +import org.apache.lucene.tests.util.TestUtil; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.IOUtils; +import org.apache.lucene.util.Version; +import org.junit.After; +import org.junit.Before; + +public abstract class BackwardsCompatibilityTestBase extends LuceneTestCase { + + protected final Version version; + private static final Version LATEST_PREVIOUS_MAJOR = getLatestPreviousMajorVersion(); + protected final String indexPattern; + protected static final Set BINARY_SUPPORTED_VERSIONS; + + static { +String[] oldVersions = +new String[] { + "8.0.0", "8.0.0", "8.1.0", "8.1.0", "8.1.1", "8.1.1", "8.2.0", "8.2.0", "8.3.0", "8.3.0", + "8.3.1", "8.3.1", "8.4.0", "8.4.0", "8.4.1", "8.4.1", "8.5.0", "8.5.0", "8.5.1", "8.5.1", + "8.5.2", "8.5.2", "8.6.0", "8.6.0", "8.6.1", "8.6.1", "8.6.2", "8.6.2", "8.6.3", "8.6.3", + "8.7.0", "8.7.0", "8.8.0", "8.8.0", "8.8.1", "8.8.1", "8.8.2", "8.8.2", "8.9.0", "8.9.0", + "8.10.0", "8.10.0", "8.10.1", "8.10.1", "8.11.0", "8.11.0", "8.11.1", "8.11.1", "8.11.2", + "8.11.2", "9.0.0", "9.1.0", "9.2.0", "9.3.0", "9.4.0", "9.4.1", "9.4.2", "9.5.0", "9.6.0", + "9.7.0", "9.8.0", "9.9.0", "9.9.1", "9.9.2", "9.10.0", "10.0.0", +}; + +Set binaryVersions = new HashSet<>(); +for (String version : oldVersions) { + try { +Version v = Version.parse(version); +assertTrue("Unsupported binary version: " + v, v.major >= Version.MIN_SUPPORTED_MAJOR - 1); +binaryVersions.add(v); + } catch (ParseException ex) { +throw new RuntimeException(ex); + } +} +List allCurrentVersions = getAllCurrentVersions(); +for (Version version : allCurrentVersions) { + // make sure we never miss a version. + assertTrue("Version: " + version + " missing", binaryVersions.remove(version)); +} +BINARY_SUPPORTED_VERSIONS = binaryVersions; + } + + /** + * This is a base constructor for parameterized BWC tests. The constructor arguments are provided + * by {@link com.carrotsearch.randomizedtesting.RandomizedRunner} during test execution. A {@link + * com.carrotsearch.randomizedtesting.annotations.ParametersFactory} specified in a subclass + * provides a list lists of arguments for the tests and RandomizedRunner will execute the test for + * each of the argument list. + * + * @param version the version this test should run for + * @param indexPattern an index pattern in order to open an index of see {@link + * #createPattern(String, String)} + */ + protected BackwardsCompatibilityTestBase( + @Name("version") Version version, @Name("pattern") String indexPattern) { +this.version = version; +this.indexPattern = indexPattern; + } + + Directory directory; + + @Override + @Before + public void setUp() throws Exception { +super.setUp(); +assertNull( +"Index name " + version + " should not exist found", +TestAncientIndicesCompatibility.class.getResourceAsStream( +indexNam
Re: [PR] clean up smoketester GPG leaks [lucene]
janhoy commented on PR #12882: URL: https://github.com/apache/lucene/pull/12882#issuecomment-1917641774 @hurutoriya Did you do any testing on the changes? If not, we need to QA that the script is not broken by this change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org