[PR] Do not use mock merge policy for TestFuzzyQuery#testFuzziness [lucene]
easyice opened a new pull request, #13070:
URL: https://github.com/apache/lucene/pull/13070

git bisect shows this commit as the perpetrator: f7cab1645017d863331b42900581b67d3591e2da

```
> org.junit.ComparisonFailure: expected: but was:
> at __randomizedtesting.SeedInfo.seed([7DF2C3FF35FEFFC6:36C7D4343E606C7C]:0)
> at org.junit.Assert.assertEquals(Assert.java:117)
> at org.junit.Assert.assertEquals(Assert.java:146)
> at org.apache.lucene.search.TestFuzzyQuery.testFuzziness(TestFuzzyQuery.java:156)
```

Reproduce command:

```
./gradlew test --tests TestFuzzyQuery.testFuzziness -Dtests.seed=7DF2C3FF35FEFFC6 -Dtests.nightly=true -Dtests.locale=ru-KG -Dtests.timezone=America/Virgin -Dtests.asserts=true -Dtests.file.encoding=UTF-8
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Use growNoCopy in some places [lucene]
easyice commented on PR #12951: URL: https://github.com/apache/lucene/pull/12951#issuecomment-1925637932 @epotyom @dweiss Hi, can you help me to merge if it looks okay?
[I] Reproducible failure in TestParentBlockJoinFloatKnnVectorQuery.testSkewedIndex [lucene]
easyice opened a new issue, #13071:
URL: https://github.com/apache/lucene/issues/13071

### Description

```
> org.junit.ComparisonFailure: expected:<[6]> but was:<[14]>
> at __randomizedtesting.SeedInfo.seed([3332C40C31EA01FF:3CFF33AC05C3913B]:0)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:117)
> at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:146)
> at org.apache.lucene.search.join.ParentBlockJoinKnnVectorQueryTestCase.assertIdMatches(ParentBlockJoinKnnVectorQueryTestCase.java:325)
> at org.apache.lucene.search.join.ParentBlockJoinKnnVectorQueryTestCase.testSkewedIndex(ParentBlockJoinKnnVectorQueryTestCase.java:277)
> at org.apache.lucene.search.join.TestParentBlockJoinFloatKnnVectorQuery.testSkewedIndex(TestParentBlockJoinFloatKnnVectorQuery.java:37)
```

### Gradle command to reproduce

```
./gradlew test --tests TestParentBlockJoinFloatKnnVectorQuery.testSkewedIndex -Dtests.seed=3332C40C31EA01FF -Dtests.nightly=true -Dtests.locale=fr-MU -Dtests.timezone=Asia/Choibalsan -Dtests.asserts=true -Dtests.file.encoding=UTF-8
```
Re: [I] Contributing a deep-learning, BERT-based analyzer [lucene]
lmessinger commented on issue #13065: URL: https://github.com/apache/lucene/issues/13065#issuecomment-1925739977 I mean, create just the tokens - the lemmas / wordpieces
Re: [PR] Use growNoCopy in some places [lucene]
dweiss merged PR #12951: URL: https://github.com/apache/lucene/pull/12951
Re: [PR] Use growNoCopy in some places [lucene]
dweiss commented on PR #12951: URL: https://github.com/apache/lucene/pull/12951#issuecomment-1925823962 I've backported to branch_9x as well.
Re: [PR] Move `brToString(BytesRef)` to `ToStringUtils` [lucene]
uschindler commented on code in PR #13068:
URL: https://github.com/apache/lucene/pull/13068#discussion_r1477431195

## lucene/core/src/java/org/apache/lucene/util/ToStringUtils.java:
##
@@ -32,11 +32,37 @@ public static void byteArray(StringBuilder buffer, byte[] bytes) {

   private static final char[] HEX = "0123456789abcdef".toCharArray();

+  /**
+   * Unlike {@link Long#toHexString(long)} returns a String with a "0x" prefix and all the leading
+   * zeros.
+   */
   public static String longHex(long x) {
     char[] asHex = new char[16];
     for (int i = 16; --i >= 0; x >>>= 4) {
       asHex[i] = HEX[(int) x & 0x0F];
     }
     return "0x" + new String(asHex);
   }
+
+  @SuppressWarnings("unused")
+  public static String brToString(BytesRef b) {
+    if (b == null) {
+      return "null";
+    }
+    try {
+      return b.utf8ToString() + " " + b;
+    } catch (Throwable t) {

Review Comment:
This method should not catch `Throwable`. Can we make it a multi-catch instead and explicitly catch:
- `AssertionFailedError` (if we hit an assertion)
- `RuntimeException` (anything else goes wrong: wrong offsets, a bytes array that is too short, or incomplete surrogates)
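To illustrate the suggestion above, here is a minimal, self-contained sketch of a multi-catch that replaces `catch (Throwable t)`. It is not the code from the PR: the class `NarrowCatchSketch` and its methods `decodeStrict`, `toHex`, and `describe` are hypothetical stand-ins that work on a plain `byte[]` rather than on Lucene's `BytesRef`. It also assumes that the assertion case is meant to be `java.lang.AssertionError`, which is what a failed Java `assert` statement throws.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for illustration only; not Lucene code. */
final class NarrowCatchSketch {

  private NarrowCatchSketch() {}

  /** Strict UTF-8 decode: malformed input and bad offsets surface as unchecked exceptions. */
  static String decodeStrict(byte[] bytes, int offset, int length) {
    try {
      return StandardCharsets.UTF_8
          .newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          // ByteBuffer.wrap throws IndexOutOfBoundsException for an invalid offset/length.
          .decode(ByteBuffer.wrap(bytes, offset, length))
          .toString();
    } catch (CharacterCodingException e) {
      // Convert the checked exception so callers only see RuntimeException.
      throw new IllegalArgumentException("not valid UTF-8", e);
    }
  }

  /** Hex rendering that never throws: the range is clamped to the array bounds. */
  static String toHex(byte[] bytes, int offset, int length) {
    int start = Math.max(0, offset);
    int end = (int) Math.min((long) bytes.length, (long) offset + length);
    StringBuilder sb = new StringBuilder("[");
    for (int i = start; i < end; i++) {
      if (i > start) {
        sb.append(' ');
      }
      sb.append(Integer.toHexString(bytes[i] & 0xff));
    }
    return sb.append(']').toString();
  }

  /** Returns "text [hex]" when the bytes decode cleanly, otherwise only "[hex]". */
  static String describe(byte[] bytes, int offset, int length) {
    if (bytes == null) {
      return "null";
    }
    String hex = toHex(bytes, offset, length);
    try {
      return decodeStrict(bytes, offset, length) + " " + hex;
    } catch (AssertionError | RuntimeException e) {
      // Only the two expected failure families are handled here; other Errors
      // such as OutOfMemoryError or StackOverflowError still propagate.
      return hex;
    }
  }
}
```

With this sketch, `describe("hi".getBytes(StandardCharsets.UTF_8), 0, 2)` returns `hi [68 69]`, while a truncated sequence such as `describe(new byte[] {(byte) 0xC3}, 0, 1)` falls back to `[c3]`. The point of the narrower catch is that serious JVM errors are no longer swallowed silently by a debugging helper.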
Re: [PR] Move `brToString(BytesRef)` to `ToStringUtils` [lucene]
sabi0 commented on code in PR #13068:
URL: https://github.com/apache/lucene/pull/13068#discussion_r1477434373

## lucene/core/src/java/org/apache/lucene/util/ToStringUtils.java:
##
@@ -32,11 +32,37 @@ public static void byteArray(StringBuilder buffer, byte[] bytes) {

   private static final char[] HEX = "0123456789abcdef".toCharArray();

+  /**
+   * Unlike {@link Long#toHexString(long)} returns a String with a "0x" prefix and all the leading
+   * zeros.
+   */
   public static String longHex(long x) {
     char[] asHex = new char[16];
     for (int i = 16; --i >= 0; x >>>= 4) {
       asHex[i] = HEX[(int) x & 0x0F];
     }
     return "0x" + new String(asHex);
   }
+
+  @SuppressWarnings("unused")
+  public static String brToString(BytesRef b) {
+    if (b == null) {
+      return "null";
+    }
+    try {
+      return b.utf8ToString() + " " + b;
+    } catch (Throwable t) {

Review Comment:
Nice catch. Thank you.
Re: [I] Optimize counts on 2-clauses disjunctions [lucene]
jpountz closed issue #12644: Optimize counts on 2-clauses disjunctions URL: https://github.com/apache/lucene/issues/12644
Re: [I] Optimize counts on 2-clauses disjunctions [lucene]
jpountz commented on issue #12644: URL: https://github.com/apache/lucene/issues/12644#issuecomment-1926402796 Addressed via https://github.com/apache/lucene/pull/13036.