[GitHub] [lucene] jpountz commented on pull request #12051: Fix wrong assertion in TestBooleanQuery.testQueryMatchesCount

2023-01-01 Thread GitBox
jpountz commented on PR #12051: URL: https://github.com/apache/lucene/pull/12051#issuecomment-1368388218 Thanks for catching this. Would it also work if we fixed indexing to sometimes index other values, e.g. replacing `if (random().nextBoolean()) {` with `if (i != 3 && random().nextBoolean

[GitHub] [lucene] jpountz opened a new pull request, #12053: Allow reusing indexed binary fields.

2023-01-01 Thread GitBox
jpountz opened a new pull request, #12053: URL: https://github.com/apache/lucene/pull/12053 Today Lucene allows creating indexed binary fields, e.g. via `StringField(String, BytesRef, Field.Store)`, but not reusing them: calling `setBytesValue` on a `StringField` throws. This commit

[GitHub] [lucene] jpountz opened a new pull request, #12054: Introduce a new `KeywordField`.

2023-01-01 Thread GitBox
jpountz opened a new pull request, #12054: URL: https://github.com/apache/lucene/pull/12054 `KeywordField` is a combination of `StringField` and `SortedSetDocValuesField`, similarly to how `LongField` is a combination of `LongPoint` and `SortedNumericDocValuesField`. This makes it easier fo

[GitHub] [lucene] jpountz opened a new pull request, #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
jpountz opened a new pull request, #12055: URL: https://github.com/apache/lucene/pull/12055 Currently multi-term queries with a filter rewrite internally rewrite to a disjunction if 16 terms or less match the query. Otherwise postings lists of matching terms are collected into a `DocIdSetBu
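
The threshold behavior described in this snippet can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Lucene's implementation: `FilterRewriteSketch`, `chooseStrategy`, and `collectIntoBitSet` are hypothetical names, postings lists are modeled as plain `int[]` arrays, and `java.util.BitSet` stands in for the structure named in the truncated text. Only the 16-term threshold comes from the PR description.

```java
import java.util.BitSet;
import java.util.List;

/** Toy model only: none of these names are Lucene APIs. */
final class FilterRewriteSketch {
  // Threshold quoted in the PR description; everything else is assumed.
  static final int DISJUNCTION_THRESHOLD = 16;

  /** Picks a strategy from the number of terms that match the query. */
  static String chooseStrategy(int matchingTerms) {
    return matchingTerms <= DISJUNCTION_THRESHOLD ? "disjunction" : "bitset";
  }

  /** Eagerly unions every postings list (doc IDs) into a single bit set. */
  static BitSet collectIntoBitSet(List<int[]> postings) {
    BitSet docs = new BitSet();
    for (int[] list : postings) {
      for (int doc : list) {
        docs.set(doc);
      }
    }
    return docs;
  }

  public static void main(String[] args) {
    System.out.println(chooseStrategy(3));
    System.out.println(chooseStrategy(40));
    System.out.println(collectIntoBitSet(List.of(new int[] {1, 3}, new int[] {3, 7})));
  }
}
```

The point of the model is the cost difference: the bit-set path must read every postings list in full before the first hit can be returned, whereas a disjunction can be consumed lazily.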

[GitHub] [lucene] jpountz commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
jpountz commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1368422269 Here is what luceneutil gives on wikimedium10m: ``` Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-value

[GitHub] [lucene] jpountz commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
jpountz commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1368423502 For the record, the reason why we're seeing a speedup here is because prefix and wildcard queries produce constant scores, so the query can early terminate once 1,000 hits have been coll
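
The early-termination argument in this comment can be sketched as follows. This is a hedged illustration, not Lucene code: `EarlyTerminationSketch` and `collect` are hypothetical names, and matches are modeled as a lazy iterator of doc IDs. The assumption is that when every hit has the same score, the first N matches in iteration order already form a valid top N, so iteration can stop there. The figure of 1,000 hits is taken from the comment.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Toy model only: none of these names are Lucene APIs. */
final class EarlyTerminationSketch {

  /**
   * Consumes matches lazily and stops once topN hits were collected.
   * Returns how many matches were actually visited.
   */
  static int collect(PrimitiveIterator.OfInt matches, int topN) {
    int visited = 0;
    while (visited < topN && matches.hasNext()) {
      matches.nextInt(); // constant score: no later hit can outrank this one
      visited++;
    }
    return visited;
  }

  public static void main(String[] args) {
    // One million lazily produced matches, but only the first 1,000 are read.
    int visited = collect(IntStream.range(0, 1_000_000).iterator(), 1_000);
    System.out.println("visited " + visited + " matches");
  }
}
```

This only pays off if matches are produced lazily; a strategy that materializes all matching documents up front does the full work before the first hit is seen.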

[GitHub] [lucene] msokolov merged pull request #12047: fix typo analysis-kuromoji

2023-01-01 Thread GitBox
msokolov merged PR #12047: URL: https://github.com/apache/lucene/pull/12047 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.ap

[GitHub] [lucene] twosom commented on pull request #12047: fix typo analysis-kuromoji

2023-01-01 Thread GitBox
twosom commented on PR #12047: URL: https://github.com/apache/lucene/pull/12047#issuecomment-1368474848 @msokolov Thanks~! and Happy New Year!👻

[GitHub] [lucene] msokolov commented on a diff in pull request #12029: introduce support in KnnVectorQuery for getters/setters

2023-01-01 Thread GitBox
msokolov commented on code in PR #12029: URL: https://github.com/apache/lucene/pull/12029#discussion_r1059769467 ## lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java: ## @@ -33,6 +33,7 @@ import org.apache.lucene.store.Directory; import org.apache.lucene.ut

[GitHub] [lucene] msokolov commented on pull request #12048: Move HNSW parameters to the HnswGraphBuilder class

2023-01-01 Thread GitBox
msokolov commented on PR #12048: URL: https://github.com/apache/lucene/pull/12048#issuecomment-1368477965 Sorry, I don't see this being any better than the current situation; aside from tests, the parameters are only used in HnswVectorsFormat where they are currently defined, so I think we

[GitHub] [lucene] msokolov commented on issue #11354: Reuse HNSW graphs when merging segments? [LUCENE-10318]

2023-01-01 Thread GitBox
msokolov commented on issue #11354: URL: https://github.com/apache/lucene/issues/11354#issuecomment-1368479497 Hi Jack, thanks for persisting and returning to this. I haven't had a chance to review the PR yet, just looking at the results here I have a few questions. First, it looks to me as

[GitHub] [lucene] rmuir commented on pull request #12053: Allow reusing indexed binary fields.

2023-01-01 Thread GitBox
rmuir commented on PR #12053: URL: https://github.com/apache/lucene/pull/12053#issuecomment-1368481511 > I considered an alternative that consisted of failing if calling `setBytesValue` on a field that is indexed and tokenized Can we just do this instead? I think an important p

[GitHub] [lucene] rmuir commented on pull request #12053: Allow reusing indexed binary fields.

2023-01-01 Thread GitBox
rmuir commented on PR #12053: URL: https://github.com/apache/lucene/pull/12053#issuecomment-1368481827 and yeah, you don't have such checks on numeric values, but numeric values don't have TokenStream tokenization. Being consistent with them makes no sense, that isn't what this is about.

[GitHub] [lucene] rmuir commented on pull request #12053: Allow reusing indexed binary fields.

2023-01-01 Thread GitBox
rmuir commented on PR #12053: URL: https://github.com/apache/lucene/pull/12053#issuecomment-1368482600 the fact that the tests pass with this change is really upsetting too. we should at least add checks for the type of luser moments we want to prevent, e.g. calling setBytesRef on a fucking

[GitHub] [lucene] zhaih commented on pull request #12051: Fix wrong assertion in TestBooleanQuery.testQueryMatchesCount

2023-01-01 Thread GitBox
zhaih commented on PR #12051: URL: https://github.com/apache/lucene/pull/12051#issuecomment-1368503412 Yeah it should work unless we later come up with some way to quickly pull out count in that situation as well. But I think the assertion here may not be necessary because I see you

[GitHub] [lucene] rmuir opened a new pull request, #12056: Update to error-prone 2.17

2023-01-01 Thread GitBox
rmuir opened a new pull request, #12056: URL: https://github.com/apache/lucene/pull/12056 I investigated each of the new checks, nothing really interesting except an incorrect javadoc link (discovered manually) linking to Object.finalize()

[GitHub] [lucene] rmuir opened a new issue, #12057: Forbidden-apis "built-in" signatures don't appear to be working?

2023-01-01 Thread GitBox
rmuir opened a new issue, #12057: URL: https://github.com/apache/lucene/issues/12057 ### Description I was looking at new error-prone checks in #12056 and one fails on Object.finalize Because the method is in the built-in JDK deprecated list (e.g. https://github.com/policeman-

[GitHub] [lucene] rmuir commented on issue #12057: Forbidden-apis "built-in" signatures don't appear to be working?

2023-01-01 Thread GitBox
rmuir commented on issue #12057: URL: https://github.com/apache/lucene/issues/12057#issuecomment-1368524586 Here's how to reproduce: apply this patch, then run `gradlew check -x test`. I would expect the build to fail, because we added a deprecated finalizer. Maybe forbidden doesn't fail be

[GitHub] [lucene] rmuir commented on issue #12057: Forbidden-apis "built-in" signatures don't appear to be working?

2023-01-01 Thread GitBox
rmuir commented on issue #12057: URL: https://github.com/apache/lucene/issues/12057#issuecomment-1368527335 Confirmed that's the issue, if i add a `super.finalize()` call to my finalizer, then forbidden fails. I will edit the issue. So we may need to use a different tool (javac, ecj)

[GitHub] [lucene] rmuir commented on issue #12057: ban finalizers in the build somehow (worst-case: use error-prone)

2023-01-01 Thread GitBox
rmuir commented on issue #12057: URL: https://github.com/apache/lucene/issues/12057#issuecomment-1368539424 Currently there is no good way with ECJ/javac, unless we fail on all deprecations, which is very noisy at the moment. We can probably do it better with ECJ if we enable all their depr

[GitHub] [lucene] uschindler opened a new pull request, #12058: Fix detection of Hotspot in TestRamUsageEstimator so it works with OpenJ9 that has the bean, but without properties

2023-01-01 Thread GitBox
uschindler opened a new pull request, #12058: URL: https://github.com/apache/lucene/pull/12058 This improves the test, which fails with OpenJ9 VMs, due to the following problem: - OpenJ9 returns the HotspotMXBean, but it is empty and has no properties. So we can't detect compressed point

[GitHub] [lucene] uschindler commented on pull request #12058: Fix detection of Hotspot in TestRamUsageEstimator so it works with OpenJ9 that has the bean, but without properties

2023-01-01 Thread GitBox
uschindler commented on PR #12058: URL: https://github.com/apache/lucene/pull/12058#issuecomment-1368556378 Thanks Robert!

[GitHub] [lucene] uschindler merged pull request #12058: Fix detection of Hotspot in TestRamUsageEstimator so it works with OpenJ9 that has the bean, but without properties

2023-01-01 Thread GitBox
uschindler merged PR #12058: URL: https://github.com/apache/lucene/pull/12058

[GitHub] [lucene] rmuir commented on a diff in pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
rmuir commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1059804843 ## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ## @@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext co

[GitHub] [lucene] rmuir commented on a diff in pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
rmuir commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1059807197 ## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ## @@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext co

[GitHub] [lucene] rmuir commented on a diff in pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-01-01 Thread GitBox
rmuir commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1059807649 ## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ## @@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext co

[GitHub] [lucene] uschindler closed issue #8485: TestIndexWriterOnError.testCheckpoint fails on IBM J9 [LUCENE-7432]

2023-01-01 Thread GitBox
uschindler closed issue #8485: TestIndexWriterOnError.testCheckpoint fails on IBM J9 [LUCENE-7432] URL: https://github.com/apache/lucene/issues/8485

[GitHub] [lucene] uschindler commented on issue #7580: Reproducible fieldcache AIOOBE only on J9 [LUCENE-6522]

2023-01-01 Thread GitBox
uschindler commented on issue #7580: URL: https://github.com/apache/lucene/issues/7580#issuecomment-1368575244 This seems fixed now, the test no longer fails.

[GitHub] [lucene] uschindler closed issue #7580: Reproducible fieldcache AIOOBE only on J9 [LUCENE-6522]

2023-01-01 Thread GitBox
uschindler closed issue #7580: Reproducible fieldcache AIOOBE only on J9 [LUCENE-6522] URL: https://github.com/apache/lucene/issues/7580

[GitHub] [lucene] uschindler closed issue #7579: org.apache.xerces.util is a protected pkg on IBM J9 [LUCENE-6521]

2023-01-01 Thread GitBox
uschindler closed issue #7579: org.apache.xerces.util is a protected pkg on IBM J9 [LUCENE-6521] URL: https://github.com/apache/lucene/issues/7579

[GitHub] [lucene] uschindler commented on issue #7579: org.apache.xerces.util is a protected pkg on IBM J9 [LUCENE-6521]

2023-01-01 Thread GitBox
uschindler commented on issue #7579: URL: https://github.com/apache/lucene/issues/7579#issuecomment-1368575466 This is fixed in J9, as it now uses the OpenJDK class library.

[GitHub] [lucene] uschindler closed issue #7575: mockfilesystem tests fail with IBM jdk [LUCENE-6517]

2023-01-01 Thread GitBox
uschindler closed issue #7575: mockfilesystem tests fail with IBM jdk [LUCENE-6517] URL: https://github.com/apache/lucene/issues/7575

[GitHub] [lucene] uschindler commented on issue #7575: mockfilesystem tests fail with IBM jdk [LUCENE-6517]

2023-01-01 Thread GitBox
uschindler commented on issue #7575: URL: https://github.com/apache/lucene/issues/7575#issuecomment-1368575612 This should no longer be an issue, as OpenJ9 uses the OpenJDK class library now.

[GitHub] [lucene] uschindler commented on issue #7614: TestQueryTemplateManager always fails on J9 [LUCENE-6556]

2023-01-01 Thread GitBox
uschindler commented on issue #7614: URL: https://github.com/apache/lucene/issues/7614#issuecomment-1368575794 This is no longer an issue, all tests pass, because OpenJ9 now uses the OpenJDK class library and no longer Harmony.

[GitHub] [lucene] uschindler closed issue #7614: TestQueryTemplateManager always fails on J9 [LUCENE-6556]

2023-01-01 Thread GitBox
uschindler closed issue #7614: TestQueryTemplateManager always fails on J9 [LUCENE-6556] URL: https://github.com/apache/lucene/issues/7614

[GitHub] [lucene] uschindler commented on issue #7580: Reproducible fieldcache AIOOBE only on J9 [LUCENE-6522]

2023-01-01 Thread GitBox
uschindler commented on issue #7580: URL: https://github.com/apache/lucene/issues/7580#issuecomment-1368576005 In addition, Lucene has no fieldcache anymore.

[GitHub] [lucene] uschindler closed issue #5001: TestNRTManager hangs with IBM JRE [LUCENE-3928]

2023-01-01 Thread GitBox
uschindler closed issue #5001: TestNRTManager hangs with IBM JRE [LUCENE-3928] URL: https://github.com/apache/lucene/issues/5001

[GitHub] [lucene] uschindler commented on issue #5001: TestNRTManager hangs with IBM JRE [LUCENE-3928]

2023-01-01 Thread GitBox
uschindler commented on issue #5001: URL: https://github.com/apache/lucene/issues/5001#issuecomment-1368577762 This test now passes with IBM Semeru / OpenJ9

[GitHub] [lucene] jasirkt commented on issue #11701: Deadlock in AnalysisSPILoader [LUCENE-10665]

2023-01-01 Thread GitBox
jasirkt commented on issue #11701: URL: https://github.com/apache/lucene/issues/11701#issuecomment-1368693920 > In which version did you see this? 9.1.0 Thanks for fixing. It works now!