[I] Test failures in TestHnswByteVectorGraph.testSortedAndUnsortedIndicesReturnSameResults [lucene]

2024-03-24 Thread via GitHub
vsop-479 opened a new issue, #13210: URL: https://github.com/apache/lucene/issues/13210 ### Description @mayya-sharipova Please take a look when you get a chance. ### Gradle command to reproduce gradlew test --tests TestHnswByteVectorGraph.testSortedAndUnsortedIndice

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1536974998 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] upgrade snowball to 26db1ab9adbf437f37a6facd3ee2aad1da9eba03 [lucene]

2024-03-24 Thread via GitHub
rmuir commented on PR #13209: URL: https://github.com/apache/lucene/pull/13209#issuecomment-2017054138 ok, i found the issue. I removed the now-unnecessary `opens`. Nice test :)

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1536969204 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1536968149 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Fix TestIndexWriter.testDeleteUnusedFiles' failure on Windows 11 [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on PR #13183: URL: https://github.com/apache/lucene/pull/13183#issuecomment-2017047772 Thanks for persisting on it @vsop-479. I think we should dig more into what has changed in Windows 11+ before we change the assert condition on this test. Does it impact any o

Re: [PR] Add Romanian stopwords with s&t with comma [lucene]

2024-03-24 Thread via GitHub
rmuir commented on PR #12172: URL: https://github.com/apache/lucene/pull/12172#issuecomment-2017041764 The PR is not stale, see #13209. I've been waiting on an official release to play it safe, but it has been too long: I think we should move forward here? I'd like to merge this after #132

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
kaivalnp commented on PR #13202: URL: https://github.com/apache/lucene/pull/13202#issuecomment-2017007823 Thanks for the review @vigyasharma! > Apart from those, one divergent behavior is that we won't be raising some form of `TimeExceededException` when we timeout and terminate the s

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
kaivalnp commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1536943966 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on PR #13202: URL: https://github.com/apache/lucene/pull/13202#issuecomment-2016947960 Separately, should we deprecate `TimeLimitingCollector`? It doesn't use `QueryTimeout` and I don't think we're using it anywhere.

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on PR #13202: URL: https://github.com/apache/lucene/pull/13202#issuecomment-2016947180 > ### baseline with `null` timeout > ``` > 0.965 1.10 100 50 16 100 16750 1.00 post-filter > 0.992 2.82 100 400 16

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-03-24 Thread via GitHub
vigyasharma commented on code in PR #13202: URL: https://github.com/apache/lucene/pull/13202#discussion_r1536908711 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingKnnCollectorManager.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Made DocIdsWriter use DISI when reading documents with an IntersectVisitor [lucene]

2024-03-24 Thread via GitHub
antonha commented on PR #13149: URL: https://github.com/apache/lucene/pull/13149#issuecomment-2016941949 > Before merging I'd be curious to better understand why the JVM doesn't optimize this better. Presumably, it should be able to resolve the virtual call once for the entire for loop rath

[I] Find more classes in main branch that can be converted to record classes [lucene]

2024-03-24 Thread via GitHub
uschindler opened a new issue, #13207: URL: https://github.com/apache/lucene/issues/13207 ### Description This is an overview issue about all classes in Lucene's current main branch that can be ported over to `record` classes. Record classes are fine for immutable data structures which hav

Re: [PR] Add support for posix_madvise to Java 21 MMapDirectory [lucene]

2024-03-24 Thread via GitHub
uschindler commented on PR #13196: URL: https://github.com/apache/lucene/pull/13196#issuecomment-2016797422 Maybe it is easier to see results on benchmarking when it is in main branch. I am waiting for the final review by @jpountz and will then merge this. Backporting to 9.x is also planned and shou

Re: [PR] Convert IOContext, MergeInfo, and FlushInfo to record classes [lucene]

2024-03-24 Thread via GitHub
uschindler commented on PR #13205: URL: https://github.com/apache/lucene/pull/13205#issuecomment-2016796962 I will apply those changes after #13196, because if doing this before, backporting of the other PR gets harder (#13196 has changes to IOContext, too).

Re: [PR] LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary [lucene]

2024-03-24 Thread via GitHub
mocobeta commented on PR #12517: URL: https://github.com/apache/lucene/pull/12517#issuecomment-2016769417 Hi, sorry for my late reply. I quickly checked the built dictionary size. The latest UniDic is fairly (to me, insanely) large - its total size is 1.6G. https://clrd.ninjal.ac.jp/un

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-03-24 Thread via GitHub
dnhatn commented on PR #13190: URL: https://github.com/apache/lucene/pull/13190#issuecomment-2016724945 Elasticsearch CI has identified an issue related to this change. The PerFieldDocValuesFormat and PerFieldPostingsFormat, which mutate and reset the fieldInfos of the mergeState while exec