[GitHub] [lucene] iverase commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-08-21 Thread via GitHub
iverase commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1685783623 I am currently not planning to replace any of the usages as I am not familiar with them. Note that some of them encode data in big endian while DataOutput/DataInput uses little endian sin
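The byte-order mismatch mentioned in this comment can be illustrated with a minimal sketch. It uses only `java.nio.ByteBuffer` from the JDK, not Lucene's `DataInput`/`DataOutput`; the class and method names (`EndianSketch`, `encodeInt`, `decodeInt`) are hypothetical and chosen for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianSketch {
  // Encode a 32-bit int into 4 bytes using the given byte order.
  static byte[] encodeInt(int value, ByteOrder order) {
    return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
  }

  // Decode 4 bytes into a 32-bit int using the given byte order.
  static int decodeInt(byte[] bytes, ByteOrder order) {
    return ByteBuffer.wrap(bytes).order(order).getInt();
  }

  public static void main(String[] args) {
    byte[] bigEndian = encodeInt(1, ByteOrder.BIG_ENDIAN);
    // Reading with the matching order round-trips the value.
    System.out.println(decodeInt(bigEndian, ByteOrder.BIG_ENDIAN));
    // Reading the same bytes with the other order yields a different value.
    System.out.println(decodeInt(bigEndian, ByteOrder.LITTLE_ENDIAN));
  }
}
```

A reader that assumes little endian would misread values written in big endian, which is one reason existing usages may not be swapped mechanically.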

[GitHub] [lucene-solr] azagniotov commented on pull request #935: LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary

2023-08-21 Thread via GitHub
azagniotov commented on PR #935: URL: https://github.com/apache/lucene-solr/pull/935#issuecomment-1685887305 Hello Team, May I ask where we are on this? ### TL;DR In the meantime, I managed to build the [unidic-cwj-202302_full](https://clrd.ninjal.ac

[GitHub] [lucene] SevenCss commented on issue #7820: CheckIndex cannot "fix" indexes that have individual segments with missing or corrupt .si files because sanity checks will fail trying to read the

2023-08-21 Thread via GitHub
SevenCss commented on issue #7820: URL: https://github.com/apache/lucene/issues/7820#issuecomment-1685896438 @mikemccand I appreciate your response. Exactly, after I manually removed the broken one `segments_a7`, the index could recover successfully. However, I'm trying to figure

[GitHub] [lucene] gsmiller commented on pull request #12417: forutil add vectorized and scalar code

2023-08-21 Thread via GitHub
gsmiller commented on PR #12417: URL: https://github.com/apache/lucene/pull/12417#issuecomment-1686776595 @ChrisHegarty I was considering some experimentation with [vectorized prefix sum implementations](https://en.algorithmica.org/hpc/algorithms/prefix/), but saw your comment above stating

[GitHub] [lucene] stefanvodita commented on pull request #12337: Index arbitrary fields in taxonomy docs

2023-08-21 Thread via GitHub
stefanvodita commented on PR #12337: URL: https://github.com/apache/lucene/pull/12337#issuecomment-1686832362 The commit I pushed makes `DirectoryTaxonomyReader.getInternalIndexReader` public. We also stop relying on the full path field. I’m not sure why I thought we needed it; we can use `

[GitHub] [lucene] javanna commented on pull request #12499: Simplify task executor for concurrent operations

2023-08-21 Thread via GitHub
javanna commented on PR #12499: URL: https://github.com/apache/lucene/pull/12499#issuecomment-1686902882 @sohami I will open a follow-up to offload single slices too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [lucene] javanna merged pull request #12499: Simplify task executor for concurrent operations

2023-08-21 Thread via GitHub
javanna merged PR #12499: URL: https://github.com/apache/lucene/pull/12499

[GitHub] [lucene] javanna commented on pull request #12499: Simplify task executor for concurrent operations

2023-08-21 Thread via GitHub
javanna commented on PR #12499: URL: https://github.com/apache/lucene/pull/12499#issuecomment-1686951882 Thanks all for looking!

[GitHub] [lucene] almogtavor commented on issue #12406: Register nested queries (ToParentBlockJoinQuery) to Lucene Monitor

2023-08-21 Thread via GitHub
almogtavor commented on issue #12406: URL: https://github.com/apache/lucene/issues/12406#issuecomment-1687057323 @romseygeek @dweiss @uschindler @dsmiley @gsmiller @javanna @benwtrent I'd love to get feedback from you on the subject

[GitHub] [lucene] javanna opened a new pull request, #12515: Offload single slice to executor

2023-08-21 Thread via GitHub
javanna opened a new pull request, #12515: URL: https://github.com/apache/lucene/pull/12515 When an executor is set on the IndexSearcher, we should try to offload most of the computation to that executor. Ideally, the caller thread would only do light coordination work, and the executor is

[GitHub] [lucene] javanna commented on pull request #12499: Simplify task executor for concurrent operations

2023-08-21 Thread via GitHub
javanna commented on PR #12499: URL: https://github.com/apache/lucene/pull/12499#issuecomment-1687122736 @sohami here it is: #12515

[GitHub] [lucene] javanna opened a new pull request, #12516: Unwrap execution exceptions cause and rethrow as is when possible

2023-08-21 Thread via GitHub
javanna opened a new pull request, #12516: URL: https://github.com/apache/lucene/pull/12516 When performing concurrent search, we may get an execution exception from one or more slices. In that case, we'd like to rethrow the cause of the execution exception, which we do by wrapping it into
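The idea of rethrowing the cause of an `ExecutionException` as is can be sketched with plain `java.util.concurrent` types. This is a generic illustration under stated assumptions, not the code from the pull request: the class and method names (`UnwrapSketch`, `getUnwrapped`) are hypothetical, and wrapping checked causes in a `RuntimeException` is a simplification chosen here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UnwrapSketch {
  // Wait for the future; if the task failed, rethrow its cause unchanged
  // when the cause is unchecked, otherwise wrap it (assumption of this sketch).
  static <T> T getUnwrapped(Future<T> future) throws InterruptedException {
    try {
      return future.get();
    } catch (ExecutionException e) {
      Throwable cause = e.getCause();
      if (cause instanceof RuntimeException) {
        throw (RuntimeException) cause;
      }
      if (cause instanceof Error) {
        throw (Error) cause;
      }
      throw new RuntimeException(cause);
    }
  }

  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<Integer> failing =
          executor.submit(
              (Callable<Integer>)
                  () -> {
                    throw new IllegalStateException("slice failed");
                  });
      try {
        getUnwrapped(failing);
      } catch (IllegalStateException e) {
        // The caller sees the original exception type, not ExecutionException.
        System.out.println("caught original: " + e.getMessage());
      }
    } finally {
      executor.shutdown();
    }
  }
}
```

With this shape, callers can catch the exception type the task actually threw instead of inspecting a wrapper.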

[GitHub] [lucene] javanna commented on pull request #12516: Unwrap execution exceptions cause and rethrow as is when possible

2023-08-21 Thread via GitHub
javanna commented on PR #12516: URL: https://github.com/apache/lucene/pull/12516#issuecomment-1687152027 Another one that you may be interested in @reta @sohami

[GitHub] [lucene] iverase commented on pull request #12512: Remove unused variable in BKDWriter

2023-08-21 Thread via GitHub
iverase commented on PR #12512: URL: https://github.com/apache/lucene/pull/12512#issuecomment-1687195803 Sure, it is probably a leftover from another change. Now that we are here, I think we should rename `scratch1` to `scratch`?

[GitHub] [lucene] reta commented on a diff in pull request #12516: Unwrap execution exceptions cause and rethrow as is when possible

2023-08-21 Thread via GitHub
reta commented on code in PR #12516: URL: https://github.com/apache/lucene/pull/12516#discussion_r1300778596 ## lucene/core/src/java/org/apache/lucene/search/TaskExecutor.java: ## @@ -57,6 +58,12 @@ final List invokeAll(Collection> tasks) { } catch (InterruptedException

[GitHub] [lucene] reta commented on a diff in pull request #12516: Unwrap execution exceptions cause and rethrow as is when possible

2023-08-21 Thread via GitHub
reta commented on code in PR #12516: URL: https://github.com/apache/lucene/pull/12516#discussion_r1300779063 ## lucene/core/src/java/org/apache/lucene/search/TaskExecutor.java: ## @@ -57,6 +58,12 @@ final List invokeAll(Collection> tasks) { } catch (InterruptedException

[GitHub] [lucene] iverase commented on issue #12514: Could we add more index for BKD LeafNode?

2023-08-21 Thread via GitHub
iverase commented on issue #12514: URL: https://github.com/apache/lucene/issues/12514#issuecomment-1687280508 I am not sure this is the right trade-off. The BKD tree was developed to perform efficient range queries. If your use case is to perform efficient `PointInSetQuery`, you might be be

[GitHub] [lucene] easyice commented on pull request #12512: Remove unused variable in BKDWriter

2023-08-21 Thread via GitHub
easyice commented on PR #12512: URL: https://github.com/apache/lucene/pull/12512#issuecomment-1687334045 @iverase Good idea, this seems clearer. I've renamed `scratch1` to `scratch`.

[GitHub] [lucene] iverase commented on pull request #12512: Remove unused variable in BKDWriter

2023-08-21 Thread via GitHub
iverase commented on PR #12512: URL: https://github.com/apache/lucene/pull/12512#issuecomment-1687356896 LGTM, thanks @easyice! Could you please add a CHANGES entry under 9.8.0?

[GitHub] [lucene] easyice commented on pull request #12512: Remove unused variable in BKDWriter

2023-08-21 Thread via GitHub
easyice commented on PR #12512: URL: https://github.com/apache/lucene/pull/12512#issuecomment-1687412992 Thanks to @iverase and @benwtrent, CHANGES.txt has been updated.