[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/7/22, 9:25 AM: --- [

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/7/22, 9:31 AM: --- [

[GitHub] [lucene] mayya-sharipova opened a new pull request #734: LUCENE-10408 Test correction checksum

2022-03-07 Thread GitBox
mayya-sharipova opened a new pull request #734: URL: https://github.com/apache/lucene/pull/734 Use double instead of float to test vector values checksum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t
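
The pull request above switches the test checksum from float to double. As a rough illustration of why the accumulator type can matter (this is a sketch, not the actual test code; the class and method names are hypothetical), summing the same float values in a float and in a double accumulator can give different totals once the running sum exceeds float precision:

```java
public class ChecksumPrecisionSketch {
    // Hypothetical helper: accumulate in float, so every partial sum
    // is rounded to float precision (24 significant bits).
    static float floatSum(float[] values) {
        float sum = 0f;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    // Hypothetical helper: accumulate the same inputs in double
    // (53 significant bits), so small addends are not lost as early.
    static double doubleSum(float[] values) {
        double sum = 0d;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 16777216 is 2^24; adding 1 to it is not representable in float.
        float[] values = {16777216f, 1f, 1f};
        System.out.println(floatSum(values));  // the two 1s are rounded away
        System.out.println(doubleSum(values)); // the two 1s are kept
    }
}
```

With these inputs the float accumulator stays at 16777216 while the double accumulator reaches 16777218, which is the kind of mismatch a checksum comparison would flag.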

[jira] [Commented] (LUCENE-10408) Better dense encoding of doc Ids in Lucene91HnswVectorsFormat

2022-03-07 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502211#comment-17502211 ] Mayya Sharipova commented on LUCENE-10408: -- [~julietibs] I've changed a dataty

[GitHub] [lucene] mayya-sharipova commented on pull request #734: LUCENE-10408 Test: correct type of checksum

2022-03-07 Thread GitBox
mayya-sharipova commented on pull request #734: URL: https://github.com/apache/lucene/pull/734#issuecomment-1060581581 As @jtibshirani reported, currently without this patch this test fails: ```txt ./gradlew test --tests TestPerFieldKnnVectorsFormat.testVectorValuesReportCorrect

[jira] [Comment Edited] (LUCENE-10408) Better dense encoding of doc Ids in Lucene91HnswVectorsFormat

2022-03-07 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502211#comment-17502211 ] Mayya Sharipova edited comment on LUCENE-10408 at 3/7/22, 11:31 AM: -

[GitHub] [lucene] msokolov commented on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-07 Thread GitBox
msokolov commented on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060653437 Ah, I see - sorry if I jumped the gun and now your commit is recorded against the wrong github id! Sadly there is no fixing github.

[GitHub] [lucene] msokolov edited a comment on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-07 Thread GitBox
msokolov edited a comment on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060653437 Ah, I see - sorry if I jumped the gun and now your commit is recorded against the wrong github id! Sadly there is no fixing git history.

[jira] [Commented] (LUCENE-10425) count aggregation optimization inside one segment in log scenario

2022-03-07 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502254#comment-17502254 ] Michael Sokolov commented on LUCENE-10425: -- > Thinking through this a bit more

[jira] [Commented] (LUCENE-10427) OLAP likewise rollup during segment merge process

2022-03-07 Thread Suhan Mao (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502344#comment-17502344 ] Suhan Mao commented on LUCENE-10427: [~jpountz] sorry to interrupt you, could you s

[GitHub] [lucene] magibney commented on pull request #380: LUCENE-10171 - Fix dictionary-based OpenNLPLemmatizerFilterFactory caching issue

2022-03-07 Thread GitBox
magibney commented on pull request #380: URL: https://github.com/apache/lucene/pull/380#issuecomment-1060886236 Yes, essentially. Perhaps something like?: ``` * LUCENE-10171: OpenNLPOpsFactory.getLemmatizerDictionary(String, ResourceLoader) now returns a DictionaryLemmatizer obje

[GitHub] [lucene] epugh commented on pull request #380: LUCENE-10171 - Fix dictionary-based OpenNLPLemmatizerFilterFactory caching issue

2022-03-07 Thread GitBox
epugh commented on pull request #380: URL: https://github.com/apache/lucene/pull/380#issuecomment-1060934405 CHANGES.txt does change a lot ;-). Nice patch.

[GitHub] [lucene] Yuti-G commented on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-07 Thread GitBox
Yuti-G commented on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060939698 > Ah, I see - sorry if I jumped the gun and now your commit is recorded against the wrong github id! Sadly there is no fixing git history. No worries! I think it is against m

[GitHub] [lucene] dblock opened a new pull request #735: Fix: typo + +minScore.

2022-03-07 Thread GitBox
dblock opened a new pull request #735: URL: https://github.com/apache/lucene/pull/735 # Description A typo snuck into https://github.com/apache/lucene/pull/711. # Solution Fix typo, remove `+ +`. # Checklist Please review the following and check all that ap

[GitHub] [lucene] jtibshirani commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-07 Thread GitBox
jtibshirani commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1061011001 @rmuir's perspective makes total sense to me too, that we should stream to the format instead of buffering on disk within `IndexingChain`. One related thought: in a scen

[jira] [Commented] (LUCENE-10430) Literal double quotes cause exception in class RegExp

2022-03-07 Thread Holger Rehn (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502516#comment-17502516 ] Holger Rehn commented on LUCENE-10430: -- Thanks for the hint! I totally missed the

[GitHub] [lucene] jtibshirani commented on pull request #734: LUCENE-10408 Test: correct type of checksum

2022-03-07 Thread GitBox
jtibshirani commented on pull request #734: URL: https://github.com/apache/lucene/pull/734#issuecomment-1061145754 I just noticed: we need a similar fix for `BaseKnnVectorsFormatTestCase#testSparseVectors`. I saw this test fail in CI because of a slight checksum mismatch (unfortunately the

[GitHub] [lucene] jtibshirani removed a comment on pull request #734: LUCENE-10408 Test: correct type of checksum

2022-03-07 Thread GitBox
jtibshirani removed a comment on pull request #734: URL: https://github.com/apache/lucene/pull/734#issuecomment-1061145754 I just noticed: we need a similar fix for `BaseKnnVectorsFormatTestCase#testSparseVectors`. I saw this test fail in CI because of a slight checksum mismatch (unfortuna

[jira] [Created] (LUCENE-10457) LuceneTestCase.createTempDir could randomly return symbolic links

2022-03-07 Thread Mike Drob (Jira)
Mike Drob created LUCENE-10457: -- Summary: LuceneTestCase.createTempDir could randomly return symbolic links Key: LUCENE-10457 URL: https://issues.apache.org/jira/browse/LUCENE-10457 Project: Lucene - Cor

[jira] [Commented] (LUCENE-10457) LuceneTestCase.createTempDir could randomly return symbolic links

2022-03-07 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502623#comment-17502623 ] Robert Muir commented on LUCENE-10457: -- I don't think createTempDir should do this

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/8/22, 12:42 AM:

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/8/22, 12:43 AM:

[jira] [Commented] (LUCENE-10457) LuceneTestCase.createTempDir could randomly return symbolic links

2022-03-07 Thread Mike Drob (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502663#comment-17502663 ] Mike Drob commented on LUCENE-10457: > we should suffer the complexity of fixing al

[jira] [Created] (LUCENE-10458) BoundedDocSetIdIterator may supply error count in Weight#count(LeafReaderContext) when missingValue enables

2022-03-07 Thread Lu Xugang (Jira)
Lu Xugang created LUCENE-10458: -- Summary: BoundedDocSetIdIterator may supply error count in Weight#count(LeafReaderContext) when missingValue enables Key: LUCENE-10458 URL: https://issues.apache.org/jira/browse/LUCEN

[jira] [Commented] (LUCENE-10458) BoundedDocSetIdIterator may supply error count in Weight#count(LeafReaderContext) when missingValue enables

2022-03-07 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502773#comment-17502773 ] Lu Xugang commented on LUCENE-10458: That is why in BoundedDocSetIdIterator#advance

[jira] [Comment Edited] (LUCENE-10458) BoundedDocSetIdIterator may supply error count in Weight#count(LeafReaderContext) when missingValue enables

2022-03-07 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502773#comment-17502773 ] Lu Xugang edited comment on LUCENE-10458 at 3/8/22, 7:32 AM:

[GitHub] [lucene] LuXugang opened a new pull request #736: LUCENE-10458: BoundedDocSetIdIterator may supply error count in Weight#count(LeafReaderContext) when missingValue enables

2022-03-07 Thread GitBox
LuXugang opened a new pull request #736: URL: https://github.com/apache/lucene/pull/736 When IndexSortSortedNumericDocValuesRangeQuery can take advantage of index sort, Weight#count will use BoundedDocSetIdIterator's lastDoc and firstDoc to calculate count, but if missingValue enables, tho
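
The pull request description above is cut off, so the following is only a guess at the scenario it describes, not Lucene's code. It is a minimal sketch that assumes documents lacking the field are sorted as if they held the missing value, and so can fall between firstDoc and lastDoc without matching the range. The class, methods, and data model are hypothetical.

```java
public class MissingValueCountSketch {
    // Hypothetical model: docs[i] is the field value of doc i, or null if the
    // doc has no value. Docs are assumed sorted by value, with missing docs
    // ordered as if they held missingValue.
    static long effective(Long v, long missingValue) {
        return v == null ? missingValue : v;
    }

    // Count as (lastDoc - firstDoc): every doc between the bounds is counted,
    // including docs that have no value at all.
    static int countByBounds(Long[] docs, long missingValue, long lower, long upper) {
        int firstDoc = 0;
        while (firstDoc < docs.length && effective(docs[firstDoc], missingValue) < lower) {
            firstDoc++;
        }
        int lastDoc = firstDoc;
        while (lastDoc < docs.length && effective(docs[lastDoc], missingValue) <= upper) {
            lastDoc++;
        }
        return lastDoc - firstDoc;
    }

    // Count only docs that really have a value inside [lower, upper].
    static int countByValues(Long[] docs, long lower, long upper) {
        int count = 0;
        for (Long v : docs) {
            if (v != null && v >= lower && v <= upper) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Long[] docs = {1L, 3L, null, 5L, 7L, 9L}; // doc 2 has no value
        System.out.println(countByBounds(docs, 5L, 3L, 7L)); // counts the missing doc too
        System.out.println(countByValues(docs, 3L, 7L));     // counts only real matches
    }
}
```

In this toy data the bounds-based count is 4 while only 3 documents actually hold a value in the range, which shows how a count derived purely from the two bounds could overstate the number of matches.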