[GitHub] [lucene] zouxiang1993 opened a new pull request, #12602: Reduce collection operations when minShouldMatch == 0.

2023-09-27 Thread via GitHub
zouxiang1993 opened a new pull request, #12602: URL: https://github.com/apache/lucene/pull/12602 ### Description -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsu

[GitHub] [lucene] benwtrent commented on pull request #12590: Allow implementers of AbstractKnnVectorQuery to access final topK results

2023-09-27 Thread via GitHub
benwtrent commented on PR #12590: URL: https://github.com/apache/lucene/pull/12590#issuecomment-1737897805 > Overriding rewrite (first calling super.rewrite and looking at the rewritten Query) I would say don't call `super.rewrite` and just copy paste things. Or use a collector

[GitHub] [lucene] iverase opened a new issue, #12601: Reproducible TestDrillSideways failure

2023-09-27 Thread via GitHub
iverase opened a new issue, #12601: URL: https://github.com/apache/lucene/issues/12601 ### Description The following gradle command fails reproducibly on main with the following error: ``` > java.lang.AssertionError > at __randomizedtesting.SeedInfo.see

[GitHub] [lucene] iverase commented on a diff in pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase commented on code in PR #12600: URL: https://github.com/apache/lucene/pull/12600#discussion_r1338780906 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -290,6 +312,23 @@ public byte readByte(long pos) throws IOException { } }

[GitHub] [lucene] uschindler commented on a diff in pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on code in PR #12600: URL: https://github.com/apache/lucene/pull/12600#discussion_r1338741133 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -290,6 +312,23 @@ public byte readByte(long pos) throws IOException { }

[GitHub] [lucene] uschindler commented on a diff in pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on code in PR #12600: URL: https://github.com/apache/lucene/pull/12600#discussion_r1338738357 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -290,6 +312,23 @@ public byte readByte(long pos) throws IOException { }

[GitHub] [lucene] iverase commented on a diff in pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase commented on code in PR #12600: URL: https://github.com/apache/lucene/pull/12600#discussion_r1338733436 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -290,6 +312,23 @@ public byte readByte(long pos) throws IOException { } }

[GitHub] [lucene] uschindler commented on a diff in pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on code in PR #12600: URL: https://github.com/apache/lucene/pull/12600#discussion_r1338712090 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -290,6 +312,23 @@ public byte readByte(long pos) throws IOException { }

[GitHub] [lucene] uschindler commented on pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on PR #12600: URL: https://github.com/apache/lucene/pull/12600#issuecomment-1737527376 I changed to draft until all is tested.

[GitHub] [lucene] uschindler commented on pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on PR #12600: URL: https://github.com/apache/lucene/pull/12600#issuecomment-1737520431 I need more time to review this as I am a bit crowded. Sorry for possible delays. If you get tests running with all kinds of mmap, just give me some feedback.

[GitHub] [lucene] uschindler commented on pull request #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
uschindler commented on PR #12600: URL: https://github.com/apache/lucene/pull/12600#issuecomment-1737512784 Hi, > @uschindler I tried to provide implementations for the `MemorySegmentIndexInputProvider`. I am a bit confused because I run `TestMmapDirectory` several times hoping it wi

[GitHub] [lucene] iverase opened a new pull request, #12600: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase opened a new pull request, #12600: URL: https://github.com/apache/lucene/pull/12600 Adds a new method to RandomAccessInput to bulk read bytes into a provided byte array. The default implementation reads byte by byte but faster implementations are provided for all lucene implementat
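
The idea described in this pull request (a byte-by-byte default with faster overrides) can be illustrated with a minimal, self-contained Java sketch. It does not use Lucene's actual classes: `ReadBytesSketch`, `Input`, and `ArrayInput` are hypothetical names invented for illustration, and the method signature shown is an assumption rather than the one in the PR.

```java
/** Hypothetical illustration only; these types are not part of Lucene. */
final class ReadBytesSketch {

  /** Stand-in for a random-access input that can be read at absolute positions. */
  interface Input {
    long length();

    byte readByte(long pos);

    /** Default bulk read: falls back to one readByte call per requested byte. */
    default void readBytes(long pos, byte[] dst, int offset, int len) {
      for (int i = 0; i < len; i++) {
        dst[offset + i] = readByte(pos + i);
      }
    }
  }

  /** Array-backed input that overrides the default with a single bulk copy. */
  static final class ArrayInput implements Input {
    private final byte[] data;

    ArrayInput(byte[] data) {
      this.data = data;
    }

    @Override
    public long length() {
      return data.length;
    }

    @Override
    public byte readByte(long pos) {
      return data[(int) pos];
    }

    @Override
    public void readBytes(long pos, byte[] dst, int offset, int len) {
      // One arraycopy replaces len individual readByte calls.
      System.arraycopy(data, (int) pos, dst, offset, len);
    }
  }

  public static void main(String[] args) {
    Input in = new ArrayInput(new byte[] {10, 20, 30, 40, 50});
    byte[] dst = new byte[3];
    in.readBytes(1, dst, 0, 3); // copies the bytes at positions 1..3 into dst
    System.out.println(java.util.Arrays.toString(dst));
  }
}
```

Callers see the same method on every implementation; only implementations that can copy in bulk need to override the default.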

[GitHub] [lucene] iverase opened a new issue, #12599: Add readBytes method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase opened a new issue, #12599: URL: https://github.com/apache/lucene/issues/12599 We can currently read a RandomAccessInput byte by byte but it does not provide a method to bulk read a chunk of bytes, similarly to what DataInput provides with DataInput#readBytes(byte[], int, int). Ther

[GitHub] [lucene] gf2121 closed issue #12598: FST#Compiler allocates too much memory

2023-09-27 Thread via GitHub
gf2121 closed issue #12598: FST#Compiler allocates too much memory URL: https://github.com/apache/lucene/issues/12598

[GitHub] [lucene] iverase merged pull request #12594: Add length method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase merged PR #12594: URL: https://github.com/apache/lucene/pull/12594

[GitHub] [lucene] iverase closed issue #12592: Add length method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase closed issue #12592: Add length method to RandomAccessInput URL: https://github.com/apache/lucene/issues/12592

[GitHub] [lucene] s1monw commented on pull request #12595: Make IndexWriter#flushNextBuffer also apply deletes if necessary

2023-09-27 Thread via GitHub
s1monw commented on PR #12595: URL: https://github.com/apache/lucene/pull/12595#issuecomment-1737068633 @jpountz I added a simplified version of your idea that applies deletes if they consume more RAM than the DWPT we are flushing. The applyAllDeletes() only has an impact if the `FlushContr

[GitHub] [lucene] gf2121 opened a new issue, #12598: FST#Compiler allocates too much memory

2023-09-27 Thread via GitHub
gf2121 opened a new issue, #12598: URL: https://github.com/apache/lucene/issues/12598 ### Description https://blunders.io/jfr-demo/indexing-4kb-2023.09.25.18.03.36/allocations-drill-down The allocation profile for nightly indexing benchmark shows that `FST#ByteStore` occupies

[GitHub] [lucene] romseygeek commented on pull request #12594: Add length method to RandomAccessInput

2023-09-27 Thread via GitHub
romseygeek commented on PR #12594: URL: https://github.com/apache/lucene/pull/12594#issuecomment-1736976414 > Because IndexInput already defines a method called length, so adding a size method means that all the IndexInput that implements RandomAccessInput (which there are a few) will have

[GitHub] [lucene] iverase commented on pull request #12594: Add length method to RandomAccessInput

2023-09-27 Thread via GitHub
iverase commented on PR #12594: URL: https://github.com/apache/lucene/pull/12594#issuecomment-1736975225 > One suggestion: given that ByteBuffersDataInput already has a size method, why not name the new interface method size as well? The two terms are more or less interchangeable, and that

[GitHub] [lucene] romseygeek commented on pull request #12594: Add length method to RandomAccessInput

2023-09-27 Thread via GitHub
romseygeek commented on PR #12594: URL: https://github.com/apache/lucene/pull/12594#issuecomment-1736960788 I think it's fine to add simple methods to interfaces like this in point releases. Extending RandomAccessInput would be a pretty expert use of lucene. One suggestion: given tha