Re: [PR] Use a hint to specify READONCE IOContext [lucene]

2025-04-16 Thread via GitHub


uschindler commented on PR #14509:
URL: https://github.com/apache/lucene/pull/14509#issuecomment-2808928300

   Could we get some context, or a linked issue, explaining what this is about?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] OptimisticKnnVectorQuery [lucene]

2025-04-16 Thread via GitHub


benwtrent commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2809670383

   > , I'm in favor of fixing it as a magic number that replicates something 
close to the current behavior (or better recall if we can and retain the same 
latency as we found with lambda=16) and letting users tune further using 
fanout. I think these are roughly equivalent and I don't think we should be 
exposing a lot of knobs.
   
   I am for this as well. Exposing more and more knobs makes things way too 
complicated.





[PR] Change uses of withReadAdvice to use hints instead [lucene]

2025-04-16 Thread via GitHub


thecoop opened a new pull request, #14510:
URL: https://github.com/apache/lucene/pull/14510

   (no comment)





Re: [PR] OptimisticKnnVectorQuery [lucene]

2025-04-16 Thread via GitHub


msokolov commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2810290056

   As for making this the default, that sounds OK to me, but let's not 
backport until we've had a chance to verify that it does no harm for a while 
in some pre-production environments.





[PR] [Bug] Fix for stored fields force merge regression [lucene]

2025-04-16 Thread via GitHub


bharath-techie opened a new pull request, #14512:
URL: https://github.com/apache/lucene/pull/14512

   ### Description
   
   Resolves https://github.com/apache/lucene/issues/14463 
   
   I have made changes similar to those in https://github.com/apache/lucene/pull/13985 
to set the read advice to sequential during merges.
   
   





[PR] Use a hint to specify READONCE IOContext [lucene]

2025-04-16 Thread via GitHub


thecoop opened a new pull request, #14509:
URL: https://github.com/apache/lucene/pull/14509

   (no comment)





Re: [PR] Use a hint to specify READONCE IOContext [lucene]

2025-04-16 Thread via GitHub


thecoop commented on PR #14509:
URL: https://github.com/apache/lucene/pull/14509#issuecomment-2808933733

   This is following on from https://github.com/apache/lucene/pull/14482. This 
is in draft, as I wanted to see that this refactoring worked cleanly based on 
the changes in #14482, but needs that PR merged first before it can go in.





Re: [I] Tone down TestIndexWriterDelete.testDeleteAllRepeated (OOMs sometimes) [lucene]

2025-04-16 Thread via GitHub


uschindler commented on issue #14508:
URL: https://github.com/apache/lucene/issues/14508#issuecomment-2809265915

   > [@msfroh](https://github.com/msfroh) would you know how to tackle this 
better than forcing FSDirectory in this test?
   
   Poor Policeman Jenkins NVMe disks! The new server has only been online for 
a few weeks and it is already eating into the "percentage_used" SMART counter:
   
   ```
   root@serv1 ~ # nvme smart-log /dev/nvme0
   Smart Log for NVME device:nvme0 namespace-id:
   critical_warning: 0
   temperature : 32 °C (305 K)
   available_spare : 100%
   available_spare_threshold   : 10%
   percentage_used : 4%
   endurance group critical warning summary: 0
   Data Units Read : 15428184 (7.90 TB)
   Data Units Written  : 46860669 (23.99 TB)
   host_read_commands  : 251666556
   host_write_commands : 787078812
   controller_busy_time: 3115
   power_cycles: 11
   power_on_hours  : 423
   unsafe_shutdowns: 4
   media_errors: 0
   num_err_log_entries : 0
   Warning Temperature Time: 0
   Critical Composite Temperature Time : 0
   Temperature Sensor 1   : 32 °C (305 K)
   Temperature Sensor 2   : 34 °C (307 K)
   Thermal Management T1 Trans Count   : 0
   Thermal Management T2 Trans Count   : 0
   Thermal Management T1 Total Time: 0
   Thermal Management T2 Total Time: 0
   root@serv1 ~ # nvme smart-log /dev/nvme1
   Smart Log for NVME device:nvme1 namespace-id:
   critical_warning: 0
   temperature : 34 °C (307 K)
   available_spare : 100%
   available_spare_threshold   : 5%
   percentage_used : 3%
   endurance group critical warning summary: 0
   Data Units Read : 10776282 (5.52 TB)
   Data Units Written  : 45766086 (23.43 TB)
   host_read_commands  : 345725745
   host_write_commands : 865033828
   controller_busy_time: 495
   power_cycles: 28
   power_on_hours  : 310
   unsafe_shutdowns: 12
   media_errors: 0
   num_err_log_entries : 69
   Warning Temperature Time: 0
   Critical Composite Temperature Time : 0
   Temperature Sensor 1   : 34 °C (307 K)
   Thermal Management T1 Trans Count   : 0
   Thermal Management T2 Trans Count   : 0
   Thermal Management T1 Total Time: 0
   Thermal Management T2 Total Time: 0
   ```
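
The totals in the log above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original comment: it assumes the standard NVMe convention that one data unit is 1000 blocks of 512 bytes, and it assumes writes were spread evenly over `power_on_hours` to derive an average rate. The class and method names are hypothetical.

```java
public class NvmeWearSketch {
  // One NVMe SMART "data unit" is 1000 blocks of 512 bytes.
  static final long BYTES_PER_DATA_UNIT = 512_000L;

  /** Converts a data-unit counter to decimal terabytes (10^12 bytes). */
  public static double terabytes(long dataUnits) {
    return dataUnits * (double) BYTES_PER_DATA_UNIT / 1e12;
  }

  /** Average gigabytes (10^9 bytes) per power-on hour; assumes an even spread. */
  public static double gigabytesPerHour(long dataUnits, long powerOnHours) {
    return dataUnits * (double) BYTES_PER_DATA_UNIT / 1e9 / powerOnHours;
  }

  public static void main(String[] args) {
    // Counters copied from the smart-log output for nvme0 and nvme1.
    System.out.println("nvme0 written TB: " + terabytes(46_860_669L));
    System.out.println("nvme0 avg GB/h:   " + gigabytesPerHour(46_860_669L, 423));
    System.out.println("nvme1 written TB: " + terabytes(45_766_086L));
    System.out.println("nvme1 avg GB/h:   " + gigabytesPerHour(45_766_086L, 310));
  }
}
```

Under those assumptions, the conversion reproduces the 23.99 TB and 23.43 TB figures printed by `nvme smart-log`, and the average write rate works out to roughly 57 GB and 76 GB per power-on hour for the two devices.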





Re: [PR] deps(java): bump com.carrotsearch.randomizedtesting:randomizedtesting-runner from 2.8.1 to 2.8.3 [lucene]

2025-04-16 Thread via GitHub


dweiss merged PR #14504:
URL: https://github.com/apache/lucene/pull/14504





Re: [PR] deps(java): bump xerces:xercesImpl from 2.12.0 to 2.12.2 [lucene]

2025-04-16 Thread via GitHub


dweiss merged PR #14502:
URL: https://github.com/apache/lucene/pull/14502





[PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]

2025-04-16 Thread via GitHub


expani opened a new pull request, #14511:
URL: https://github.com/apache/lucene/pull/14511

   ### Description
   
   Fix for https://github.com/apache/lucene/issues/14445 
   
   This falls back to returning a `SlowImpactsEnum` for all default cases, but 
ensures that skip data is read when a field is indexed with 
`IndexOptions.DOCS`, by returning a non-competitive impact. 
   
   This is required because we stopped storing a default impact for such cases 
starting with the 912 postings format. 
   
   
https://github.com/apache/lucene/blob/main/lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene99/Lucene99PostingsWriter.java#L275
   
   
   
   





Re: [PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]

2025-04-16 Thread via GitHub


expani commented on code in PR #14511:
URL: https://github.com/apache/lucene/pull/14511#discussion_r2047829308


##
lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java:
##
@@ -1310,7 +1317,7 @@ public List<Impact> getImpacts(int level) {
 return readImpacts(level1SerializedImpacts, level1Impacts);
   }
 }
-return DUMMY_IMPACTS;
+return NON_COMPETITIVE_IMPACTS;

Review Comment:
   Good catch. 
   This is unused after the change, so I can remove it.






Re: [PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]

2025-04-16 Thread via GitHub


msfroh commented on code in PR #14511:
URL: https://github.com/apache/lucene/pull/14511#discussion_r2047793102


##
lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java:
##
@@ -1310,7 +1317,7 @@ public List<Impact> getImpacts(int level) {
 return readImpacts(level1SerializedImpacts, level1Impacts);
   }
 }
-return DUMMY_IMPACTS;
+return NON_COMPETITIVE_IMPACTS;

Review Comment:
   This was the only reference to `DUMMY_IMPACTS`, right? Can we remove it?



##
lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java:
##
@@ -282,6 +288,10 @@ public PostingsEnum postings(
   @Override
   public ImpactsEnum impacts(FieldInfo fieldInfo, BlockTermState state, int 
flags)
   throws IOException {
+if (state.docFreq <= BLOCK_SIZE) {
+  // no skip data
+  return new SlowImpactsEnum(postings(fieldInfo, state, null, flags));
+}

Review Comment:
   This is essentially taking the place of `DUMMY_IMPACTS`, right? 
   
   It's the thing that kicks in on tail blocks, which is what `DUMMY_IMPACTS` 
was there for. (I'm trying to make sure I understand the change.)






[I] IndexWriter forceMergeDeletes should return its MergeSpec [lucene]

2025-04-16 Thread via GitHub


vigyasharma opened a new issue, #14515:
URL: https://github.com/apache/lucene/issues/14515

   IndexWriter provides a `forceMergeDeletes` API that triggers force merging 
of all segments that have deleted documents, allowing users to expunge deletes 
up to a configurable delete percentage (set via 
`setForceMergeDeletesPctAllowed()`). 
   
   The API provides a blocking variant, which waits until the merges complete, 
and a non-blocking variant, which starts the merges in background threads and 
returns. For the non-blocking version, it would be nice to be able to monitor 
whether the merges have completed. It turns out that all we need for this is to 
return the `MergeSpecification` that defines the merges triggered by the API.
   
   Indeed, the blocking variant of this API itself uses this spec to wait until 
all merges have completed. This is what happens if you use the 
`ConcurrentMergeScheduler`, which starts merges in the background, but invoke 
the API with `doWait=true`. However, there are benefits to being able to 
monitor from outside the API, such as waiting only up to a maximum timeout, or 
reporting metrics on the progress of these merges.
   
   The proposed change is to change this API's return type from void to 
`MergePolicy.MergeSpecification` and return the `spec` object.
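
To illustrate the kind of external monitoring described above, here is a minimal, self-contained sketch. It does not use Lucene's classes: each pending merge is modelled as a plain `CompletableFuture<Void>`, which is only a stand-in for whatever completion signal the returned `MergeSpecification` would expose. The class `MergeMonitorSketch` and its methods are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class MergeMonitorSketch {

  /**
   * Waits for every pending merge, but only up to the given timeout.
   * Returns true if all merges finished in time, false on timeout,
   * failure of any merge, or interruption.
   */
  public static boolean awaitAll(
      List<CompletableFuture<Void>> merges, long timeout, TimeUnit unit) {
    CompletableFuture<Void> all =
        CompletableFuture.allOf(merges.toArray(new CompletableFuture<?>[0]));
    try {
      all.get(timeout, unit);
      return true;
    } catch (TimeoutException | ExecutionException e) {
      return false; // still running after the timeout, or a merge failed
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  /** Counts merges that have finished, e.g. to report a progress metric. */
  public static long completedCount(List<CompletableFuture<Void>> merges) {
    return merges.stream().filter(CompletableFuture::isDone).count();
  }

  public static void main(String[] args) {
    CompletableFuture<Void> finished = CompletableFuture.completedFuture(null);
    CompletableFuture<Void> running = new CompletableFuture<>();
    List<CompletableFuture<Void>> spec = List.of(finished, running);

    System.out.println("completed: " + completedCount(spec));
    System.out.println("all done in 50 ms: "
        + awaitAll(spec, 50, TimeUnit.MILLISECONDS));

    running.complete(null); // simulate the background merge finishing
    System.out.println("all done in 50 ms: "
        + awaitAll(spec, 50, TimeUnit.MILLISECONDS));
  }
}
```

With the real API, a caller would presumably obtain the spec from the non-blocking `forceMergeDeletes` call and apply the same pattern to it: a bounded wait for callers that need a deadline, and a cheap completed-count poll for metrics.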





Re: [I] Backport Bot [lucene]

2025-04-16 Thread via GitHub


jainankitk commented on issue #14496:
URL: https://github.com/apache/lucene/issues/14496#issuecomment-2811880891

   We have a similar backport workflow in OpenSearch that might be useful: 
https://github.com/opensearch-project/OpenSearch/blob/main/.github/workflows/backport.yml
   You just need to add the backport- label, and it creates a backport PR once 
the labeled PR is merged.

