Re: [PR] Allow skip cache factor to be updated dynamically [lucene]
jpountz commented on code in PR #14412: URL: https://github.com/apache/lucene/pull/14412#discussion_r2019755560 ## lucene/CHANGES.txt: ## @@ -47,6 +47,7 @@ Other - * GITHUB#14229: Bump minimum required Java version to 25 +* GITHUB#14412: Allow skip cache factor to be updated dynamically. (Sagar Upadhyaya) Review Comment: Can you move the CHANGES entry under version 10.2 below, section "New Features"? ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -142,6 +142,20 @@ public LRUQueryCache( missCount = new LongAdder(); } + public float getSkipCacheFactor() { +return skipCacheFactor; + } + + /** + * This setter enables the skipCacheFactor to be updated dynamically. + * + * @param skipCacheFactor determines whether we need to skip the cache operation based on cost and + * lead cost. + */ + public void setSkipCacheFactor(float skipCacheFactor) { Review Comment: Can you explain what this factor does, e.g. by copying this from the ctor docs: "clauses whose cost is {@code skipCacheFactor} times more than the cost of the top-level query will not be cached in order to not slow down queries too much." ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -142,6 +142,20 @@ public LRUQueryCache( missCount = new LongAdder(); } + public float getSkipCacheFactor() { Review Comment: Can you add javadocs here too? Lucene aims at having all public APIs documented to make javadocs nicer to browse by users. In this case it could mostly refer to the setter, something like ``` Get the skip cache factor. @see #setSkipCacheFactor ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Allow skip cache factor to be updated dynamically [lucene]
kkewwei commented on PR #14412: URL: https://github.com/apache/lucene/pull/14412#issuecomment-2763254947 I am wondering if we should also make `maxRamBytesUsed` dynamic as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] cache preset dict for LZ4WithPresetDictDecompressor [lucene]
kkewwei commented on code in PR #14397: URL: https://github.com/apache/lucene/pull/14397#discussion_r2019766795 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java: ## @@ -512,6 +512,7 @@ private void doReset(int docID) throws IOException { bytes.offset = bytes.length = 0; for (int decompressed = 0; decompressed < totalLength; ) { final int toDecompress = Math.min(totalLength - decompressed, chunkSize); +decompressor.reset(); decompressor.decompress(fieldsStream, toDecompress, 0, toDecompress, spare); Review Comment: It seems that `reset` is essential. When the block changes, we must discard the cache in time, this operation can only be detected from external. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] quick exit on filter query matching no docs when rewriting knn query [lucene]
bugmakerr commented on PR #14418: URL: https://github.com/apache/lucene/pull/14418#issuecomment-2763256416 > Can you help me understand what work this change helps save? Thanks for your reply. I think if the filter query matches no docs, the knn query must also match no docs. Before this small change, we always executed search tasks for each leaf reader even if the filter query is `MatchNoDocsQuery`. https://github.com/apache/lucene/blob/a7ff87fdb5479a2f1e89746d88c80023b0aecbb1/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L89-L98 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Allow skip cache factor to be updated dynamically [lucene]
sgup432 commented on code in PR #14412: URL: https://github.com/apache/lucene/pull/14412#discussion_r2019868684 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -142,6 +142,20 @@ public LRUQueryCache( missCount = new LongAdder(); } + public float getSkipCacheFactor() { Review Comment: Sure! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Fix HistogramCollector to not create zero-count buckets. [lucene]
jpountz commented on code in PR #14421: URL: https://github.com/apache/lucene/pull/14421#discussion_r2019755410 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -279,7 +279,9 @@ public void collect(DocIdStream stream) throws IOException { public void finish() throws IOException { // Put counts that we computed in the int[] back into the hash map. for (int i = 0; i < counts.length; ++i) { -collectorCounts.addTo(leafMinBucket + i, counts[i]); +if (counts[i] != 0) { + collectorCounts.addTo(leafMinBucket + i, counts[i]); +} } checkMaxBuckets(collectorCounts.size(), maxBuckets); Review Comment: I think it's ok the way it is. The end goal is to prevent unbounded heap allocation. In this case, the amount of excess heap we may allocate is bounded by 1024 entries, so I'd err on the side of simplicity by not checking the number of buckets in the loop? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Allow skip cache factor to be updated dynamically [lucene]
sgup432 commented on code in PR #14412: URL: https://github.com/apache/lucene/pull/14412#discussion_r2019868684 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -142,6 +142,20 @@ public LRUQueryCache( missCount = new LongAdder(); } + public float getSkipCacheFactor() { Review Comment: Sure. Added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Allow skip cache factor to be updated dynamically [lucene]
sgup432 commented on code in PR #14412: URL: https://github.com/apache/lucene/pull/14412#discussion_r2019870708 ## lucene/CHANGES.txt: ## @@ -47,6 +47,7 @@ Other - * GITHUB#14229: Bump minimum required Java version to 25 +* GITHUB#14412: Allow skip cache factor to be updated dynamically. (Sagar Upadhyaya) Review Comment: Added in desired section. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org