[PR] [KNN] Add comment and remove duplicate code [lucene]

2024-07-20 Thread via GitHub


dungba88 opened a new pull request, #13594:
URL: https://github.com/apache/lucene/pull/13594

   ### Description
   
   - Add comments to methods for easier to understand their behaviors
   - Remove duplicate codes to createBitSet used in both the main top-k KNN 
query and vector similarity query
   - Also remove unused variable `weight`
   
   To discuss: The `createFilterWeight` might also merit to be shared with the 
vector similarity query. There is a discrepancy between the 2 classes: the 
top-k KNN query enhance the filter so that only documents having values on the 
KNN field will be matched, but the vector similarity query doesn't. I think 
they should share similar behavior as they are both vector search.
   
   Moreover, should we make their behavior more closely so that they can 
share/extend one another? The top-k KNN query has the main logic inside 
rewriteQuery while vector-similarity query has the main logic in Weight.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-20 Thread via GitHub


ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1685433524


##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java:
##
@@ -157,6 +157,8 @@ static Arena getSharedArena(
   arenas.computeIfAbsent(key, s -> new RefCountedSharedArena(s, () -> 
arenas.remove(s)));
   if (refCountedArena.acquire()) {
 return refCountedArena;
+  } else {
+arenas.remove(key);

Review Comment:
   Thanks, this is nice. Done. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-20 Thread via GitHub


ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1685433634


##
lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java:
##
@@ -47,9 +48,12 @@ boolean acquire() {
 int value;
 while (true) {
   value = state.get();
-  if (value < OPEN) {
+  if (value >= LIMIT) {

Review Comment:
   I pushed an implementation of this that allows a total of 1024 acquires per 
an instance.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-20 Thread via GitHub


ChrisHegarty commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2241152532

   Subject to review comment, I think the only thing that remains is whether we 
want to allow the default number of total ref counting to be configurable, say 
by a system property or something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org