Re: [PR] LUCENE-10641: IndexSearcher#setTimeout should also abort query rewrites, point ranges and vector searches [lucene]

2023-10-29 Thread via GitHub


Deepika0510 commented on PR #12345:
URL: https://github.com/apache/lucene/pull/12345#issuecomment-1784082772

   However, to get that `TimeoutLeafReader` into use, we would need to go through the `ReaderContext` class route(?). In other words, we would need some mechanism in `ReaderClass` to know whether a timeout has been applied and then return a `TimeoutLeafReader` object.





[PR] Use growNoCopy for SortingStoredFieldsConsumer#NO_COMPRESSION [lucene]

2023-10-29 Thread via GitHub


gf2121 opened a new pull request, #12733:
URL: https://github.com/apache/lucene/pull/12733

   (no comment)





Re: [PR] LUCENE-10641: IndexSearcher#setTimeout should also abort query rewrites, point ranges and vector searches [lucene]

2023-10-29 Thread via GitHub


mikemccand commented on PR #12345:
URL: https://github.com/apache/lucene/pull/12345#issuecomment-1784151630

   Hmm, I'm confused: why would you need to get to the `TimeoutLeafReader`? Don't you create this timeout reader, passing the timeout to it (which will then apply to all queries), after which you don't need to get to it anymore? Just catch the timeout exception when running searches?
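
   For illustration, a minimal sketch of that pattern using Lucene's existing `ExitableDirectoryReader` and `QueryTimeoutImpl` as stand-ins for the `TimeoutLeafReader` under discussion (the wrapper name and wiring in this PR are still open; `reader` and `query` are assumed to exist, and class locations are as of recent 9.x):

   ```
   import org.apache.lucene.index.DirectoryReader;
   import org.apache.lucene.index.ExitableDirectoryReader;
   import org.apache.lucene.index.QueryTimeoutImpl;
   import org.apache.lucene.search.IndexSearcher;
   import org.apache.lucene.search.TopDocs;

   // Wrap the reader once with a timeout budget; it then applies to every query
   // run against it, so callers never need to reach the wrapper again.
   DirectoryReader wrapped =
       ExitableDirectoryReader.wrap(reader, new QueryTimeoutImpl(1000)); // 1s budget (assumed)
   IndexSearcher searcher = new IndexSearcher(wrapped);
   try {
     TopDocs hits = searcher.search(query, 10); // IOException handling omitted
   } catch (ExitableDirectoryReader.ExitingReaderException e) {
     // The wrapped reader hit the timeout somewhere during the search.
   }
   ```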





Re: [I] Always collect sparsely in TaxonomyFacets & switch to dense if there are enough unique labels [lucene]

2023-10-29 Thread via GitHub


mikemccand commented on issue #12576:
URL: https://github.com/apache/lucene/issues/12576#issuecomment-1784152872

   Could we make the collection dynamic?  Collect into a sparse structure at 
first, and if it gets too big, switch to dense.
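
   A rough illustration of that dynamic idea (not the actual `TaxonomyFacets` code; the threshold, key type, and class below are assumptions for the sketch):

   ```
   import java.util.HashMap;
   import java.util.Map;

   // Count into a sparse map first; once the number of unique ordinals grows past
   // a threshold, copy the counts into a dense array and keep collecting there.
   class DynamicCounts {
     private static final int DENSE_THRESHOLD = 1 << 12; // assumed cut-over point
     private final int maxOrd;
     private Map<Integer, Integer> sparse = new HashMap<>();
     private int[] dense; // null until we switch

     DynamicCounts(int maxOrd) {
       this.maxOrd = maxOrd;
     }

     void increment(int ord) {
       if (dense != null) {
         dense[ord]++;
         return;
       }
       sparse.merge(ord, 1, Integer::sum);
       if (sparse.size() > DENSE_THRESHOLD) {
         dense = new int[maxOrd];
         sparse.forEach((o, c) -> dense[o] = c);
         sparse = null; // collection is dense from here on
       }
     }

     int get(int ord) {
       return dense != null ? dense[ord] : sparse.getOrDefault(ord, 0);
     }
   }
   ```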





Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-29 Thread via GitHub


mikemccand commented on PR #12506:
URL: https://github.com/apache/lucene/pull/12506#issuecomment-1784153400

   +1 to move to `oal.index`, and make it package-private if possible? The `ByteSlicePool` name sounds good to me :) Naming is the hardest part!





Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-29 Thread via GitHub


mikemccand commented on code in PR #12506:
URL: https://github.com/apache/lucene/pull/12506#discussion_r1375466035


##
lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java:
##
@@ -129,21 +143,22 @@ public ByteBlockPool(Allocator allocator) {
   }
 
   /**
-   * Resets the pool to its initial state reusing the first buffer and fills all buffers with
-   * 0 bytes before they reused or passed to {@link Allocator#recycleByteBlocks(byte[][],
-   * int, int)}. Calling {@link ByteBlockPool#nextBuffer()} is not needed after reset.
+   * Resets the pool to its initial state, reusing the first buffer and filling all buffers with
+   * {@code 0} bytes before they are reused or passed to {@link

Review Comment:
   +1 to make this cleaner (i.e., who does the zeroing). Why does byte slicing even require pre-zeroed buffers?
   
   Maybe open a follow-on issue for this? This change is already great.






Re: [PR] StringsToAutomaton#build to take List as parameter instead of Collection [lucene]

2023-10-29 Thread via GitHub


mikemccand commented on PR #12427:
URL: https://github.com/apache/lucene/pull/12427#issuecomment-1784156071

   > Below are the results (looks all good to me). Let me know what you think? Thanks!
   
   +1 -- looks like just noise to me.





[I] Should resetting a ByteBlockPool zero out the buffers? [lucene]

2023-10-29 Thread via GitHub


stefanvodita opened a new issue, #12734:
URL: https://github.com/apache/lucene/issues/12734

   ### Description
   
   `ByteBlockPool.reset` can fill the buffers we're recycling with zeros.
   1. Do we need the buffers to be filled with zeros? Is there some implicit assumption that, if we were to reuse them, we would only encounter zeros?
   2. If yes, should the responsibility move from `ByteBlockPool` to the `Allocator`'s `recycleByteBlocks` method? The `ByteBlockPool` loses its reference to the buffers it has just zeroed, so it seems outside its concern to zero them. The `Allocator` could zero the buffers in its recycle method if it needs to (a rough sketch of that option follows below).
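
   If the answer to (1) turns out to be yes, a rough sketch of what option (2) could look like, built on the existing `ByteBlockPool.Allocator#recycleByteBlocks(byte[][], int, int)` hook (the allocator class itself is hypothetical):

   ```
   import java.util.Arrays;
   import org.apache.lucene.util.ByteBlockPool;

   // Hypothetical allocator that takes over the zeroing when blocks are handed
   // back, so ByteBlockPool.reset would not need to touch buffers it is about
   // to lose its reference to.
   class ZeroOnRecycleAllocator extends ByteBlockPool.Allocator {
     ZeroOnRecycleAllocator() {
       super(ByteBlockPool.BYTE_BLOCK_SIZE);
     }

     @Override
     public void recycleByteBlocks(byte[][] blocks, int start, int end) {
       for (int i = start; i < end; i++) {
         // Zero here, only if reuse actually depends on all-zero buffers.
         Arrays.fill(blocks[i], (byte) 0);
       }
     }
   }
   ```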





Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-29 Thread via GitHub


stefanvodita commented on code in PR #12506:
URL: https://github.com/apache/lucene/pull/12506#discussion_r1375520679


##
lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java:
##
@@ -129,21 +143,22 @@ public ByteBlockPool(Allocator allocator) {
   }
 
   /**
-   * Resets the pool to its initial state reusing the first buffer and fills all buffers with
-   * 0 bytes before they reused or passed to {@link Allocator#recycleByteBlocks(byte[][],
-   * int, int)}. Calling {@link ByteBlockPool#nextBuffer()} is not needed after reset.
+   * Resets the pool to its initial state, reusing the first buffer and filling all buffers with
+   * {@code 0} bytes before they are reused or passed to {@link

Review Comment:
   Right, there's enough going on in this PR already. I opened an issue to look 
into this separately: https://github.com/apache/lucene/issues/12734






Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-29 Thread via GitHub


stefanvodita commented on PR #12506:
URL: https://github.com/apache/lucene/pull/12506#issuecomment-1784239004

   Thanks @mikemccand! I just pushed a commit that does that move.





Re: [I] FSTCompiler's NodeHash should fully duplicate `byte[]` slices from the growing FST [lucene]

2023-10-29 Thread via GitHub


dungba88 commented on issue #12714:
URL: https://github.com/apache/lucene/issues/12714#issuecomment-1784418885

   If we are to move to a value-based LRU cache and no longer fall back to reading the FST when items are not in the map, I'm wondering why we wouldn't just use a `LinkedHashMap` (or any doubly-linked-list based LRU map)? It sounds to me like we could write through to such a map (write to both the map and the main FST), and evict the least-recently written or read node.
   
   Concretely, we could add a node struct like
   
   ```
   class CacheNode {
     long address;
     UnCompiledNode node;
   }
   ```
   
   And then use `LinkedHashMap`.
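
   For illustration, a minimal access-ordered `LinkedHashMap` along those lines (the `Long` key, the `CacheNode` value from above, and the size bound are assumptions, not the actual `NodeHash` code):

   ```
   // java.util.LinkedHashMap with accessOrder=true keeps entries in LRU order
   // across both reads and writes; removeEldestEntry evicts the least-recently
   // used node once the cache exceeds the budget.
   final int maxCachedNodes = 1_000_000; // hypothetical budget
   Map<Long, CacheNode> lru =
       new LinkedHashMap<Long, CacheNode>(16, 0.75f, true) {
         @Override
         protected boolean removeEldestEntry(Map.Entry<Long, CacheNode> eldest) {
           return size() > maxCachedNodes;
         }
       };
   ```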

