stefanvodita commented on issue #12989:
URL: https://github.com/apache/lucene/issues/12989#issuecomment-1878171579

   @msfroh - I was looking into this as well and had some thoughts about how to 
do it.
   
   We could replace [`ParallelTaxonomyArrays`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/ParallelTaxonomyArrays.java) with a new interface that offers three operations for each of the arrays:
   ```
   interface ChunkedParallelTaxonomyArrays {
   
     /* Record new entry. */
     void appendParent(int parent);
   
     /* Retrieve this ordinal's parent. From the user's perspective, this is like an array look-up. */
     int getParent(int ord);
   
     /* There are some places where we need to know how many parents exist in total. */
     int sizeParents();
   
     // Same for children and siblings
     ...
   }
   ```
   
   To implement this interface, we could use an [`IntBlockPool`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/core/src/java/org/apache/lucene/util/IntBlockPool.java). We would allocate new int buffers in the block pool as needed and preserve the block pool across [`DirectoryTaxonomyReader`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java) refreshes.
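
To make the idea concrete, here is a minimal sketch of the chunked storage behind one of the three arrays. It assumes a plain list of fixed-size `int[]` chunks as a stand-in for `IntBlockPool` and does not use the real `IntBlockPool` API. `ChunkedIntArray` and `chunkShift` are hypothetical names; `append`, `get` and `size` correspond to `appendParent`, `getParent` and `sizeParents`. The sketch is not thread-safe, so sharing it across refreshes would need additional care.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for one of the taxonomy arrays (parents, children or siblings).
 * Values live in fixed-size int chunks, so appending never copies existing entries.
 */
final class ChunkedIntArray {
  private final int chunkShift; // log2 of the chunk size (an assumed parameter)
  private final int chunkMask;  // chunk size - 1, used to find the slot inside a chunk
  private final List<int[]> chunks = new ArrayList<>();
  private int size;             // number of entries appended so far

  ChunkedIntArray(int chunkShift) {
    if (chunkShift < 0 || chunkShift > 30) {
      throw new IllegalArgumentException("chunkShift must be in [0, 30]: " + chunkShift);
    }
    this.chunkShift = chunkShift;
    this.chunkMask = (1 << chunkShift) - 1;
  }

  /** Record a new entry, allocating a fresh chunk only when the last one is full. */
  void append(int value) {
    int offset = size & chunkMask;
    if (offset == 0) {
      chunks.add(new int[1 << chunkShift]);
    }
    chunks.get(size >>> chunkShift)[offset] = value;
    size++;
  }

  /** Look up the value stored for this ordinal, like an array read. */
  int get(int ord) {
    if (ord < 0 || ord >= size) {
      throw new IndexOutOfBoundsException("ord=" + ord + ", size=" + size);
    }
    return chunks.get(ord >>> chunkShift)[ord & chunkMask];
  }

  /** Number of entries appended so far. */
  int size() {
    return size;
  }
}
```

With `chunkShift = 2` (chunks of 4 ints), appending five values allocates a second chunk on the fifth append and leaves the first chunk untouched, which is what would let a refresh add new ordinals without re-copying the old ones.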
   
   The block pool idea does have some disadvantages:
   1. We're preserving a mutable data structure across taxonomy refreshes. There is [precedent](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java#L172) though, with the caches in `DirectoryTaxonomyReader`.
   2. We would be slightly over-allocating, since the last buffer in the pool would usually not be completely used, but I think this is a good trade-off for the increased efficiency and simplicity.
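
To put a rough number on the over-allocation in point 2, the unused tail of the last buffer can be computed as below. This is only a sketch: `BlockSlack` and its methods are hypothetical helpers, and the block size of 8192 ints used in the note that follows is an assumption about `IntBlockPool`'s buffer size, not something taken from this thread.

```java
/** Hypothetical helper that estimates the unused tail of the last buffer. */
final class BlockSlack {
  private BlockSlack() {}

  /** Slots left unused in the last block after storing {@code entries} ints. */
  static int unusedSlots(int entries, int blockSize) {
    if (entries < 0 || blockSize <= 0) {
      throw new IllegalArgumentException("entries must be >= 0 and blockSize > 0");
    }
    int usedInLastBlock = entries % blockSize;
    // A remainder of zero means every allocated block is completely full.
    return usedInLastBlock == 0 ? 0 : blockSize - usedInLastBlock;
  }

  /** The same slack expressed in bytes (4 bytes per int). */
  static long unusedBytes(int entries, int blockSize) {
    return (long) unusedSlots(entries, blockSize) * Integer.BYTES;
  }
}
```

Under that assumption the worst case is one entry past a block boundary: `unusedSlots(8193, 8192)` is 8191 slots, or 32764 bytes, per array, regardless of how large the taxonomy is.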
   
   What do you think? Did you have something else in mind?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

