stefanvodita commented on issue #12989: URL: https://github.com/apache/lucene/issues/12989#issuecomment-1878171579
@msfroh - I was looking into this as well and had some thoughts about how to do it.

We could replace [`ParallelTaxonomyArrays`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/ParallelTaxonomyArrays.java) with a new interface that offers three operations for each of the arrays:

```
interface ChunkedParallelTaxonomyArrays {
  /* Record new entry. */
  public void appendParent(int parent);

  /* Retrieve this ordinal's parent. From the user's perspective, this is like an array look-up. */
  public int getParent(int ord);

  /* There are some places where we need to know how many parents exist in total. */
  public int sizeParents();

  // Same for children and siblings
  ...
}
```

To implement this interface, we could use an [`IntBlockPool`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/core/src/java/org/apache/lucene/util/IntBlockPool.java). We would allocate new int buffers in the block pool as needed and preserve the block pool across [`DirectoryTaxonomyReader`](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java) refreshes.

There are definitely some disadvantages with the block pool idea:

1. We're preserving a mutable data structure across taxonomy refreshes. There is [precedent](https://github.com/apache/lucene/blob/7b8aece125aabff2823626d5b939abf4747f63a7/lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java#L172) though, with the caches in `DirectoryTaxonomyReader`.
2. We would be slightly overallocating, since the last buffer in the pool is usually not completely used, but I think this is a good trade-off to take for the increased efficiency and simplicity.

What do you think? Did you have something else in mind?
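To make the chunked idea concrete, here is a minimal sketch of one such array (parents, say) as an append-only sequence of fixed-size `int[]` chunks. It deliberately does not use `IntBlockPool`; `ChunkedIntArray`, `chunkShift` and `allocated()` are hypothetical names chosen for illustration, and plain `int[]` chunks stand in for the pool's buffers.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Lucene code: an append-only int sequence stored in
 * fixed-size chunks, so growing it never copies existing entries.
 */
final class ChunkedIntArray {
  private final int chunkShift; // chunk size is 2^chunkShift ints
  private final int chunkMask;  // low bits select the slot inside a chunk
  private final List<int[]> chunks = new ArrayList<>();
  private int size;             // number of entries appended so far

  ChunkedIntArray(int chunkShift) {
    if (chunkShift < 0 || chunkShift > 30) {
      throw new IllegalArgumentException("chunkShift must be in [0, 30]: " + chunkShift);
    }
    this.chunkShift = chunkShift;
    this.chunkMask = (1 << chunkShift) - 1;
  }

  /** Record a new entry, allocating a fresh chunk only when the last one is full. */
  void append(int value) {
    int offset = size & chunkMask;
    if (offset == 0) {
      // Either no chunk exists yet or the previous chunk is exactly full.
      chunks.add(new int[1 << chunkShift]);
    }
    chunks.get(size >>> chunkShift)[offset] = value;
    size++;
  }

  /** Array-like look-up: high bits pick the chunk, low bits pick the slot. */
  int get(int ord) {
    if (ord < 0 || ord >= size) {
      throw new IndexOutOfBoundsException("ord=" + ord + ", size=" + size);
    }
    return chunks.get(ord >>> chunkShift)[ord & chunkMask];
  }

  /** Number of entries recorded so far. */
  int size() {
    return size;
  }

  /** Total slots allocated, including the unused tail of the last chunk. */
  int allocated() {
    return chunks.size() << chunkShift;
  }
}
```

With 10 entries and chunks of 4 ints, this allocates 3 chunks (12 slots), which is the small overallocation from point 2. Under this scheme a refresh would only append entries for the new ordinals instead of copying the whole array, which is also why the structure stays mutable across refreshes (point 1).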