msfroh commented on code in PR #13028:
URL: https://github.com/apache/lucene/pull/13028#discussion_r1463869892


##########
lucene/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestTaxonomyIndexArrays.java:
##########
@@ -59,4 +74,88 @@ public void testMultiplesOfChunkSize() {
       ordinal = newOrdinal;
     }
   }
+
+  public void testConstructFromEmptyIndex() throws IOException {
+    Directory dir = newDirectory();
+
+    // Produce empty index
+    new IndexWriter(dir, newIndexWriterConfig(null)).close();
+
+    IndexReader reader = DirectoryReader.open(dir);
+
+    TaxonomyIndexArrays tia = new TaxonomyIndexArrays(reader);
+    assertEquals(0, tia.parents().length());
+
+    tia = new TaxonomyIndexArrays(reader, tia);
+    assertEquals(0, tia.parents().length());
+
+    reader.close();
+    dir.close();
+  }
+
+  public void testRefresh() throws IOException {
+    Directory dir = newDirectory();
+
+    // Write two chunks worth of ordinals whose parents are ordinals 1 or 2
+    TaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(dir);
+    taxoWriter.addCategory(new FacetLabel("a")); // ordinal 1
+    taxoWriter.addCategory(new FacetLabel("b")); // ordinal 2
+    for (int i = 0; i < 2 * TaxonomyIndexArrays.CHUNK_SIZE; i++) {

Review Comment:
   If you start this with `i = 3`, you should end up with exactly two 
`CHUNK_SIZE` worth of ordinals.
   
   I believe you would hit the condition that I called out above where 
`initParents` won't get called.



##########
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/TaxonomyIndexArrays.java:
##########
@@ -80,7 +81,8 @@ public int length() {
 
   public TaxonomyIndexArrays(IndexReader reader) throws IOException {
     int[][] parentArray = allocateChunkedArray(reader.maxDoc(), 0);
-    if (parentArray.length > 0) {
+    assert parentArray.length > 0;
+    if (parentArray[parentArray.length - 1].length > 0) {

Review Comment:
   Does this condition work when we grow to a multiple of `CHUNK_SIZE`?
   
   Let's say the total size was 8190, then we add two more ordinals. Now we 
have a chunk of size 8192, followed by "tail chunk" of size 0. With this 
condition, I think we would fail to read the two new elements from `reader`.
   
   I think you might be able to verify it with `testRefresh`, where you should 
find that the `parents` arrays are always:
   
   ```
   parents[0][0] == INVALID
   parents[0][1] == 0
   parents[0][2] == 0
   parents[0][3] == 1
   parents[0][4] == 2
   parents[0][5] == 1
   parents[0][6] == 2
   ...
   parents[0][n] == 1 (if n%2 == 1)
   parents[0][n] == 2 (if n%2 == 0)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to