Jackie-Jiang commented on code in PR #10905:
URL: https://github.com/apache/pinot/pull/10905#discussion_r1237648011


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -61,8 +62,10 @@ public class MultipleTreesBuilder implements Closeable {
   private final List<StarTreeV2BuilderConfig> _builderConfigs;
   private final BuildMode _buildMode;
   private final File _segmentDirectory;
+  private final File _indexDirectory;

Review Comment:
   (minor) For convention, we usually call it `_indexDir` so that it is easier 
to map names



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -148,9 +180,54 @@ public void build()
       StarTreeIndexMapUtils
           .storeToFile(indexMaps, new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_MAP_FILE_NAME));
       FileUtils.forceDelete(starTreeIndexDir);
+      _existingStarTrees.cleanOutputDirectory();
     }
 
-    LOGGER.info("Finished building {} star-trees in {}ms", numStarTrees, 
System.currentTimeMillis() - startTime);
+    LOGGER.info("Newly built start-trees {}.\n Reused star-trees {}.", 
numStarTrees - reusedStarTrees, reusedStarTrees);
+    LOGGER.info("Finished building star-trees in {}ms", 
System.currentTimeMillis() - startTime);

Review Comment:
   (minor) Combine them into one single log line
   `Finished building {} star-trees ({} reused) in {}ms`



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -81,9 +84,10 @@ public MultipleTreesBuilder(List<StarTreeV2BuilderConfig> 
builderConfigs, File i
     _builderConfigs = builderConfigs;
     _buildMode = buildMode;
     _segmentDirectory = SegmentDirectoryPaths.findSegmentDirectory(indexDir);
+    _indexDirectory = indexDir;
     _metadataProperties =
         CommonsConfigurationUtils.fromFile(new File(_segmentDirectory, 
V1Constants.MetadataKeys.METADATA_FILE_NAME));
-    
Preconditions.checkState(!_metadataProperties.containsKey(MetadataKey.STAR_TREE_COUNT),
 "Star-tree already exists");
+    _existingStarTrees = getExistingStarTrees();

Review Comment:
   We might not want to always separate all the trees. Ideally the separator 
can track all the star-tree metadata (merge `SeparatedStarTreesMetadata` into 
the separator), and only extract the tree that can be reused into the target 
folder. Currently we always copy all the trees into separate files, even though 
they might not be reusable. This way we don't need the temp folder as well.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -148,9 +180,54 @@ public void build()
       StarTreeIndexMapUtils
           .storeToFile(indexMaps, new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_MAP_FILE_NAME));
       FileUtils.forceDelete(starTreeIndexDir);
+      _existingStarTrees.cleanOutputDirectory();
     }
 
-    LOGGER.info("Finished building {} star-trees in {}ms", numStarTrees, 
System.currentTimeMillis() - startTime);
+    LOGGER.info("Newly built start-trees {}.\n Reused star-trees {}.", 
numStarTrees - reusedStarTrees, reusedStarTrees);
+    LOGGER.info("Finished building star-trees in {}ms", 
System.currentTimeMillis() - startTime);
+  }
+
+  /**
+   * Extracts the individual star-trees from the combined index file and 
returns {@link SeparatedStarTreesMetadata}
+   * which contains the information about the output directory where the 
extracted star-trees are located and also
+   * has the builder configs of each of the star-tree.
+   */
+  private SeparatedStarTreesMetadata extractStarTrees()
+      throws Exception {
+    SeparatedStarTreesMetadata metadata = new 
SeparatedStarTreesMetadata(_segmentDirectory);
+    if (_metadataProperties.containsKey(MetadataKey.STAR_TREE_COUNT)) {
+      File indexFile = new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_FILE_NAME);
+      File indexMapFile = new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_MAP_FILE_NAME);
+      try (StarTreeIndexSeparator separator =
+          new StarTreeIndexSeparator(indexMapFile, indexFile,
+              _metadataProperties.getInt(MetadataKey.STAR_TREE_COUNT))) {
+        separator.separate(metadata.getOutputDirectory());
+        
metadata.setBuilderConfigList(separator.extractBuilderConfigs(_metadataProperties));

Review Comment:
   Consider making these fields read-only (set them in the constructor). 
`SeparatedStarTreesMetadata` should be a read-only helper class to extract 
fields from the metadata



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -148,9 +180,54 @@ public void build()
       StarTreeIndexMapUtils
           .storeToFile(indexMaps, new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_MAP_FILE_NAME));
       FileUtils.forceDelete(starTreeIndexDir);
+      _existingStarTrees.cleanOutputDirectory();

Review Comment:
   Should we move this to `close()` so that it can be cleaned up when exception 
happens?



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -115,13 +120,33 @@ public MultipleTreesBuilder(@Nullable 
List<StarTreeIndexConfig> indexConfigs, bo
     }
   }
 
+  private SeparatedStarTreesMetadata getExistingStarTrees()
+      throws Exception {
+    try {
+      if (_metadataProperties.containsKey(MetadataKey.STAR_TREE_COUNT)) {
+        // Extract existing startrees
+        // clean star-tree related files and configs once all the star-trees 
are separated and extracted
+        SeparatedStarTreesMetadata existingStarTrees = extractStarTrees();
+        StarTreeBuilderUtils.removeStarTrees(_indexDirectory);
+        _metadataProperties.refresh();
+        return existingStarTrees;
+      } else {
+        return new SeparatedStarTreesMetadata(_segmentDirectory);

Review Comment:
   Consider returning `null` here to avoid overhead (`null` basically 
represents there is no existing star-tree)



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -136,9 +161,16 @@ public void build()
       for (int i = 0; i < numStarTrees; i++) {
         StarTreeV2BuilderConfig builderConfig = _builderConfigs.get(i);
         Configuration metadataProperties = 
_metadataProperties.subset(MetadataKey.getStarTreePrefix(i));
-        try (SingleTreeBuilder singleTreeBuilder = 
getSingleTreeBuilder(builderConfig, starTreeIndexDir, _segment,
-            metadataProperties, _buildMode)) {
-          singleTreeBuilder.build();
+        if (_existingStarTrees.containsTree(builderConfig)) {

Review Comment:
   We can directly get index here to avoid one extra lookup



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/startree/v2/builder/MultipleTreesBuilder.java:
##########
@@ -148,9 +180,54 @@ public void build()
       StarTreeIndexMapUtils
           .storeToFile(indexMaps, new File(_segmentDirectory, 
StarTreeV2Constants.INDEX_MAP_FILE_NAME));
       FileUtils.forceDelete(starTreeIndexDir);
+      _existingStarTrees.cleanOutputDirectory();
     }
 
-    LOGGER.info("Finished building {} star-trees in {}ms", numStarTrees, 
System.currentTimeMillis() - startTime);
+    LOGGER.info("Newly built start-trees {}.\n Reused star-trees {}.", 
numStarTrees - reusedStarTrees, reusedStarTrees);
+    LOGGER.info("Finished building star-trees in {}ms", 
System.currentTimeMillis() - startTime);
+  }
+
+  /**
+   * Extracts the individual star-trees from the combined index file and 
returns {@link SeparatedStarTreesMetadata}
+   * which contains the information about the output directory where the 
extracted star-trees are located and also
+   * has the builder configs of each of the star-tree.
+   */
+  private SeparatedStarTreesMetadata extractStarTrees()
+      throws Exception {
+    SeparatedStarTreesMetadata metadata = new 
SeparatedStarTreesMetadata(_segmentDirectory);
+    if (_metadataProperties.containsKey(MetadataKey.STAR_TREE_COUNT)) {

Review Comment:
   (minor) This check is redundant



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to