xinyiZzz commented on code in PR #10170:
URL: https://github.com/apache/doris/pull/10170#discussion_r982558867


##########
fe/fe-core/src/main/java/org/apache/doris/analysis/TupleDescriptor.java:
##########
@@ -159,6 +164,80 @@ public void setTable(TableIf tbl) {
         table = tbl;
     }
 
+    public Set<Long> getSampleTabletIds() {
+        return sampleTabletIds;
+    }
+
+    /**
+     * First, determine how many rows to sample from each partition according to the number of partitions.
+     * Then determine the number of Tablets to be selected for each partition according to the average number
+     * of rows of Tablet,
+     * If seek is not specified, the specified number of Tablets are pseudo-randomly selected from each partition.
+     * If seek is specified, it will be selected sequentially from the seek tablet of the partition.

Review Comment:
   The default seek is 0, which selects the first n tablets.
   
   If the user wants to sample randomly several times, selecting different tablet ids each time, and wants the result of each sampling run to be reproducible, a seek is required.
   
   If the seek is the same, the selected tablets are also the same.
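
   To illustrate the behavior described above, here is a minimal sketch, not the code in this PR. The class `SeekSampleSketch`, the method `pickTablets`, and the tablet ids are hypothetical names made up for the example, and wrapping around the end of the tablet list is an assumption, not something the PR confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these names do not exist in Doris.
class SeekSampleSketch {

    // Pick up to n tablets sequentially, starting at position `seek`.
    static List<Long> pickTablets(List<Long> tabletIds, int n, long seek) {
        List<Long> picked = new ArrayList<>();
        int size = tabletIds.size();
        if (size == 0) {
            return picked;
        }
        int count = Math.min(n, size);
        // Assumption: a seek past the end wraps around the tablet list.
        int start = (int) Math.floorMod(seek, (long) size);
        for (int i = 0; i < count; i++) {
            picked.add(tabletIds.get((start + i) % size));
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Long> tablets = Arrays.asList(101L, 102L, 103L, 104L, 105L);
        // Default seek 0: the first n tablets.
        System.out.println(pickTablets(tablets, 2, 0));
        // A different seek gives a different, but reproducible, selection.
        System.out.println(pickTablets(tablets, 2, 3));
        System.out.println(pickTablets(tablets, 2, 3));
    }
}
```

   Because the selection depends only on the tablet order and the seek, repeating a call with the same seek returns the same tablets, which is what makes a sampling run reproducible.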



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

