xinyiZzz commented on code in PR #10170:
URL: https://github.com/apache/doris/pull/10170#discussion_r982558867

##########
fe/fe-core/src/main/java/org/apache/doris/analysis/TupleDescriptor.java:
##########
@@ -159,6 +164,80 @@ public void setTable(TableIf tbl) {
         table = tbl;
     }
 
+    public Set<Long> getSampleTabletIds() {
+        return sampleTabletIds;
+    }
+
+    /**
+     * First, determine how many rows to sample from each partition according to the number of partitions.
+     * Then determine the number of Tablets to be selected for each partition according to the average number
+     * of rows of Tablet,
+     * If seek is not specified, the specified number of Tablets are pseudo-randomly selected from each partition.
+     * If seek is specified, it will be selected sequentially from the seek tablet of the partition.

Review Comment:
   The default seek is 0, which selects the first n tablets.
   
   Seek is required when a user wants to sample several times, getting different tablet ids each time, and also wants each sample to be reproducible. With the same seek, the same tablets are always selected.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: commits-h...@doris.apache.org