This is an automated email from the ASF dual-hosted git repository.

dlmarion pushed a commit to branch elasticity
in repository https://gitbox.apache.org/repos/asf/accumulo.git

The following commit(s) were added to refs/heads/elasticity by this push:
     new 2bee648380 Remove todo from SplitUtils (#4484)
2bee648380 is described below

commit 2bee6483805ef2329aefb9385110289391c6266a
Author: Dave Marion <dlmar...@apache.org>
AuthorDate: Fri May 10 08:03:19 2024 -0400

    Remove todo from SplitUtils (#4484)

    I don't know that there is a reliable way to determine what
    the splits should be given the information that we have in
    the tablet metadata. The todo suggested running tests and doing
    some compactions, etc. But I think that it's really going to be
    situation dependent. Users can apply iterators to perform aggregation,
    deletes, etc. at compaction time that could influence the split
    points greatly. I think the better option here is to document
    compactions should be run first to get more a [...]
---
 .../src/main/java/org/apache/accumulo/server/split/SplitUtils.java | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/server/base/src/main/java/org/apache/accumulo/server/split/SplitUtils.java b/server/base/src/main/java/org/apache/accumulo/server/split/SplitUtils.java
index 8f64c461bc..5cf5c9edc6 100644
--- a/server/base/src/main/java/org/apache/accumulo/server/split/SplitUtils.java
+++ b/server/base/src/main/java/org/apache/accumulo/server/split/SplitUtils.java
@@ -188,13 +188,6 @@ public class SplitUtils {
   }
 
   public static int calculateDesiredSplits(long esitimatedSize, long splitThreshold) {
-    // ELASTICITY_TODO tablets used to always split into 2 tablets. Now the split operation will
-    // split into many. How does this impact a tablet with many files and the estimated sizes after
-    // split vs the old method. Need to run test where we add lots of data to a single tablet,
-    // change the split thresh, wait for splits, then look at the estimated sizes, then compact and
-    // look at the sizes after. For example if a tablet has 10M of data and the split thesh is set
-    // to 100K, what will the est sizes look like across the tablets after splitting and then after
-    // compacting?
     return (int) Math.floor((double) esitimatedSize / (double) splitThreshold);
   }
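
For reference, a minimal standalone sketch of what the retained calculation returns for the 10M / 100K case mentioned in the removed comment. The class name SplitSketch is hypothetical, and reading 10M and 100K as 1024-based byte counts is an assumption made for illustration; neither comes from the commit. The arithmetic is copied from the method in the diff, with the parameter spelled estimatedSize here.

```java
public class SplitSketch {

  // Same arithmetic as the method shown in the diff:
  // floor(estimatedSize / splitThreshold), truncated to an int.
  public static int calculateDesiredSplits(long estimatedSize, long splitThreshold) {
    return (int) Math.floor((double) estimatedSize / (double) splitThreshold);
  }

  public static void main(String[] args) {
    // Assumption: "10M" and "100K" are 1024-based byte counts.
    long tenM = 10L * 1024 * 1024;
    long hundredK = 100L * 1024;

    // 10485760 / 102400 = 102.4, floored to 102.
    System.out.println(calculateDesiredSplits(tenM, hundredK)); // prints 102

    // A tablet at half the threshold: 0.5 floored to 0.
    System.out.println(calculateDesiredSplits(hundredK / 2, hundredK)); // prints 0
  }
}
```

Under these assumptions the calculation returns 102 for a 10M tablet with a 100K threshold, compared with the older behavior described in the removed comment, which always split into 2 tablets. As the commit message notes, the sizes seen after compaction still depend on the data and on any iterators applied at compaction time, so this number is only an estimate.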