Jackie-Jiang commented on code in PR #16845:
URL: https://github.com/apache/pinot/pull/16845#discussion_r2396034481
##########
pinot-controller/src/main/java/org/apache/pinot/controller/recommender/data/generator/StringGenerator.java:
##########
@@ -51,8 +50,8 @@ public StringGenerator(Integer cardinality, Double numberOfValuesPerEntry, Integ
int initValueSize = lengthOfEachString - _counterLength;
Preconditions.checkState(initValueSize >= 0,
String.format("Cannot generate %d unique string with length %d",
_cardinality, lengthOfEachString));
- _initialValue = RandomStringUtils.randomAlphabetic(initValueSize);
- _rand = new Random(System.currentTimeMillis());
+ _rand = new Random(0L);
Review Comment:
Could you elaborate on this change? Ideally we want randomization
within the test to get more confidence.
Also, it seems this util is not used only for testing, so changing it
might have other side effects.
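One way to address both concerns is to leave the default behavior randomized and let tests opt in to a fixed seed. The following is a minimal sketch under that assumption; `SeedableStringGenerator` and its constructors are hypothetical names for illustration and are not Pinot's actual `StringGenerator` API.

```java
import java.util.Random;

/** Hypothetical sketch, not Pinot's StringGenerator. */
final class SeedableStringGenerator {
    private static final String ALPHABET =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private final Random _rand;
    private final int _length;

    /** Default path: time-based seed, so non-test callers stay randomized. */
    SeedableStringGenerator(int length) {
        this(length, System.currentTimeMillis());
    }

    /** Test path: the caller pins the seed so a run can be reproduced. */
    SeedableStringGenerator(int length, long seed) {
        _length = length;
        _rand = new Random(seed);
    }

    /** Returns the next alphabetic string of the configured length. */
    String next() {
        StringBuilder sb = new StringBuilder(_length);
        for (int i = 0; i < _length; i++) {
            sb.append(ALPHABET.charAt(_rand.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }
}
```

With this shape, a flaky test could log the seed it used and be replayed with that seed, while existing callers of the one-argument constructor would see no change in behavior.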
##########
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/creator/DictionariesTest.java:
##########
@@ -83,12 +83,12 @@ public class DictionariesTest implements PinotBuffersAfterMethodCheckRule {
private static TableConfig _tableConfig;
- @AfterClass
+ @AfterMethod
public static void cleanup() {
FileUtils.deleteQuietly(INDEX_DIR);
}
- @BeforeClass
+ @BeforeMethod
Review Comment:
Creating a new segment per test could be expensive. Any specific reason to
change this?
##########
pinot-controller/src/test/java/org/apache/pinot/controller/recommender/realtime/provisioning/MemoryEstimatorTest.java:
##########
@@ -50,7 +50,7 @@ public void testSegmentGenerator()
assertEquals(extract(metadata, "column.colFloatMV.cardinality = (\\d+)"), "250");
assertEquals(extract(metadata, "column.colString.cardinality = (\\d+)"), "300");
assertEquals(extract(metadata, "column.colStringMV.cardinality = (\\d+)"), "350");
- assertEquals(extract(metadata, "column.colBytes.cardinality = (\\d+)"), "400");
+ assertEquals(extract(metadata, "column.colBytes.cardinality = (\\d+)"), "443");
Review Comment:
This doesn't seem correct. Why is the error rate so high (more than 10%)?
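For reference, the "more than 10%" figure follows from the two asserted values in the hunk above: 443 against the previous 400 is a deviation of 43, or 10.75%. A minimal sketch of that arithmetic, where `CardinalityDeviation` is a hypothetical helper and not part of the PR:

```java
/** Hypothetical helper for illustration only. */
final class CardinalityDeviation {
    private CardinalityDeviation() {
    }

    /** Relative deviation of the observed cardinality from the previous one. */
    static double relativeDeviation(long previous, long observed) {
        return Math.abs(observed - previous) / (double) previous;
    }
}
```

Calling `CardinalityDeviation.relativeDeviation(400, 443)` yields 0.1075.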
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]