laskoviymishka commented on code in PR #16696:
URL: https://github.com/apache/iceberg/pull/16696#discussion_r3387885301


##########
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -178,7 +183,7 @@ private boolean isNull() {
 
     @Override
     public Object list(Types.ListType list, Supplier<Object> elementResult) {
-      int numElements = random.nextInt(20);
+      int numElements = random.nextInt(MAX_COLLECTION_SIZE);

Review Comment:
   Minor, since the comment already explains the multiplicative behavior: the 
range stays `[0, MAX_COLLECTION_SIZE)`, so empty collections are still possible 
and now show up at 1/10 instead of 1/20. Not a problem — empty collections are 
legal — but worth a half-line in the comment noting zero-length is in range so 
nobody reads the cap as a minimum.
   
   (Same four call sites in v4.0 and v4.1.)



##########
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -52,6 +52,11 @@ public class RandomData {
   // Default percentage of number of values that are null for optional fields
   public static final float DEFAULT_NULL_PERCENTAGE = 0.05f;
 
+  // Exclusive upper bound on the number of elements generated for each 
list/map.
+  // Applied per nesting level, so deeply-nested schemas multiply quickly; keep
+  // this small enough to avoid combinatorial blow-up in heavily-nested tests.
+  private static final int MAX_COLLECTION_SIZE = 10;

Review Comment:
   `nextInt(MAX_COLLECTION_SIZE)` is exclusive, so this actually caps 
collections at 9 elements, not 10 — the name reads like "max 10" and that's a 
trap for the next person who reaches for it. I'd either rename to something 
like `COLLECTION_SIZE_BOUND` to signal it's an exclusive bound, or bump the 
value to 11 if 10 elements is genuinely the intent. wdyt?
   
   (Same constant in v4.0 and v4.1.)



##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -182,7 +187,7 @@ private boolean isNull() {
 
     @Override
     public Object list(Types.ListType list, Supplier<Object> elementResult) {
-      int numElements = random.nextInt(20);
+      int numElements = random.nextInt(MAX_COLLECTION_SIZE);

Review Comment:
   Same change applied identically here and in v4.1 — comment text is 
byte-for-byte consistent across all three copies, which is good. No drift to 
worry about.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to