laskoviymishka commented on code in PR #16696:
URL: https://github.com/apache/iceberg/pull/16696#discussion_r3387885301
##########
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -178,7 +183,7 @@ private boolean isNull() {
@Override
public Object list(Types.ListType list, Supplier<Object> elementResult) {
- int numElements = random.nextInt(20);
+ int numElements = random.nextInt(MAX_COLLECTION_SIZE);
Review Comment:
Minor, since the comment already explains the multiplicative behavior: the
range stays `[0, MAX_COLLECTION_SIZE)`, so empty collections are still possible
and now show up at 1/10 instead of 1/20. Not a problem — empty collections are
legal — but worth a half-line in the comment noting zero-length is in range so
nobody reads the cap as a minimum.
(Same four call sites in v4.0 and v4.1.)
##########
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -52,6 +52,11 @@ public class RandomData {
// Default percentage of number of values that are null for optional fields
public static final float DEFAULT_NULL_PERCENTAGE = 0.05f;
+ // Exclusive upper bound on the number of elements generated for each
list/map.
+ // Applied per nesting level, so deeply-nested schemas multiply quickly; keep
+ // this small enough to avoid combinatorial blow-up in heavily-nested tests.
+ private static final int MAX_COLLECTION_SIZE = 10;
Review Comment:
`nextInt(MAX_COLLECTION_SIZE)` is exclusive, so this actually caps
collections at 9 elements, not 10 — the name reads like "max 10" and that's a
trap for the next person who reaches for it. I'd either rename to something
like `COLLECTION_SIZE_BOUND` to signal it's an exclusive bound, or bump the
value to 11 if 10 elements is genuinely the intent. wdyt?
(Same constant in v4.0 and v4.1.)
##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/data/RandomData.java:
##########
@@ -182,7 +187,7 @@ private boolean isNull() {
@Override
public Object list(Types.ListType list, Supplier<Object> elementResult) {
- int numElements = random.nextInt(20);
+ int numElements = random.nextInt(MAX_COLLECTION_SIZE);
Review Comment:
Same change applied identically here and in v4.1 — comment text is
byte-for-byte consistent across all three copies, which is good. No drift to
worry about.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]