aokolnychyi commented on code in PR #8660:
URL: https://github.com/apache/iceberg/pull/8660#discussion_r1338931458
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkWriteRequirements.java:
##########
@@ -20,20 +20,24 @@
import org.apache.spark.sql.connector.distributions.Distribution;
import org.apache.spark.sql.connector.distributions.Distributions;
+import org.apache.spark.sql.connector.distributions.UnspecifiedDistribution;
import org.apache.spark.sql.connector.expressions.SortOrder;
/** A set of requirements such as distribution and ordering reported to Spark during writes. */
public class SparkWriteRequirements {
public static final SparkWriteRequirements EMPTY =
-      new SparkWriteRequirements(Distributions.unspecified(), new SortOrder[0]);
+      new SparkWriteRequirements(Distributions.unspecified(), new SortOrder[0], 0);
Review Comment:
Yes, it matches the default value in `RequiresDistributionAndOrdering` and
means no preference.
```
/**
 * Returns the advisory (not guaranteed) shuffle partition size in bytes for this write.
 * <p>
 * Implementations may override this to indicate the preferable partition size in shuffles
 * performed to satisfy the requested distribution. Note that Spark doesn't support setting
 * the advisory partition size for {@link UnspecifiedDistribution}, the query will fail if
 * the advisory partition size is set but the distribution is unspecified. Data sources may
 * either request a particular number of partitions via {@link #requiredNumPartitions()} or
 * a preferred partition size, not both.
 * <p>
 * Data sources should be careful with large advisory sizes as it will impact the writing
 * parallelism and may degrade the overall job performance.
 * <p>
 * Note this value only acts like a guidance and Spark does not guarantee the actual and advisory
 * shuffle partition sizes will match. Ignored if the adaptive execution is disabled.
 *
 * @return the advisory partition size, any value less than 1 means no preference.
 */
default long advisoryPartitionSizeInBytes() { return 0; }
```
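For context, the contract quoted above can be sketched with a stripped-down stand-in for Spark's `RequiresDistributionAndOrdering` (the real interface lives in Spark's connector write API); the class and interface names below are illustrative only, not the actual Spark types:

```java
// Minimal, self-contained sketch of the default-method contract discussed
// above. This mimics Spark's RequiresDistributionAndOrdering default of 0
// ("no preference"); it is NOT the real interface.
public class AdvisorySizeSketch {

  interface RequiresDistributionAndOrderingLike {
    // Matches the quoted Javadoc: any value less than 1 means no preference.
    default long advisoryPartitionSizeInBytes() {
      return 0;
    }
  }

  // A write that keeps the default (no preference).
  static class DefaultWrite implements RequiresDistributionAndOrderingLike {}

  // A write that advises ~128 MB shuffle partitions (advisory only; Spark
  // does not guarantee the actual partition sizes will match).
  static class TunedWrite implements RequiresDistributionAndOrderingLike {
    @Override
    public long advisoryPartitionSizeInBytes() {
      return 128L * 1024 * 1024;
    }
  }

  public static void main(String[] args) {
    System.out.println(new DefaultWrite().advisoryPartitionSizeInBytes()); // 0
    System.out.println(new TunedWrite().advisoryPartitionSizeInBytes());   // 134217728
  }
}
```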
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkWriteRequirements.java:
##########
@@ -47,4 +51,8 @@ public SortOrder[] ordering() {
public boolean hasOrdering() {
return ordering.length != 0;
}
+
+ public long advisoryPartitionSize() {
+    return distribution instanceof UnspecifiedDistribution ? 0 : advisoryPartitionSize;
Review Comment:
Will add.
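The guard in the hunk above can be sketched in isolation: since Spark fails a query that sets an advisory partition size with an unspecified distribution, the getter reports 0 (no preference) in that case. The classes below are local stand-ins mimicking the Spark/Iceberg types, not the real ones:

```java
// Self-contained sketch of the advisory-size guard in SparkWriteRequirements.
// Distribution/UnspecifiedDistribution here are hypothetical stand-ins for
// the Spark connector types.
public class AdvisoryGuardSketch {

  interface Distribution {}
  static class UnspecifiedDistribution implements Distribution {}
  static class ClusteredDistribution implements Distribution {}

  static class Requirements {
    private final Distribution distribution;
    private final long advisoryPartitionSize;

    Requirements(Distribution distribution, long advisoryPartitionSize) {
      this.distribution = distribution;
      this.advisoryPartitionSize = advisoryPartitionSize;
    }

    long advisoryPartitionSize() {
      // Suppress the advisory size when the distribution is unspecified;
      // otherwise Spark would reject the query.
      return distribution instanceof UnspecifiedDistribution ? 0 : advisoryPartitionSize;
    }
  }

  public static void main(String[] args) {
    long sixtyFourMb = 64L << 20;
    System.out.println(new Requirements(new UnspecifiedDistribution(), sixtyFourMb).advisoryPartitionSize()); // 0
    System.out.println(new Requirements(new ClusteredDistribution(), sixtyFourMb).advisoryPartitionSize());   // 67108864
  }
}
```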
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]