wypoon commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1986161732
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +186,43 @@ public Statistics estimateStatistics() {
protected Statist
huaxingao commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2260861732
Thanks a lot @RussellSpitzer! Also thanks to @szehon-ho @karuppayya @findepi
@singhpk234 for helping with this PR!
--
This is an automated message from the Apache Git Service.
To re
RussellSpitzer merged PR #10659:
URL: https://github.com/apache/iceberg/pull/10659
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@ic
RussellSpitzer commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2260822697
Thanks @huaxingao and @karuppayya! This is a great addition to the Spark
capabilities.
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1697203213
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +185,45 @@ public Statistics estimateStatistics() {
protected Stat
sfc-gh-rspitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1697089697
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +185,45 @@ public Statistics estimateStatistics() {
protecte
sfc-gh-rspitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1697086825
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +185,45 @@ public Statistics estimateStatistics() {
protecte
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693697534
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws
NoSuchTabl
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693697463
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +185,46 @@ public Statistics estimateStatistics() {
protected Stat
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693627295
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws
NoSuc
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693627142
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +142,98 @@ public void testEstimatedRowCount() throws
NoSuc
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693625673
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +185,46 @@ public Statistics estimateStatistics() {
protected
huaxingao commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2253538504
@jeesou @saitharun15
Thanks for testing this out! The column stats are not yet accurate because I
still need to retrieve the numOfNulls, min, and max from the manifest files. I
wi
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693621270
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +142,61 @@ public void testEstimatedRowCount() throws
NoSuchTabl
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693621129
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected Stat
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693320217
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -734,6 +801,21 @@ private Expression[] expressions(Expression...
e
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693310824
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +142,61 @@ public void testEstimatedRowCount() throws
NoSuc
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693306852
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1693301909
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -189,9 +192,8 @@ protected Statistics estimateStatistics(Snapshot snaps
saitharun15 commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2252205197
> Hi @huaxingao, @karuppayya (cc: @RussellSpitzer). We were running some
tests on Spark with the latest code. We took the changes from the previous PR
#10288, along with this PR chang
jeesou commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2252176015
Hi @huaxingao, @karuppayya
cc: @RussellSpitzer
We were running some tests on Spark with the latest code.
We took the changes from the previous PR
https://github.com/apache/iceb
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1684997348
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkColumnStatistics.java:
##
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1684996986
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##
@@ -347,4 +347,12 @@ private boolean executorCacheLocalityEnabledInternal() {
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1684968196
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkColumnStatistics.java:
##
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Found
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1684966782
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##
@@ -347,4 +347,12 @@ private boolean executorCacheLocalityEnabledInternal() {
findepi commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683949890
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected Statis
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683659774
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683659065
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -189,9 +192,8 @@ protected Statistics estimateStatistics(Snapshot snapshot)
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683657327
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683657244
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final Stri
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683657141
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1683657052
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final Stri
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681647818
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +184,37 @@ public Statistics estimateStatistics() {
protected
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681617619
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681617271
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681613073
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final
karuppayya commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681586861
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##
@@ -90,4 +90,8 @@ private SparkSQLProperties() {}
public static final Str
karuppayya commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1681584714
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Sta
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1680162546
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1680162406
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1680162221
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1680162100
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Stat
huaxingao commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1680161964
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Stat
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678354207
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -130,6 +180,58 @@ public void testEstimatedRowCount() throws
NoSuc
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678353575
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -97,6 +117,36 @@ public static Object[][] parameters() {
};
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678346146
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678344577
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678342896
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkChangelogScan.java:
##
@@ -88,7 +89,7 @@ class SparkChangelogScan implements Scan,
Supp
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678341022
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/ColStats.java:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
RussellSpitzer commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1678331241
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##
@@ -347,4 +347,12 @@ private boolean executorCacheLocalityEnabledInternal() {
jeesou commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2229015200
> @jeesou I will be creating a PR for the procedure for the Analyze action when
#10288 is merged. Currently the spec supports only NDV. For more stats we will
need to make spec changes and
findepi commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1676177423
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Statis
karuppayya commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1676148283
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Sta
findepi commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1675581619
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Statis
karuppayya commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674793152
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Sta
szehon-ho commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674693714
##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java:
##
@@ -97,6 +117,36 @@ public static Object[][] parameters() {
};
}
singhpk234 commented on code in PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674683253
##
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java:
##
@@ -175,7 +181,25 @@ public Statistics estimateStatistics() {
protected Sta
karuppayya commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2221288014
@jeesou I will be creating a PR for the procedure for the Analyze action.
Currently the spec supports only NDV. For more stats we will need to make
spec changes and other corresponding
jeesou commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2220881056
Hi @huaxingao, @karuppayya, just to clarify a few aspects:
This PR would be a continuation of
https://github.com/apache/iceberg/pull/10288.
To the above mentioned PR we would
huaxingao commented on PR #10659:
URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2215051692
cc @szehon-ho @karuppayya
huaxingao opened a new pull request, #10659:
URL: https://github.com/apache/iceberg/pull/10659
Co-authored-by: Huaxin Gao
Co-authored-by: Karuppayya Rajendran
This PR adds column stats support, so that Iceberg can report column statistics
to the Spark engine for cost-based optimization (CBO).