Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-16 Thread via GitHub
RussellSpitzer commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2418066570 Thanks @jeesou for the PR, and @aokolnychyi, @karuppayya, @huaxingao, @guykhazma for reviewing. -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-16 Thread via GitHub
RussellSpitzer merged PR #11035: URL: https://github.com/apache/iceberg/pull/11035

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-16 Thread via GitHub
jeesou commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1803622801 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -911,9 +1027,17 @@ private void checkColStatisticsReported( assertThat

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-16 Thread via GitHub
RussellSpitzer commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1803219170 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -911,9 +1027,17 @@ private void checkColStatisticsReported( as

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-15 Thread via GitHub
jeesou commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1802355766 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,25 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapshot)

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-15 Thread via GitHub
jeesou commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1802349064 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -911,9 +1027,17 @@ private void checkColStatisticsReported( assertThat

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-15 Thread via GitHub
RussellSpitzer commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1802049314 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,25 +198,31 @@ protected Statistics estimateStatistics(Snapshot sn

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-10-15 Thread via GitHub
RussellSpitzer commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1802045566 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -911,9 +1027,17 @@ private void checkColStatisticsReported( as

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-23 Thread via GitHub
aokolnychyi commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2370250512 I'll check tomorrow. Sorry for the delay!

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-23 Thread via GitHub
jeesou commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2368935077 Hi @aokolnychyi, could you please help review this PR?

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-22 Thread via GitHub
jeesou commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1770624635 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapshot)

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-22 Thread via GitHub
jeesou commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1770621280 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapshot)

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-20 Thread via GitHub
huaxingao commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1769350856 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapsho

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-20 Thread via GitHub
huaxingao commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1769350721 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -198,27 +198,31 @@ protected Statistics estimateStatistics(Snapshot snapsho

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-14 Thread via GitHub
guykhazma commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2350971627 @karuppayya @huaxingao @szehon-ho can you please help review this?

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-09-04 Thread via GitHub
jeesou commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2330597057 Hi @karuppayya, @aokolnychyi, @huaxingao, kindly review this PR.

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-08-29 Thread via GitHub
jeesou commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2318875961 Hi, adding an enhancement to the test case: in the no-stats scenario we were also iterating over the expectedNDVs map, which was empty, so the assert was never reached, and it was
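
The pitfall described in the comment above (an assertion that lives only inside a loop over an empty map) can be illustrated with a small standalone sketch. This is not code from TestSparkScan; the class name, method names, and the boolean-returning style are hypothetical and chosen only to show why such a check passes vacuously, and one possible way to make the empty case explicit.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration; none of these names come from the Iceberg code base.
public class VacuousAssertionSketch {

  // The comparison lives only inside the loop. With an empty expected map
  // the loop body never runs, so the method reports success without
  // having compared anything.
  static boolean loopOnlyCheck(Map<String, Long> expectedNdvs, Map<String, Long> reportedNdvs) {
    for (Map.Entry<String, Long> entry : expectedNdvs.entrySet()) {
      if (!entry.getValue().equals(reportedNdvs.get(entry.getKey()))) {
        return false;
      }
    }
    return true;
  }

  // Handles the empty case explicitly: when no statistics are expected,
  // none may be reported. Otherwise it falls back to the per-column comparison.
  static boolean explicitCheck(Map<String, Long> expectedNdvs, Map<String, Long> reportedNdvs) {
    if (expectedNdvs.isEmpty()) {
      return reportedNdvs.isEmpty();
    }
    return loopOnlyCheck(expectedNdvs, reportedNdvs);
  }

  public static void main(String[] args) {
    Map<String, Long> expected = Collections.emptyMap();
    Map<String, Long> reported = Collections.singletonMap("id", 4L);

    // Passes even though a statistic was reported unexpectedly.
    System.out.println(loopOnlyCheck(expected, reported));
    // Catches the unexpected statistic.
    System.out.println(explicitCheck(expected, reported));
  }
}
```

Run as written, the first call prints `true` and the second prints `false`, which is the difference between a check that is never reached and one that covers the no-stats scenario.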

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-08-28 Thread via GitHub
aokolnychyi commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1735390156 ## core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java: ## @@ -26,4 +26,6 @@ private StandardBlobTypes() {} * href="https://datasketches.apach

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-08-28 Thread via GitHub
karuppayya commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1734965325 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -199,28 +199,24 @@ protected Statistics estimateStatistics(Snapshot snapsh

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-08-28 Thread via GitHub
guykhazma commented on code in PR #11035: URL: https://github.com/apache/iceberg/pull/11035#discussion_r1734554161 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -199,28 +199,24 @@ protected Statistics estimateStatistics(Snapshot snapsho

Re: [PR] Updating SparkScan to only read Apache DataSketches [iceberg]

2024-08-28 Thread via GitHub
jeesou commented on PR #11035: URL: https://github.com/apache/iceberg/pull/11035#issuecomment-2315071856 Hi @huaxingao, @karuppayya kindly review the PR.