Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-11-08 Thread via GitHub
github-actions[bot] closed pull request #10336: Spark: Add SparkSQLProperty to control split-size URL: https://github.com/apache/iceberg/pull/10336 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-11-08 Thread via GitHub
github-actions[bot] commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2465928603 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-11-01 Thread via GitHub
github-actions[bot] commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2452745597 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-07-12 Thread via GitHub
sumedhsakdeo commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2226308996 Thank you so much @szehon-ho for your contribution on the Spark side. OPTIONS is indeed the right way to achieve this functionality. I was wondering if UPDATE and DELETE support is als

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-07-09 Thread via GitHub
szehon-ho commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2219566750 Following up, this is now possible on the Spark side.

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-06-21 Thread via GitHub
szehon-ho commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2182038718 The problem with this is that it does lead to some ambiguity as to what table the config is applying to (many queries read from several tables, for example).
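
To illustrate the ambiguity described in the comment above, here is a minimal Python sketch of a toy model. It is not Iceberg's or Spark's actual resolution logic: the function `resolve_split_size`, the table names, and the 128 MiB fallback are all hypothetical and chosen only for illustration. It contrasts a session-wide setting, which reaches every table a query reads, with an option scoped to a single table.

```python
from typing import Dict, Optional

# Hypothetical toy model: this only illustrates the scoping question and is
# NOT how Iceberg or Spark actually resolve read configuration.
ASSUMED_DEFAULT_SPLIT_SIZE = 128 * 1024 * 1024  # assumed fallback, in bytes


def resolve_split_size(
    table: str,
    session_split_size: Optional[int],
    per_table_options: Dict[str, Dict[str, int]],
) -> int:
    """Return the split size a scan of `table` would use in this toy model."""
    table_opts = per_table_options.get(table, {})
    if "split-size" in table_opts:
        # An option attached to one table affects only that table.
        return table_opts["split-size"]
    if session_split_size is not None:
        # A session-wide setting applies to every table in the query.
        return session_split_size
    return ASSUMED_DEFAULT_SPLIT_SIZE


# A query that reads two (hypothetical) tables, e.g. a join.
tables = ["db.events", "db.users"]
sixteen_mib = 16 * 1024 * 1024

# Session-wide setting: both tables are affected, even if only one was intended.
session_only = {t: resolve_split_size(t, sixteen_mib, {}) for t in tables}

# Per-table option: only db.events is affected; db.users keeps the fallback.
scoped = {
    t: resolve_split_size(t, None, {"db.events": {"split-size": sixteen_mib}})
    for t in tables
}

print(session_only)
print(scoped)
```

In this sketch the session-wide value silently changes the scan of `db.users` as well, which is the kind of ambiguity that per-table options avoid.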

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-06-05 Thread via GitHub
sumedhsakdeo commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2151411434 Thanks @szehon-ho, I appreciate your PR (https://github.com/apache/spark/pull/46707). Could you suggest a way forward for this PR? Will your support for options in Spark SQL b

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-05-22 Thread via GitHub
szehon-ho commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2125733798 Yeah, it's really something that would be great to fix in Spark. I hacked together another attempt (https://github.com/apache/spark/pull/46707) based on the last comment in https://github

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-05-16 Thread via GitHub
sumedhsakdeo commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2116335906 Thanks, Shardul, for taking a look. Appreciate your review, Anton and Amogh. Also adding @wmoustafa!

[PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-05-15 Thread via GitHub
sumedhsakdeo opened a new pull request, #10336: URL: https://github.com/apache/iceberg/pull/10336 We have a scheduled job that deletes rows in an Iceberg table. The job is authored in SQL. Given that we use the copy-on-write (CoW) technique for data deletion, the job rewrites the files without the deleted rows.