Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-12-26 Thread via GitHub
atifiu commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1869488679 @paulpaul1076 It would be great if you could add an explanation of what you understood about this, as it might benefit others too. -- This is an automated message from

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-09 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1752528574 Thanks @RussellSpitzer, you helped me with this in Slack. I understand it now. I think the docs should add some extra explanation about this, though.

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1751714232 https://iceberg.apache.org/docs/latest/spark-writes/#writing-distribution-modes

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer closed issue #8729: write.target-file-size-bytes isn't respected when writing data URL: https://github.com/apache/iceberg/issues/8729

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-07 Thread via GitHub
RussellSpitzer commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1751713732 Everything Amogh said is correct: the write target file size is the maximum a writer will produce, not the minimum. The amount of data written to a file depends on the amount of data
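
The point that the target file size is a maximum rather than a minimum can be illustrated with a toy model. The sketch below is not Iceberg or Spark code: the function name `simulate_file_sizes` and the task sizes are hypothetical, and it assumes each task writes only its own data, rolls over to a new file once the target is reached, and that sizes are already the on-disk (compressed) sizes. Partition boundaries are ignored.

```python
def simulate_file_sizes(task_sizes_mb, target_file_size_mb):
    """Toy model: return the sizes (in MB) of the files written by each task.

    Assumption: a task never writes more than its own data, and a writer
    starts a new file only once the current one reaches the target size.
    """
    files = []
    for task_size in task_sizes_mb:
        remaining = task_size
        while remaining > 0:
            # The target caps a file's size; it cannot make a file larger
            # than the data the task actually holds.
            size = min(remaining, target_file_size_mb)
            files.append(size)
            remaining -= size
    return files


# Three hypothetical tasks holding 100 MB each.
tasks = [100, 100, 100]
print(simulate_file_sizes(tasks, 512))    # [100, 100, 100]
print(simulate_file_sizes(tasks, 5120))   # [100, 100, 100]  (10x target, same files)
print(simulate_file_sizes([1200], 512))   # [512, 512, 176]  (one large task rolls over)
```

Under these assumptions, raising the target tenfold changes nothing while each task holds only about 100 MB, which matches the behaviour reported in this thread; files only grow toward the target when the tasks themselves carry more data.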

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750288055 I increased spark.sql.adaptive.advisoryPartitionSizeInBytes and the files are still around 100MB in size.

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750200744 @amogh-jahagirdar the value of spark.sql.adaptive.advisoryPartitionSizeInBytes is the default 64MB. 1) Do I understand it correctly that the size of the uncompressed data

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
gzagarwal commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750189687 I also had the same problem. I added a couple of Spark properties, and the file size then increased from around 50 MB to around 250 MB. As Amogh pointed out about the property "spark.sql.ada

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750178671 Also, what's your configured value for `spark.sql.adaptive.advisoryPartitionSizeInBytes`? That will also influence the Spark task size in your case (by default, the

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
amogh-jahagirdar commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750162049 > So this setting is not really "file size", it's more like "task size"? I would not say that. The docs I linked earlier put it concisely when they say "When writing dat

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-06 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1750131250 I just tried increasing the value of this setting tenfold (to 5368709120), and the file sizes are still around 100MB.