SHuixo commented on issue #6104:
URL: https://github.com/apache/iceberg/issues/6104#issuecomment-1309843130

   Thank you @luoyuxia for your reply.
   
   Later, I tried again a few times and found that when the accumulated volume 
of small Iceberg data files was relatively small, the compaction job on Flink 
1.13.5 ran normally and could generate snapshot files.
   
   However, when a large volume of files has accumulated, each rewrite of the 
data takes a long time and easily triggers **OOM** exceptions. Below are my 
attempts with **Flink 1.13.5 / 1.15.2, iceberg 1.14.1** and the logs generated 
by the task.
   
   > The following figure shows that the compaction task stays in the Map 
stage:
   <img width="872" alt="dag-13-1" 
src="https://user-images.githubusercontent.com/20868410/201018299-e64b3a02-3ff2-4d49-b1cc-e7bdf703f3aa.PNG">
   
   > OOM exception information produced while the compaction task runs:
   
   **flink 1.13.5:**
   
[error-flink-1.13.5.log](https://github.com/apache/iceberg/files/9978008/error-flink-1.13.5.log)
   
   
   **flink 1.15.2:**
   
[error-flink-1.15.2.log](https://github.com/apache/iceberg/files/9978009/error-flink-1.15.2.log)
   
   
   Here I want to ask: if data is continuously written to Iceberg, is the 
compaction OOM problem inevitable, and will the compaction time keep getting 
longer and longer?
   
   I see that the source code has the API methods **appendsBetween() / 
appendsAfter()**, which look related to incremental compaction. Does this mean 
that incremental compaction could replace the repeated compaction of the full 
data set in the future?
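
   To make the question concrete, here is a minimal sketch of what I mean by 
incremental compaction. It is a toy model in plain Python, not Iceberg's API: 
`DataFile`, `Snapshot`, `files_appended_after`, `compaction_candidates`, the 
sample history, and the 128 MB threshold are all hypothetical and chosen only 
for illustration. The idea is that a job which remembers the last compacted 
snapshot only has to look at files appended after it, instead of every small 
file in the table.

```python
from dataclasses import dataclass
from typing import List, Tuple

MB = 1024 * 1024
SMALL_FILE_BYTES = 128 * MB  # assumed threshold for "small", not an Iceberg default


@dataclass(frozen=True)
class DataFile:
    path: str
    size_bytes: int


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    appended: Tuple[DataFile, ...]  # files added by this (append-only) commit


def files_appended_after(history: List[Snapshot], from_snapshot_id: int) -> List[DataFile]:
    """Files added by snapshots committed after from_snapshot_id (exclusive).

    history must be ordered oldest first.
    """
    found = False
    result: List[DataFile] = []
    for snap in history:
        if found:
            result.extend(snap.appended)
        elif snap.snapshot_id == from_snapshot_id:
            found = True
    if not found:
        raise ValueError(f"unknown snapshot id: {from_snapshot_id}")
    return result


def compaction_candidates(files: List[DataFile], threshold: int = SMALL_FILE_BYTES) -> List[DataFile]:
    """Keep only the files small enough to be worth rewriting."""
    return [f for f in files if f.size_bytes < threshold]


# Hypothetical table history: three append commits, oldest first.
history = [
    Snapshot(1, (DataFile("a.parquet", 10 * MB), DataFile("b.parquet", 200 * MB))),
    Snapshot(2, (DataFile("c.parquet", 5 * MB),)),
    Snapshot(3, (DataFile("d.parquet", 300 * MB), DataFile("e.parquet", 1 * MB))),
]

# Full compaction considers every small file ever written.
all_files = [f for snap in history for f in snap.appended]
full = compaction_candidates(all_files)

# Incremental compaction, assuming snapshot 1 was already compacted,
# considers only the small files appended by snapshots 2 and 3.
incremental = compaction_candidates(files_appended_after(history, 1))
```

   In this model the amount of work per run depends on how much was written 
since the last run rather than on the total table size, which is the behaviour 
I am hoping for.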
   
   thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

