github-actions[bot] commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2623222418
This pull request has been closed due to lack of activity. This is not a
judgement on the merit of the PR in any way. It is just a way of keeping the PR
queue manageable. If y
github-actions[bot] closed pull request #5837: API,Core: Introduce metrics for
data files by file format
URL: https://github.com/apache/iceberg/pull/5837
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go
github-actions[bot] commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2594191590
This pull request has been marked as stale due to 30 days of inactivity. It
will be closed in 1 week if no further activity occurs. If you think that’s
incorrect or this pull
Fokko commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2441220940
Hey @gaborkaszab sorry for not replying earlier, I was out on parental leave.
> E.g. Streaming ingest into AVRO for faster writes and then compact into
Parquet for faster reads.
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2441133791
@Fokko About the justification of this PR I recently found another use case
that could use this: Streaming ingestion using a different file format than the
compaction. E.g. Streaming
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2435424210
Hi @Fokko , @findepi ,
Is there anything I can do to make progress on this PR? The motivation is
clear, there is a need for this in a query engine, I think I could also address
the
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2363706681
Thanks for taking a look @findepi , @Fokko!
So far I don't see any reason why this can't be merged. Not as it is now but
probably reverting to the initial version that didn't ha
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2317080098
Hey @Fokko,
Thanks for your response and thanks for the explanation!
I might be missing some pieces of information here, but I checked the snapshot
summary in the metadata.jsons and
Fokko commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2315744223
First of all, sorry for not jumping into this earlier.
> 1) extra metrics never hurt
This is unfortunately not true. The metadata JSON grows quite big in bytes
very easily, and
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2312634994
> most queries operate on freshmost data, so they will see Parquet files
In general this is true but we still see users ending up with tables with mixed
file formats and having queri
findepi commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2307633354
> for instance with Hive that used ORC format and with Impala that wrote
Parquet files.
that is likely addressed by preferred file format being a table-level
configuration?
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2304756917
Thanks for taking a look, @findepi !
I've seen users doing this. One of the motivations is that they gradually
move away from one file format into another. What I've seen is that Imp
findepi commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2304699180
> new metrics for the number of data files broken down by file format.
how common is it to have tables with mixed file formats?
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2301614609
Hey,
I saw that the stale label was added to this PR due to inactivity. I removed it
since I still have the intention to merge this, however I find it pretty
difficult to get someone w
github-actions[bot] commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2295450317
This pull request has been marked as stale due to 30 days of inactivity. It
will be closed in 1 week if no further activity occurs. If you think that’s
incorrect or this pull
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2217415364
Let me involve an even wider set of committers here since this has been open
for a while now. Hopefully someone has some spare time to get this going
again. Any reviews are appreciat
gaborkaszab commented on PR #5837:
URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2164746423
Hey @nastra , @rdblue , @danielcweeks , @jbonofre ,
It's been a while since I worked on this PR but it got on my radar again
now. Would it be possible for any of you to take a look?