shardulm94 commented on PR #6327: URL: https://github.com/apache/iceberg/pull/6327#issuecomment-1340022074
> @shardulm94 do we know why dropping those constant fields is required in the read path? I see that this was introduced by #1191 but I'm lacking some historical context and would appreciate your input here.

Here we are constructing the "readSchema", i.e. the schema that is passed to the underlying file reader (e.g. ORC) when reading the data file. Iceberg already has the value of each constant column available per file, so reading those columns from the file is unnecessary. Dropping them avoids the read and deserialization cost for the constant columns, and also avoids the memory footprint of holding their column vectors in memory.
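To make the idea concrete, here is a rough sketch in plain Java. It is not the actual Iceberg code: `ReadSchemaSketch`, `Field`, `dropConstantFields`, and `assembleRow` are hypothetical names made up for illustration, and the schema is simplified to a flat list of top-level fields keyed by field ID. It assumes the per-file constants are available as a map from field ID to value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT Iceberg's Schema or field classes.
final class ReadSchemaSketch {
    /** A top-level column, identified by a field ID and a name. */
    record Field(int id, String name) {}

    /**
     * Builds the schema handed to the file reader: every expected field
     * except those whose value is already known as a per-file constant.
     */
    static List<Field> dropConstantFields(List<Field> expected, Map<Integer, Object> constantsById) {
        List<Field> readSchema = new ArrayList<>();
        for (Field field : expected) {
            if (!constantsById.containsKey(field.id())) {
                readSchema.add(field); // must come from the data file
            }
        }
        return readSchema;
    }

    /**
     * Rebuilds a full row in expected-schema order, taking constant columns
     * from the constants map and all other columns from the values read from the file.
     */
    static Map<String, Object> assembleRow(
            List<Field> expected,
            Map<Integer, Object> constantsById,
            Map<Integer, Object> valuesFromFile) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field field : expected) {
            Object value = constantsById.containsKey(field.id())
                    ? constantsById.get(field.id())
                    : valuesFromFile.get(field.id());
            row.put(field.name(), value);
        }
        return row;
    }
}
```

With an expected schema of `id`, `payload`, and `event_date`, and a constant supplied for `event_date`, `dropConstantFields` returns only `id` and `payload`, so the file reader never materializes `event_date`; `assembleRow` then fills that column back in from the constants map.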