amogh-jahagirdar commented on code in PR #13555:
URL: https://github.com/apache/iceberg/pull/13555#discussion_r2220529650
##########
spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java:
##########
@@ -163,6 +165,14 @@ private Spark3Util.CatalogAndIdentifier catalogAndIdentifier(CaseInsensitiveStri
       selector = TAG_PREFIX + tag;
     }

+    String groupId =
+        options.getOrDefault(
+            SparkReadOptions.SCAN_TASK_SET_ID,
+            options.get(SparkWriteOptions.REWRITTEN_FILE_SCAN_TASK_SET_ID));
+    if (groupId != null) {
+      selector = REWRITE_PREFIX + groupId.replace("-", "");

Review Comment:
   I'm not sure about a 120-character limit (I did some investigation and couldn't find anything related), but even then I think we're still safe for the rewrite case: the identifier is the groupId UUID, followed by "#" and the selector, which is another UUID. That UUID gets mapped to the actual table reference in SparkTableCache during the compaction job. Combined, this is 81 characters.

   Actually, saying this out loud made me realize there's really no value in adding the same UUID again for the rewrite case, so we can simplify this a bit :) We should probably just make the selector the literal word "rewrite". Then we have a UUID + "#rewrite", which is just 45 bytes.
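   A minimal sketch of what that simplification might look like (the surrounding fragment mirrors the diff above; replacing `REWRITE_PREFIX + groupId` with a fixed `"rewrite"` token is the suggestion here, not code that exists in the PR):

   ```java
   // Sketch only: option names are copied from the diff; the fixed "rewrite"
   // selector is the proposed change, not the current implementation.
   String groupId =
       options.getOrDefault(
           SparkReadOptions.SCAN_TASK_SET_ID,
           options.get(SparkWriteOptions.REWRITTEN_FILE_SCAN_TASK_SET_ID));
   if (groupId != null) {
     // The groupId UUID is already the identifier that SparkTableCache maps back
     // to the table during the compaction job, so repeating it in the selector
     // adds nothing. The resulting path is "<groupId UUID>#rewrite" (~45 bytes).
     selector = "rewrite";
   }
   ```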