wooyeong opened a new pull request, #9455:
URL: https://github.com/apache/iceberg/pull/9455

   Issue: #9450 
   
   I've changed `SparkTable` to compare the table name and the effective snapshot ID when checking equality.
   
   With the previous code I mentioned in #9450,
   ```diff
   -    return icebergTable.name().equals(that.icebergTable.name());
   +    return icebergTable.name().equals(that.icebergTable.name())
   +            && Objects.equals(branch, that.branch)
   +            && Objects.equals(snapshotId, that.snapshotId);
   ```
   
   two refs that resolve to the same effective snapshot ID still compare as unequal, so the plan is not optimized, as @ajantha-bhat noted:
   ```sql
   SELECT * FROM iceberg_except_test
   UNION
   SELECT * FROM iceberg_except_test
   VERSION AS OF '2024-01-01';
   
   == Parsed Logical Plan ==
   'Distinct
   +- 'Union false, false
      :- 'Project [*]
      :  +- 'UnresolvedRelation [iceberg_except_test], [], false
      +- 'Project [*]
         +- 'RelationTimeTravel 'UnresolvedRelation [iceberg_except_test], [], false, 2024-01-01
   
   == Analyzed Logical Plan ==
   id: string, a: string, b: timestamp
   Distinct
   +- Union false, false
      :- Project [id#30, a#31, b#32]
      :  +- SubqueryAlias local.iceberg_except_test
      :     +- RelationV2[id#30, a#31, b#32] local.iceberg_except_test local.iceberg_except_test
      +- Project [id#33, a#34, b#35]
         +- SubqueryAlias local.iceberg_except_test
            +- RelationV2[id#33, a#34, b#35] local.iceberg_except_test local.iceberg_except_test
   
   == Optimized Logical Plan ==
   Aggregate [id#30, a#31, b#32], [id#30, a#31, b#32]
   +- Union false, false
      :- RelationV2[id#30, a#31, b#32] local.iceberg_except_test
      +- RelationV2[id#33, a#34, b#35] local.iceberg_except_test
   ```
   
   With this patch, the same query is optimized as below:
   ```sql
   == Parsed Logical Plan ==
   'Distinct
   +- 'Union false, false
      :- 'Project [*]
      :  +- 'UnresolvedRelation [iceberg_except_test], [], false
      +- 'Project [*]
         +- 'RelationTimeTravel 'UnresolvedRelation [iceberg_except_test], [], false, 2024-01-01
   
   == Analyzed Logical Plan ==
   id: string, a: string, b: timestamp
   Distinct
   +- Union false, false
      :- Project [id#27, a#28, b#29]
      :  +- SubqueryAlias local.iceberg_except_test
      :     +- RelationV2[id#27, a#28, b#29] local.iceberg_except_test local.iceberg_except_test
      +- Project [id#30, a#31, b#32]
         +- SubqueryAlias local.iceberg_except_test
            +- RelationV2[id#30, a#31, b#32] local.iceberg_except_test local.iceberg_except_test
   
   == Optimized Logical Plan ==
   Aggregate [id#27, a#28, b#29], [id#27, a#28, b#29]
   +- RelationV2[id#27, a#28, b#29] local.iceberg_except_test
   ```
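
For readers who want the equality idea in isolation, here is a minimal standalone sketch. It is not the actual `SparkTable` code: the class `TableRef`, its fields, the `branchHeads` lookup, and the resolution order in `effectiveSnapshotId()` are all hypothetical stand-ins chosen for illustration. It only shows how comparing on name plus effective snapshot ID differs from comparing the raw `branch` and `snapshotId` fields.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a table handle; this is NOT Iceberg's SparkTable.
final class TableRef {
  private final String name;
  private final Long snapshotId;               // explicit time-travel snapshot, or null
  private final String branch;                 // branch name, or null
  private final Map<String, Long> branchHeads; // assumed lookup: branch -> head snapshot ID
  private final Long currentSnapshotId;        // assumed current snapshot of the table

  TableRef(String name, Long snapshotId, String branch,
           Map<String, Long> branchHeads, Long currentSnapshotId) {
    this.name = Objects.requireNonNull(name, "name");
    this.snapshotId = snapshotId;
    this.branch = branch;
    this.branchHeads = Objects.requireNonNull(branchHeads, "branchHeads");
    this.currentSnapshotId = currentSnapshotId;
  }

  // Resolve whichever reference was given to the snapshot that would be read.
  // The precedence (snapshot ID, then branch, then current) is an assumption.
  Long effectiveSnapshotId() {
    if (snapshotId != null) {
      return snapshotId;
    }
    if (branch != null) {
      return branchHeads.get(branch);
    }
    return currentSnapshotId;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof TableRef)) {
      return false;
    }
    TableRef that = (TableRef) other;
    // Compare the resolved snapshot, not the raw branch/snapshotId fields.
    return name.equals(that.name)
        && Objects.equals(effectiveSnapshotId(), that.effectiveSnapshotId());
  }

  @Override
  public int hashCode() {
    // Must use the same components as equals().
    return Objects.hash(name, effectiveSnapshotId());
  }
}
```

In this sketch, a handle on the current snapshot and a time-travel handle that resolves to that same snapshot compare equal, while a handle pinned to a different snapshot does not.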


