szehon-ho commented on issue #6670: URL: https://github.com/apache/iceberg/issues/6670#issuecomment-1405863016
After a long debugging session with @RussellSpitzer and @dramaticlly, we think hashCode() is only the first part of the problem. The rest of the problem is that equals() is also broken for every key, so hashCode() was the only thing the map could rely on.

*Analysis*

File entries are read from ManifestReader, which uses AvroIterable with reuseContainer. This means that as you advance the iterator, previously returned elements are overwritten with the current one. Ref: https://github.com/apache/iceberg/blob/1.0.x/core/src/main/java/org/apache/iceberg/ManifestReader.java#L227

So the keys in the StructLikeMap's map all become equal to each other! I.e. the map holds (partition=2022728, value = blah), (partition=2022728, value = blah2), (partition=2022728, value = blah3), even though the entries all started with different partition values.

So in fact, equals() has always been broken for keys in this map. The only thing keeping the map semi-working was hashCode(), because a Java hash map compares hash codes before it checks object equality. And in this case, the hash codes finally collided.
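To make the failure mode concrete, here is a minimal standalone sketch of the same effect using a plain `java.util.HashMap`. It is not Iceberg code: `ReusedKeyDemo`, `ReusedKey`, `countWithReuse`, and `countWithCopies` are hypothetical names made up for illustration, and the reused container is only a stand-in for what a reusing iterator hands back. It relies on OpenJDK's `HashMap` storing each entry's hash at insert time, and on the well-known fact that the strings "Aa" and "BB" have the same hash code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReusedKeyDemo {
  // Hypothetical mutable key, standing in for a partition struct that a reader reuses.
  static final class ReusedKey {
    String partition;

    ReusedKey(String partition) {
      this.partition = partition;
    }

    @Override
    public boolean equals(Object other) {
      return other instanceof ReusedKey && ((ReusedKey) other).partition.equals(partition);
    }

    @Override
    public int hashCode() {
      return partition.hashCode();
    }
  }

  // Mimics a reusing iterator: one container object is overwritten for every element.
  static Map<ReusedKey, Integer> countWithReuse(List<String> partitions) {
    Map<ReusedKey, Integer> counts = new HashMap<>();
    ReusedKey container = new ReusedKey("");
    for (String partition : partitions) {
      // Every key already in the map is this same object, so they all change too.
      container.partition = partition;
      // Entries are now told apart only by the hash recorded when they were inserted.
      counts.merge(container, 1, Integer::sum);
    }
    return counts;
  }

  // Copies the key before inserting it, so each entry keeps its own partition value.
  static Map<ReusedKey, Integer> countWithCopies(List<String> partitions) {
    Map<ReusedKey, Integer> counts = new HashMap<>();
    for (String partition : partitions) {
      counts.merge(new ReusedKey(partition), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Distinct hash codes: grouping still "works" even though all keys are one object.
    System.out.println(countWithReuse(Arrays.asList("a", "b", "a", "c")).size()); // 3
    // "Aa" and "BB" share a hash code, so two partitions collapse into one entry.
    System.out.println(countWithReuse(Arrays.asList("Aa", "BB")).size()); // 1
    // With copied keys, equals() can tell the colliding partitions apart.
    System.out.println(countWithCopies(Arrays.asList("Aa", "BB")).size()); // 2
  }
}
```

In this sketch, copying the key before it goes into the map is what restores correct behavior. Whether that is the right fix for the StructLikeMap case is a separate question.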