jackye1995 commented on code in PR #6655:
URL: https://github.com/apache/iceberg/pull/6655#discussion_r1085900059
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##########
@@ -67,11 +69,15 @@ public boolean caseSensitive() {
}
public boolean localityEnabled() {
- if (table.io() instanceof HadoopFileIO) {
- HadoopInputFile file = (HadoopInputFile) table.io().newInputFile(table.location());
- String scheme = file.getFileSystem().getScheme();
- boolean defaultValue = LOCALITY_WHITELIST_FS.contains(scheme);
- return PropertyUtil.propertyAsBoolean(readOptions, SparkReadOptions.LOCALITY, defaultValue);
+ if (table.io() instanceof HadoopFileIO || table.io() instanceof ResolvingFileIO) {
Review Comment:
I am a bit concerned about what this approach means for people using
`ResolvingFileIO`. Previously we only created an input file to check locality
when the table used `HadoopFileIO`. Now, if the user is using `ResolvingFileIO`,
this operation will be done for every single file, even if the FileIO resolved
for that specific location is not a `HadoopFileIO`.
I am wondering if we should make `ResolvingFileIO.implFromLocation` a public
method, so we can tell which FileIO is used for a location without needing to
open an input file.
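To make the suggestion concrete, here is a rough sketch of what I have in mind.
It is not the actual Iceberg code: `LocalitySketch`, `schemeOf`, and
`localityDefault` are hypothetical names, and the scheme-to-implementation map
and the `hdfs` whitelist are assumptions standing in for whatever
`ResolvingFileIO` and `SparkReadConf` really contain. The point is only that the
implementation can be resolved from the location string, so no input file has to
be created when the location would not resolve to `HadoopFileIO`.

```java
import java.util.Map;
import java.util.Set;

public class LocalitySketch {
  // Assumed stand-ins for the mapping kept inside ResolvingFileIO.
  public static final String HADOOP_IMPL = "org.apache.iceberg.hadoop.HadoopFileIO";
  public static final String S3_IMPL = "org.apache.iceberg.aws.s3.S3FileIO";
  private static final Map<String, String> SCHEME_TO_IMPL =
      Map.of("s3", S3_IMPL, "s3a", S3_IMPL, "s3n", S3_IMPL);

  // Assumed stand-in for SparkReadConf.LOCALITY_WHITELIST_FS.
  private static final Set<String> LOCALITY_WHITELIST_FS = Set.of("hdfs");

  // Returns the scheme of a location such as "hdfs://nn/path", or null if it has none.
  public static String schemeOf(String location) {
    int idx = location.indexOf("://");
    return idx > 0 ? location.substring(0, idx) : null;
  }

  // Hypothetical public counterpart of implFromLocation: pure string lookup, no I/O.
  public static String implFromLocation(String location) {
    String scheme = schemeOf(location);
    if (scheme == null) {
      return HADOOP_IMPL; // fall back to the Hadoop implementation
    }
    return SCHEME_TO_IMPL.getOrDefault(scheme, HADOOP_IMPL);
  }

  // Default for the locality option, decided without opening an input file.
  public static boolean localityDefault(String location) {
    if (!HADOOP_IMPL.equals(implFromLocation(location))) {
      return false; // not served by HadoopFileIO, so skip the locality check
    }
    // Simplification: the real code asks the Hadoop FileSystem for its scheme,
    // which also covers scheme-less paths on the default filesystem.
    String scheme = schemeOf(location);
    return scheme != null && LOCALITY_WHITELIST_FS.contains(scheme);
  }

  public static void main(String[] args) {
    for (String location : new String[] {"hdfs://nn:8020/warehouse/t", "s3://bucket/t"}) {
      System.out.println(location + " -> " + implFromLocation(location)
          + ", locality default = " + localityDefault(location));
    }
  }
}
```

In `SparkReadConf.localityEnabled()` the result would still go through
`PropertyUtil.propertyAsBoolean(readOptions, SparkReadOptions.LOCALITY, defaultValue)`
as it does today; only the way `defaultValue` is computed would change.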
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]