[
https://issues.apache.org/jira/browse/HADOOP-16258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827615#comment-16827615
]
Masatake Iwasaki commented on HADOOP-16258:
-------------------------------------------
After HDFS-13176, WebHdfsFileSystem#toUrl encodes every path element with
URLEncoder#encode and creates a Path from the encoded string.
Path#initialize then calls the multi-argument constructor of
[java.net.URI|https://docs.oracle.com/javase/8/docs/api/java/net/URI.html],
which itself encodes characters such as ' ' and '%'. This is why "dt=1" ends up
doubly encoded as "dt%253D1".
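The two encoding steps can be reproduced outside Hadoop with plain JDK calls. The following is a minimal sketch, not the actual WebHdfsFileSystem or Path code: the class name {{DoubleEncodingDemo}} and the helper {{encodeTwice}} are hypothetical, and the five-argument URI constructor stands in for whatever multi-argument constructor Path#initialize uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class; it only mimics the two encoding steps described above.
class DoubleEncodingDemo {

  // Step 1: URLEncoder on the path element. Step 2: multi-argument URI constructor.
  static String encodeTwice(String element) {
    try {
      // "dt=1" becomes "dt%3D1" here.
      String once = URLEncoder.encode(element, "UTF-8");
      // The multi-argument constructor always quotes '%', so "%3D" becomes "%253D".
      URI uri = new URI("webhdfs", "hadoop-master:50070",
          "/user/hive/warehouse/test_part/" + once, null, null);
      return uri.getRawPath();
    } catch (UnsupportedEncodingException | URISyntaxException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // prints /user/hive/warehouse/test_part/dt%253D1
    System.out.println(encodeTwice("dt=1"));
  }
}
```

Decoding the result once gives "dt%3D1" rather than "dt=1", which matches the file name in the FileNotFoundException reported in the issue.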
HDFS-13582 is a follow-up that tries to apply URLEncoder only to the relevant
path elements, but I think the code does not work as intended. Since
{{pathAlreadyEncoded}} in the code below is always true, every path element is
still encoded as before.
{noformat}
    try {
      fspathUriDecoded = URLDecoder.decode(fspathUri.getPath(), "UTF-8");
      pathAlreadyEncoded = true;
    } catch (IllegalArgumentException ex) {
      LOG.trace("Cannot decode URL encoded file", ex);
    }
...
      if (fsPathItem.matches(SPECIAL_FILENAME_CHARACTERS_REGEX) ||
          pathAlreadyEncoded) {
        fsPathEncodedItems.append(URLEncoder.encode(fsPathItem, "UTF-8"));
      } else {
        fsPathEncodedItems.append(fsPathItem);
      }
{noformat}
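To illustrate why the flag ends up true for ordinary paths: URLDecoder#decode throws IllegalArgumentException only for a malformed '%' escape, so a path that was never encoded still decodes without error. This is a small sketch using only the JDK; the class {{DecodeFlagDemo}} and the helper {{decodesWithoutError}} are hypothetical names that mirror the try/catch above, not Hadoop code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class mirroring the try/catch that sets pathAlreadyEncoded.
class DecodeFlagDemo {

  // Returns true when URLDecoder.decode succeeds, i.e. when the flag would be set.
  static boolean decodesWithoutError(String path) {
    try {
      URLDecoder.decode(path, "UTF-8");
      return true;
    } catch (IllegalArgumentException ex) {
      // Only reached for a malformed '%' escape.
      return false;
    } catch (UnsupportedEncodingException ex) {
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    // An unencoded path decodes fine, so it is treated as "already encoded".
    System.out.println(decodesWithoutError("/user/hive/warehouse/test_part/dt=1")); // true
    System.out.println(decodesWithoutError("/tmp/plain file"));                     // true
    // An incomplete trailing escape is one of the few inputs that throws.
    System.out.println(decodesWithoutError("/tmp/100%"));                           // false
  }
}
```

A successful decode therefore does not show that the input was encoded, which is why the check cannot be used to skip encoding for plain path elements.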
> FileSystem.listLocatedStatus for path including '=' encodes it and returns
> FileNotFoundException
> ------------------------------------------------------------------------------------------------
>
> Key: HADOOP-16258
> URL: https://issues.apache.org/jira/browse/HADOOP-16258
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 3.2.0
> Reporter: Yuya Ebihara
> Assignee: Masatake Iwasaki
> Priority: Minor
> Labels: webhdfs
> Attachments: HADOOP-16258.001.patch
>
>
> Recently, we upgraded the Hadoop library from 2.7.7 to 3.2.0, and this issue
> started after the upgrade. When we call FileSystem.listLocatedStatus with the
> location 'webhdfs://hadoop-master:50070/user/hive/warehouse/test_part/dt=1',
> the internal calls are:
> * 2.7.7
> [http://hadoop-master:50070/webhdfs/v1/user/hive/warehouse/test_part/dt=1?op=LISTSTATUS&user.name=xxx]
> * 3.2.0
> [http://hadoop-master:50070/webhdfs/v1/user/hive/warehouse/test_part/dt%253D1?op=LISTSTATUS&user.name=xxx]
> As a result, the server returns a RemoteException wrapping a FileNotFoundException.
> {code:java}
> {"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File /user/hive/warehouse/test_part/dt%3D1 does not exist."}}
> {code}
> Could you please tell me whether this is a bug and how to avoid it?