[
https://issues.apache.org/jira/browse/HADOOP-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051631#comment-16051631
]
Steve Loughran commented on HADOOP-14469:
-----------------------------------------
I know the final size of the list may be less than the status.length
value, but that will be rare (otherwise the problem would have surfaced
earlier). Remember that the initial capacity doesn't stop the list ending up
smaller than that capacity. What it does do is preallocate enough space for
that many objects, avoiding extra memory allocations & copying as the list
grows. Because we know the max size of the list, we should use it.
So: use the length, and we get a more efficient codepath in the default "no .
or .. coming back" case, and only waste the space reserved for two status
entries in the other case.
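To make the capacity point concrete, here is a minimal sketch, not the actual patch. It assumes the fix filters out the "." and ".." entries while building the result list; the class name ListingFilter and the method filterEntries are hypothetical, and plain strings stand in for FTPFile/FileStatus so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the listing loop; plain strings replace
// FTPFile/FileStatus so the sketch has no Hadoop dependency.
class ListingFilter {

  static List<String> filterEntries(String[] names) {
    // names.length is an upper bound on the result size, so passing it as
    // the initial capacity means the backing array never has to grow.
    List<String> kept = new ArrayList<>(names.length);
    for (String name : names) {
      // Skip the current-directory and parent-directory entries that some
      // servers include in a listing.
      if (".".equals(name) || "..".equals(name)) {
        continue;
      }
      kept.add(name);
    }
    // The list may hold fewer elements than its capacity; at most two
    // preallocated slots stay unused.
    return kept;
  }

  public static void main(String[] args) {
    String[] listing = {".", "..", "hadoop", "HADOOP-14431-002.patch"};
    System.out.println(filterEntries(listing));
  }
}
```

The capacity argument only sizes the backing array; size() still reports the number of elements actually added, so callers see no difference when entries are skipped.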
> FTPFileSystem#listStatus get currentPath and parentPath at the same time,
> causing recursively list action endless
> -----------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-14469
> URL: https://issues.apache.org/jira/browse/HADOOP-14469
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs, tools/distcp
> Affects Versions: 3.0.0-alpha3
> Environment: ftp build by windows7 + Serv-U_64 12.1.0.8
> code runs any os
> Reporter: Hongyuan Li
> Assignee: Hongyuan Li
> Priority: Critical
> Attachments: HADOOP-14469-001.patch, HADOOP-14469-002.patch,
> HADOOP-14469-003.patch
>
>
> For some FTP servers (for example, Serv-U), the listStatus method will also
> return new Path(".") and new Path(".."), causing a recursive list operation
> to loop. We can see the logic in the code below:
> {code}
> private FileStatus[] listStatus(FTPClient client, Path file)
>     throws IOException {
>   ……
>   FileStatus[] fileStats = new FileStatus[ftpFiles.length];
>   for (int i = 0; i < ftpFiles.length; i++) {
>     fileStats[i] = getFileStatus(ftpFiles[i], absolute);
>   }
>   return fileStats;
> }
> {code}
> {code}
> public void test() throws Exception {
>   FTPFileSystem ftpFileSystem = new FTPFileSystem();
>   ftpFileSystem.initialize(
>       new Path("ftp://test:[email protected]/").toUri(),
>       new Configuration());
>   FileStatus[] fileStatus = ftpFileSystem.listStatus(new Path("/new"));
>   for (FileStatus fileStatus1 : fileStatus) {
>     System.out.println(fileStatus1);
>   }
> }
> {code}
> Using the test code above, the results are listed below:
> {code}
> FileStatus{path=ftp://test:[email protected]/new; isDirectory=true;
> modification_time=1496716980000; access_time=0; owner=user; group=group;
> permission=---------; isSymlink=false}
> FileStatus{path=ftp://test:[email protected]/; isDirectory=true;
> modification_time=1496716980000; access_time=0; owner=user; group=group;
> permission=---------; isSymlink=false}
> FileStatus{path=ftp://test:[email protected]/new/hadoop; isDirectory=true;
> modification_time=1496716980000; access_time=0; owner=user; group=group;
> permission=---------; isSymlink=false}
> FileStatus{path=ftp://test:[email protected]/new/HADOOP-14431-002.patch;
> isDirectory=false; length=2036; replication=1; blocksize=4096;
> modification_time=1495797780000; access_time=0; owner=user; group=group;
> permission=---------; isSymlink=false}
> FileStatus{path=ftp://test:[email protected]/new/HADOOP-14486-001.patch;
> isDirectory=false; length=1322; replication=1; blocksize=4096;
> modification_time=1496716980000; access_time=0; owner=user; group=group;
> permission=---------; isSymlink=false}
> FileStatus{path=ftp://test:[email protected]/new/hadoop-main;
> isDirectory=true; modification_time=1495797120000; access_time=0; owner=user;
> group=group; permission=---------; isSymlink=false}
> {code}
> In the results above, {{FileStatus{path=ftp://test:[email protected]/new; ……}}
> is obviously the current Path, and
> {{FileStatus{path=ftp://test:[email protected]/;……}} is obviously the
> parent Path.
> So, if we walk the directory recursively, the walk will get stuck in an
> endless loop.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)