This is an automated email from the ASF dual-hosted git repository.

twolf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mina-sshd.git


The following commit(s) were added to refs/heads/master by this push:
     new d78c97d  [SSHD-1217] Update SFTP documentation
d78c97d is described below

commit d78c97db00d79d833347888cc2c0569df09cf101
Author: Thomas Wolf <tw...@apache.org>
AuthorDate: Thu Oct 28 21:39:21 2021 +0200

    [SSHD-1217] Update SFTP documentation
    
    Add a section on listing directories. Users have to be aware that an
    SftpFileSystem is a _remote file system_ and has different performance
    characteristics than a local file system.
---
 docs/sftp.md | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/docs/sftp.md b/docs/sftp.md
index 99a2625..5fb0acc 100644
--- a/docs/sftp.md
+++ b/docs/sftp.md
@@ -37,7 +37,7 @@ SftpSubsystemFactory factory = new SftpSubsystemFactory.Builder()
     .withExecutorServiceProvider(() -> ThreadUtils.noClose(mySpecialExecutor))
     .build();
 server.setSubsystemFactories(Collections.singletonList(factory));
-    
+
 ```
 
 ### `SftpEventListener`
@@ -139,7 +139,7 @@ try (ClientSession session = ...obtain session...) {
     try (SftpClient client = factory.createSftpClient(session)) {
         ... use the SFTP client...
     }
-    
+
     // NOTE: session is still alive here...
 }
 
@@ -446,6 +446,95 @@ try (ClientSession session = client.connect(...)) {
 
 On the server side, one can use the `SftpFileSystemAccessor#putRemoteFileName` to encode the returned file name/path using non-UTF8 encoding. However, this might break clients that expect UTF-8 - i.e., as long as both the client and server are somehow "aligned" on the encoding being used it will work. In this context, one might also need to consider implementing the `filename-charset` , `filename-translation-control` extensions as described in [DRAFT 13 - section 6](https://tools.ietf.or [...]
 
+### Listing SFTP directories
+
+Listing directories can be done in Java in various ways. With the Java NIO framework, a common approach is
+
+```java
+public void processDirectory(Path directoryPath, Consumer<Path> process) throws IOException {
+  try (DirectoryStream<Path> dir = Files.newDirectoryStream(directoryPath)) {
+    dir.iterator().forEachRemaining(path -> {
+      process.accept(path); // Do whatever needs to be done with 'path'
+    });
+  }
+}
+```
+This also works fine if the `Path` is an `SftpPath` obtained from an `SftpFileSystem`. But what if you also need the file _attributes_?
+
+Again, in plain Java NIO, one might do
+
+```java
+public void processDirectory(Path directoryPath, BiConsumer<Path, BasicFileAttributes> process) throws IOException {
+  try (DirectoryStream<Path> dir = Files.newDirectoryStream(directoryPath)) {
+    Iterator<Path> files = dir.iterator();
+    while (files.hasNext()) {
+      Path path = files.next();
+      BasicFileAttributes attributes = Files.readAttributes(path, BasicFileAttributes.class);
+      process.accept(path, attributes);
+    }
+  }
+}
+```
+This gets all the Paths of the files inside the directory, then reads their attributes one-by-one.
+On Unix, there is a variation using `Files.walkFileTree` that may have much better performance:
+
+```java
+public void processDirectory(Path directoryPath, BiConsumer<Path, BasicFileAttributes> process) throws IOException {
+  Files.walkFileTree(directoryPath, EnumSet.noneOf(FileVisitOption.class), 1,
+      new SimpleFileVisitor<Path>() {
+
+          @Override
+          public FileVisitResult visitFile(Path path, BasicFileAttributes attributes) {
+            // Beware: at maximum depth 1 this is also called for subdirectories, not just for regular files
+            process.accept(path, attributes);
+            return FileVisitResult.CONTINUE;
+          }
+      });
+}
+```
+This typically performs better on Unix because the file system can deliver the file attributes together with the
+paths, and the standard Java implementation of `FileVisitor` takes advantage of this. On Windows you'll typically
+not see any improvement because the file system stores attributes differently and has to fetch them separately anyway.
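
When `Files.walkFileTree` is used with a maximum depth of 1, it helps to know which callback receives which entry. The following is a minimal sketch on a local temporary directory; the class name `WalkDepthOneDemo` and the method `trace` are made up for this illustration. On the standard JDK implementation, the start directory is reported through `preVisitDirectory`, while the entries directly inside it, including subdirectories, arrive in `visitFile`.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

public class WalkDepthOneDemo {

  /** Records which callback fired for which path, relative to 'dir'. */
  static List<String> trace(Path dir) throws IOException {
    List<String> events = new ArrayList<>();
    Files.walkFileTree(dir, EnumSet.noneOf(FileVisitOption.class), 1,
        new SimpleFileVisitor<Path>() {

          @Override
          public FileVisitResult preVisitDirectory(Path d, BasicFileAttributes a) {
            // Fires for the start directory only, because deeper levels are cut off
            events.add("preVisitDirectory:" + dir.relativize(d));
            return FileVisitResult.CONTINUE;
          }

          @Override
          public FileVisitResult visitFile(Path f, BasicFileAttributes a) {
            // Fires for every direct entry; directories are marked with a trailing '/'
            events.add("visitFile:" + dir.relativize(f) + (a.isDirectory() ? "/" : ""));
            return FileVisitResult.CONTINUE;
          }
        });
    return events;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("walk-demo");
    Files.createFile(dir.resolve("a.txt"));
    Files.createDirectory(dir.resolve("sub"));
    Files.createFile(dir.resolve("sub").resolve("nested.txt"));
    trace(dir).forEach(System.out::println);
  }
}
```

The order in which `a.txt` and `sub` are reported is not specified; `nested.txt` is never visited because it lies below the maximum depth.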
+
+This is important when remote file systems come into play. With an `SftpFileSystem`, the call to `Files.readAttributes()`
+is a _remote call_ to the SFTP server, hence it's an expensive operation. Thus the first variant issues one remote call
+per file, which may make processing a directory with many files excruciatingly slow.
+
+SFTP has a directory model similar to Unix: a request for a directory listing always returns the file names _and_ the
+file attributes. But Java's `FileVisitor` doesn't know this, and doesn't know about `SftpFileSystem` at all -- to it, it's just
+a normal `java.nio.file.FileSystem`. Hence it doesn't use its internal optimization for Unix file systems and
+instead also calls `Files.readAttributes()` for each file under the hood. This makes the second variant also slow with an
+`SftpFileSystem`.
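
The cost difference can be made concrete with a toy model that only counts simulated requests. This is a sketch under stated assumptions, not MINA SSHD code: `RoundTripModel` and `RemoteDir` are hypothetical names, and the combined listing is modelled as a single request, whereas a real SFTP server typically delivers a large directory in several batches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoundTripModel {

  /** Hypothetical stand-in for a remote directory; counts simulated requests. */
  static final class RemoteDir {
    private final Map<String, Long> sizes = new LinkedHashMap<>();
    int requests;

    RemoteDir(int files) {
      for (int i = 0; i < files; i++) {
        sizes.put("file" + i, (long) i);
      }
    }

    /** One request that returns names only. */
    List<String> listNames() {
      requests++;
      return new ArrayList<>(sizes.keySet());
    }

    /** One request per file to fetch a single attribute. */
    long statSize(String name) {
      requests++;
      return sizes.get(name);
    }

    /** One request that returns names together with their attributes. */
    Map<String, Long> listWithAttributes() {
      requests++;
      return new LinkedHashMap<>(sizes);
    }
  }

  /** Listing first, then one attribute request per entry. */
  static int perFileRequests(int files) {
    RemoteDir dir = new RemoteDir(files);
    for (String name : dir.listNames()) {
      dir.statSize(name);
    }
    return dir.requests;
  }

  /** Names and attributes delivered by the listing itself. */
  static int combinedRequests(int files) {
    RemoteDir dir = new RemoteDir(files);
    dir.listWithAttributes();
    return dir.requests;
  }

  public static void main(String[] args) {
    System.out.println(perFileRequests(1000));  // prints 1001
    System.out.println(combinedRequests(1000)); // prints 1
  }
}
```

In this model the per-file approach grows linearly with the number of entries, which is why latency to the server dominates for large directories.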
+
+To get paths and attributes in an _efficient_ way from an `SftpFileSystem`, one has to bypass the `FileSystem` abstraction
+and use SFTP commands directly:
+
+```java
+Path directoryPath = ...;
+if (directoryPath instanceof SftpPath) {
+  try (SftpClient client = ((SftpPath) directoryPath).getFileSystem().getClient();
+       CloseableHandle handle = client.openDir(directoryPath.toString())) {
+    client.listDir(handle).iterator().forEachRemaining(directoryEntry -> {
+      SftpClient.Attributes attributes = directoryEntry.getAttributes();
+      String file = directoryEntry.getFilename();
+      if (".".equals(file)) {
+        // The directory itself.
+        process(directoryPath, attributes);
+      } else if ("..".equals(file)) {
+        // The parent directory, if any
+        process(directoryPath.getParent(), attributes);
+      } else {
+        process(directoryPath.resolve(file), attributes);
+      }
+    });
+  }
+} else {
+  // Not an SFTP path -- get the directory listing in whatever other way is appropriate.
+}
+```
+So even if an `SftpFileSystem` fulfills the general contract of a `FileSystem`, a client still has to be aware that
+it is a _remote file system_ that may have quite different performance characteristics than a local file system.
+
 ### SFTP aware directory scanners
 
 The framework provides special SFTP aware directory scanners that look for files/folders matching specific patterns. The
