[GitHub] [hadoop] steveloughran commented on a change in pull request #2584: HADOOP-16202. Enhance openFile()

GitBox Fri, 17 Sep 2021 10:37:35 -0700


steveloughran commented on a change in pull request #2584:
URL: https://github.com/apache/hadoop/pull/2584#discussion_r711237755




##########
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java
##########
@@ -518,4 +522,112 @@ public String toString() {
     MD5MD5CRC,  // MD5 of block checksums, which are MD5 over chunk CRCs
     COMPOSITE_CRC  // Block/chunk-independent composite CRC
   }
+
+  /**
+   * The standard {@code openFile()} options.
+   */
+  @InterfaceAudience.Public
+  @InterfaceStability.Evolving
+  public static final class OpenFileOptions {
+
+    private OpenFileOptions() {
+    }
+
+    /**
+     * Prefix for all standard filesystem options: {@value}.
+     */
+    public static final String FILESYSTEM_OPTION = "fs.option.";
+
+    /**
+     * Prefix for all openFile options: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE =
+        FILESYSTEM_OPTION + "openfile.";
+
+    /**
+     * OpenFile option for file length: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_LENGTH =
+        FS_OPTION_OPENFILE + "length";
+
+    /**
+     * OpenFile option for split start: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_SPLIT_START =
+        FS_OPTION_OPENFILE + "split.start";
+
+    /**
+     * OpenFile option for split end: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_SPLIT_END =
+        FS_OPTION_OPENFILE + "split.end";
+
+    /**
+     * OpenFile option for buffer size: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_BUFFER_SIZE =
+        FS_OPTION_OPENFILE + "buffer.size";
+
+    /**
+     * OpenFile option for read policies: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_READ_POLICY =
+        FS_OPTION_OPENFILE + "read.policy";
+
+    /**
+     * Read policy for adaptive IO: {@value}.
+     */
+    public static final String FS_OPTION_OPENFILE_READ_POLICY_ADAPTIVE =

Review comment:
   Good point.
   
   I want us to have some basic names for these policies so they can be used 
more broadly. In particular, tools can ask for a policy such as "whole-file", 
knowing that an FS which recognises it will handle it properly.
   
   FWIW, whole-file is one I've realised abfs and s3a can do very efficiently:
   * switch to a read strategy of big buffers but fewer readers
   * start reading immediately
   
   In contrast, `sequential` is often used for splits, so lazy seek is needed 
and, as the read may end before the end of the file, smaller buffers still 
make sense.
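
To make the whole-file versus `sequential` distinction concrete, here is a minimal sketch of how a store might map the read policy option to a stream strategy. It is illustrative only: `ReadPolicySketch`, `Strategy`, `forOptions` and the buffer sizes are hypothetical and are not part of this PR or of any Hadoop API. Only the option key (as composed from the constants in the diff) and the two policy names come from the discussion above.

```java
import java.util.Map;

// Illustrative only: none of these types exist in Hadoop.
final class ReadPolicySketch {

  // Key as composed from the constants in the diff:
  // FILESYSTEM_OPTION + "openfile." + "read.policy"
  static final String READ_POLICY_KEY = "fs.option.openfile.read.policy";

  /** Hypothetical description of how an input stream would be set up. */
  static final class Strategy {
    final int bufferSize;     // bytes per read buffer (made-up values)
    final boolean eagerOpen;  // true: start reading at open; false: lazy seek

    Strategy(int bufferSize, boolean eagerOpen) {
      this.bufferSize = bufferSize;
      this.eagerOpen = eagerOpen;
    }
  }

  /** Pick a strategy from the options passed to an openFile()-style call. */
  static Strategy forOptions(Map<String, String> options) {
    String policy = options.getOrDefault(READ_POLICY_KEY, "");
    switch (policy) {
      case "whole-file":
        // Whole file will be read: big buffers, start the read immediately.
        return new Strategy(8 * 1024 * 1024, true);
      case "sequential":
        // Often a split: defer the request until the first read or seek,
        // and keep buffers small because the read may stop early.
        return new Strategy(64 * 1024, false);
      default:
        // Unrecognised or absent policy: assumed conservative default.
        return new Strategy(64 * 1024, false);
    }
  }

  public static void main(String[] args) {
    Strategy s = forOptions(Map.of(READ_POLICY_KEY, "whole-file"));
    System.out.println("bufferSize=" + s.bufferSize + " eagerOpen=" + s.eagerOpen);
  }
}
```

The point of standard names is visible in the `default` branch: a tool can always ask for "whole-file", and a store that does not recognise the name simply falls back to its usual behaviour.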



