[jira] [Commented] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

Aaron Fabbri (JIRA) Mon, 08 Feb 2016 14:54:06 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137927#comment-15137927 ]


Aaron Fabbri commented on HADOOP-12666:
---------------------------------------

{code}
+  /**
+   * Constructor.
+   */
+  public CachedRefreshTokenBasedAccessTokenProvider() {
+    super();
+    if (instance == null) {
+      instance = new ConfRefreshTokenBasedAccessTokenProvider();
+    }
+  }
{code}

You can omit the call to super() here; the compiler inserts it implicitly.

Same thing in PrivateDebugAzureDataLakeFileSystem().
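
To illustrate the point about super(): a minimal, self-contained sketch (the class names below are hypothetical and not from the patch) showing that a constructor without an explicit super() call still runs the no-argument superclass constructor first.

```java
public class ImplicitSuperDemo {
  // Records the order in which constructors run.
  static final StringBuilder LOG = new StringBuilder();

  // Hypothetical base class standing in for the real parent type.
  static class Base {
    Base() {
      LOG.append("Base;");
    }
  }

  // Explicit call to super(): legal, but redundant.
  static class WithExplicitSuper extends Base {
    WithExplicitSuper() {
      super();
      LOG.append("Explicit;");
    }
  }

  // No call to super(): the compiler inserts it as the first statement.
  static class WithImplicitSuper extends Base {
    WithImplicitSuper() {
      LOG.append("Implicit;");
    }
  }

  static String run() {
    LOG.setLength(0);
    new WithExplicitSuper();
    new WithImplicitSuper();
    return LOG.toString();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Both subclasses log the Base constructor before their own body, so dropping the explicit call does not change behavior.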

{code}
+package com.microsoft.azure.datalake.store;
+
+import org.apache.hadoop.hdfs.web.PrivateAzureDataLakeFileSystem;
+
+class AdlFileSystem extends PrivateAzureDataLakeFileSystem {
+
{code}
Why is {{PrivateAzureDataLakeFileSystem}} public?

More importantly, shouldn't you move all the code into 
{{org.apache.hadoop.fs.azure}}?  As is, it is spread across 
{{com.microsoft.azure}}, {{org.apache.hadoop.fs.azure}}, and 
{{org.apache.hadoop.hdfs.web}}.

{code}
+package org.apache.hadoop.hdfs.web;
+
+/**
+ * Constants.
+ */
+public final class ADLConfKeys {
+  public static final String
+      ADL_FEATURE_CONCURRENT_READ_AHEAD_MAX_CONCURRENT_CONN =
+      "ADL.Feature.Override.ReadAhead.MAX.Concurrent.Connection";
+  public static final int
+      ADL_FEATURE_CONCURRENT_READ_AHEAD_MAX_CONCURRENT_CONN_DEFAULT = 2;
+  public static final String ADL_EVENTS_TRACKING_SOURCE =
+      "adl.events.tracking.source";
+  public static final String ADL_EVENTS_TRACKING_CLUSTERNAME =
+      "adl.events.tracking.clustername";
+  public static final String ADL_TRACKING_JOB_ID = "adl.tracking.job.id";
{code}

Please use all-lowercase config names consistently, and document them in 
{{core-default.xml}}.
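
A rough sketch of what the lowercase naming could look like. The class name {{AdlConfKeysSketch}}, the lowercase spelling of the read-ahead key, and the {{isLowercaseKey}} helper are assumptions for illustration only; the final key names are up to the patch author.

```java
import java.util.Locale;

/**
 * Hypothetical sketch of the constants class with all-lowercase key names.
 */
public final class AdlConfKeysSketch {
  // Assumed lowercase form of
  // "ADL.Feature.Override.ReadAhead.MAX.Concurrent.Connection".
  public static final String
      ADL_FEATURE_CONCURRENT_READ_AHEAD_MAX_CONCURRENT_CONN =
      "adl.feature.override.readahead.max.concurrent.connection";
  public static final int
      ADL_FEATURE_CONCURRENT_READ_AHEAD_MAX_CONCURRENT_CONN_DEFAULT = 2;

  // These keys are already lowercase in the patch.
  public static final String ADL_EVENTS_TRACKING_SOURCE =
      "adl.events.tracking.source";
  public static final String ADL_EVENTS_TRACKING_CLUSTERNAME =
      "adl.events.tracking.clustername";
  public static final String ADL_TRACKING_JOB_ID = "adl.tracking.job.id";

  private AdlConfKeysSketch() {
    // Constants holder; not instantiable.
  }

  /** Illustrative helper: true if the key contains no uppercase letters. */
  static boolean isLowercaseKey(String key) {
    return key.equals(key.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(isLowercaseKey(
        ADL_FEATURE_CONCURRENT_READ_AHEAD_MAX_CONCURRENT_CONN));
  }
}
```

A check like {{isLowercaseKey}} could live in a unit test so that mixed-case keys do not creep back in.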

Need to run; more comments later.



> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: HADOOP-12666-002.patch, HADOOP-12666-003.patch, 
> HADOOP-12666-004.patch, HADOOP-12666-005.patch, HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>          Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft 
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing 
> Hadoop applications such as MR, Hive, HBase, etc., to use ADL store as 
> input or output.
>  
> ADL is ultra-high capacity and optimized for massive throughput, with rich 
> management and security features. More details are available at 
> https://azure.microsoft.com/en-us/services/data-lake-store/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
