[jira] [Updated] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

Vishwajeet Dusane (JIRA) Mon, 07 Mar 2016 03:42:07 -0800

     [ 
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vishwajeet Dusane updated HADOOP-12666:
---------------------------------------
    Attachment: HADOOP-12666-007.patch

From the code review comments, there were 5 major issues that we need to call 
out. The actions we have taken to address them are listed below.

 * *Synchronization during read stream* – Although concurrent access to a 
single stream seems an unusual use case, we accepted the comment and added 
synchronized blocks around stream reads. The synchronized blocks add an 
average of 7 ms latency during read operations.
 * *Patch size too large & inclusion of live test cases* – We have split the 
patch into multiple JIRAs:
 ** HADOOP-12666 - core updates
 ** HADOOP-12875 - test updates, including the mechanism for live updates
 ** HADOOP-12876 - file metadata cache management
 ** Separate JIRAs for telemetry and instrumentation related updates will be 
raised once we agree on and commit HADOOP-12666
 * *FileStatus cache management* – Raised a separate JIRA, HADOOP-12876, to 
cover specific discussion around the cache.
 * *Package namespace (remove dependency on org.apache.hadoop.hdfs.web)* – To 
remove the dependency on the org.apache.hadoop.hdfs.web package, one proposal 
(https://reviews.apache.org/r/44169/) is to change the access level of the 
dependent parts of the org.apache.hadoop.hdfs.web package to public.
 * *Allow WebHDFS and ADL file systems to coexist (common configuration 
parameters like dfs.webhdfs.oauth2.refresh.token)* – As of today, only ADL is 
compliant with the OAuth2 protocol. Supporting ADL-specific configuration 
requires changes in ASF code. We would like to handle this as a separate 
change set rather than as part of this one.
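
To illustrate the synchronization point above, here is a minimal sketch of 
serializing reads on a shared stream. It is not the code from the patch: the 
class name SynchronizedReadStream and its getPos() method are hypothetical, 
and it assumes the wrapped stream is not thread-safe on its own. Every read 
takes the wrapper's monitor, which is the kind of per-call locking that could 
account for added read latency.

```java
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical wrapper (not the class from the patch): serializes all reads
 * on a shared stream by holding the wrapper's monitor for each call.
 */
class SynchronizedReadStream extends InputStream {
    private final InputStream in; // underlying stream, assumed not thread-safe
    private long position;        // bytes consumed so far, guarded by "this"

    SynchronizedReadStream(InputStream in) {
        this.in = in;
    }

    @Override
    public synchronized int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            position++; // advance only when a byte was actually returned
        }
        return b;
    }

    @Override
    public synchronized int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            position += n; // the read and the position update happen atomically
        }
        return n;
    }

    /** Returns the number of bytes consumed so far. */
    public synchronized long getPos() {
        return position;
    }

    @Override
    public synchronized void close() throws IOException {
        in.close();
    }
}
```

Because the read and the position update happen under the same lock, two 
threads sharing one instance cannot observe a position that disagrees with 
the bytes already consumed.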


> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: HADOOP-12666-002.patch, HADOOP-12666-003.patch, 
> HADOOP-12666-004.patch, HADOOP-12666-005.patch, HADOOP-12666-006.patch, 
> HADOOP-12666-007.patch, HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>          Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft 
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing 
> Hadoop applications such as MR, Hive, HBase, etc. to use ADL store as 
> input or output.
>  
> ADL is ultra-high capacity, optimized for massive throughput, with rich 
> management and security features. More details are available at 
> https://azure.microsoft.com/en-us/services/data-lake-store/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
