[
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vishwajeet Dusane updated HADOOP-12666:
---------------------------------------
Attachment: HADOOP-12666-007.patch
From the code review comments, there were 5 major issues that we need to call
out. We have listed below the actions we have taken to address those comments:
* *Synchronization during read stream* – Although concurrent access to a single
stream seems an unusual use case, we have accepted the comment and added
synchronized blocks throughout the read stream. The synchronized blocks add an
average of 7 ms of latency per read operation.
* *Patch size too large & inclusion of live test cases* – We have split the
patch into multiple JIRAs:
** HADOOP-12666 - core updates
** HADOOP-12875 - Test updates including mechanism for live updates.
** HADOOP-12876 - file metadata cache management
** Separate JIRAs for telemetry and instrumentation related updates will be
raised once we agree on and commit HADOOP-12666.
* *FileStatus cache management* – Raised a separate JIRA, HADOOP-12876, to cover
the specific discussion around the cache.
* *Package namespace (remove dependency on org.apache.hadoop.hdfs.web)* – To
remove the dependency on the org.apache.hadoop.hdfs.web package, one proposal
(https://reviews.apache.org/r/44169/) is to change the access level of the
dependent parts of the org.apache.hadoop.hdfs.web package to public.
* *Allow WebHDFS and ADL file systems to coexist (common configuration
parameters such as dfs.webhdfs.oauth2.refresh.token)* – As of today, only ADL is
compliant with the OAuth2 protocol. Supporting ADL-specific configuration
requires changes in ASF code. We would like to handle this as a separate change
set rather than as part of this one.
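
To illustrate the synchronization point above, here is a minimal sketch of what
synchronized read access on a stream can look like. It is not the code from the
patch: the class name SynchronizedReadStream and the pos field are hypothetical,
and it wraps a plain java.io.InputStream rather than the ADL stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical wrapper (not the class from HADOOP-12666-007.patch).
 * Every method that touches the underlying stream or the position
 * counter is synchronized, so concurrent readers cannot interleave
 * a read with a position update.
 */
class SynchronizedReadStream extends InputStream {
    private final InputStream in; // underlying stream, assumed not thread-safe
    private long pos;             // bytes consumed so far, guarded by "this"

    SynchronizedReadStream(InputStream in) {
        this.in = in;
    }

    @Override
    public synchronized int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    @Override
    public synchronized int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        return n;
    }

    /** Returns the number of bytes read so far. */
    public synchronized long getPos() {
        return pos;
    }

    @Override
    public synchronized void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        SynchronizedReadStream s =
            new SynchronizedReadStream(new ByteArrayInputStream(new byte[] {1, 2, 3, 4}));
        s.read(new byte[3], 0, 3);
        System.out.println(s.getPos()); // prints 3
        s.close();
    }
}
```

Each call acquires the monitor on the stream object, which is the kind of
per-read overhead the latency figure above refers to; the sketch itself makes
no claim about how large that overhead is.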
> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
> Key: HADOOP-12666
> URL: https://issues.apache.org/jira/browse/HADOOP-12666
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/azure, tools
> Reporter: Vishwajeet Dusane
> Assignee: Vishwajeet Dusane
> Attachments: HADOOP-12666-002.patch, HADOOP-12666-003.patch,
> HADOOP-12666-004.patch, HADOOP-12666-005.patch, HADOOP-12666-006.patch,
> HADOOP-12666-007.patch, HADOOP-12666-1.patch
>
> Original Estimate: 336h
> Time Spent: 336h
> Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing
> Hadoop applications such as MR, Hive, HBase, etc., to use ADL store as
> input or output.
>
> ADL is an ultra-high-capacity store, optimized for massive throughput, with rich
> management and security features. More details are available at
> https://azure.microsoft.com/en-us/services/data-lake-store/
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)