[
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286021#comment-15286021
]
Chris Douglas commented on HADOOP-12666:
----------------------------------------
Guessing that trainwreck on Jenkins was before HADOOP-13161. Will wait for an
all-clear before resubmitting.
[~vishwajeet.dusane], could you review the proposed changes to
{{CachedRefreshTokenBasedAccessTokenProvider}}? It does not reconfigure all
running clients on {{setConf}}; it only shares an instance among clients
created with the same ID. The replacement criteria were a guess based on the
parameters used by {{ConfRefreshTokenBasedAccessTokenProvider}}, so please
correct it as appropriate.
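To make the sharing behavior concrete, here is a minimal sketch of the idea, not the code in the patch. Everything in it is hypothetical: the {{SharedTokenProviders}} class, the {{TokenProvider}} stand-in interface, and especially the replacement criterion (a changed refresh token), which is an assumption for illustration only. It shows one instance being shared per client ID, and replaced only when the credential associated with that ID changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: share one token provider per client ID. */
final class SharedTokenProviders {

  /** Stand-in for an access token provider; NOT the Hadoop interface. */
  interface TokenProvider {
    String clientId();
    String refreshToken();
  }

  /** Immutable provider holding the values assumed to identify a credential. */
  static final class SimpleProvider implements TokenProvider {
    private final String clientId;
    private final String refreshToken;

    SimpleProvider(String clientId, String refreshToken) {
      this.clientId = clientId;
      this.refreshToken = refreshToken;
    }

    @Override
    public String clientId() {
      return clientId;
    }

    @Override
    public String refreshToken() {
      return refreshToken;
    }
  }

  // Keyed by client ID, so clients created with the same ID share an instance.
  private final Map<String, TokenProvider> cache = new ConcurrentHashMap<>();

  /**
   * Returns the provider cached for clientId. The cached instance is replaced
   * only when the refresh token differs (assumed criterion, for illustration).
   * Clients already holding the old instance are not reconfigured.
   */
  TokenProvider get(String clientId, String refreshToken) {
    return cache.compute(clientId, (id, current) ->
        (current != null && current.refreshToken().equals(refreshToken))
            ? current
            : new SimpleProvider(id, refreshToken));
  }
}
```

In this sketch a client that already holds a provider keeps using it after a replacement; only clients created afterwards see the new instance, which matches the "does not reconfigure all running clients" behavior described above.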
HADOOP-13037 will remove the dependency on WebHDFS, largely rewriting this
client. The buffering in {{PrivateAzureDataLakeFileSystem}} should also be
rewritten. It's implementing something like demand-paging, but some of the
[control
flow|https://issues.apache.org/jira/browse/HADOOP-12666?focusedCommentId=15283360&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15283360]
would be more powerful, and more understandable, if it were layered more
conventionally. Configuring the client is also very complex. I tried the
directions, but only arrived at a working client with Vishwajeet's help.
The target version is 2.9, but we should hold off on backporting this until
it's easier to use and maintain. I would like to commit the result of review
from [~cnauroth], [~eddyxu], [~twu], [~fabbri], and [~mackrorysd] to trunk.
It'll be easier to fix up the patch in targeted JIRAs. Committing the contract
tests in HADOOP-12875 would also be helpful. This would be with the caveats
from HDFS-9938: this module may be removed if it impedes WebHDFS development.
Further, it should be easier to configure before we include it in a release. Is
this an acceptable path forward?
> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
> Key: HADOOP-12666
> URL: https://issues.apache.org/jira/browse/HADOOP-12666
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/azure, tools
> Reporter: Vishwajeet Dusane
> Assignee: Vishwajeet Dusane
> Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf,
> HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch,
> HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-007.patch,
> HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-010.patch,
> HADOOP-12666-011.patch, HADOOP-12666-012.patch, HADOOP-12666-013.patch,
> HADOOP-12666-014.patch, HADOOP-12666-1.patch
>
> Original Estimate: 336h
> Time Spent: 336h
> Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing
> Hadoop applications such as MR, Hive, HBase, etc., to use ADL store as
> input or output.
>
> ADL is ultra-high capacity, optimized for massive throughput with rich
> management and security features. More details are available at
> https://azure.microsoft.com/en-us/services/data-lake-store/