[ 
https://issues.apache.org/jira/browse/HADOOP-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HADOOP-13687:
-----------------------------------
    Attachment: HADOOP-13687-trunk.002.patch
                HADOOP-13687-branch-2.002.patch

I'm attaching revision 002 patches for trunk and branch-2, with the following 
changes:

# [[email protected]] suggested that the artifact name {{hadoop-hcfs-client}} 
might still be misleading, since it could make people think it really sweeps 
in every HCFS in the world.  I switched the name to {{hadoop-cloud-storage}}.
# The Jackson dependency problem that I mentioned earlier is really a bug in 
the hadoop-aws build, and it's just a coincidence that I spotted it while 
testing this new artifact.  I have filed HADOOP-13692 with its own patch to 
track that separately.

I repeated the same testing mentioned earlier with a custom application that 
depends on the new artifact.
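
For illustration, a downstream application's {{pom.xml}} could pull in the new 
artifact with a single dependency, roughly as sketched below.  The {{groupId}} 
is assumed to match the other Hadoop artifacts, and the version property is a 
placeholder rather than a value taken from the patch.

{code:xml}
<!-- Sketch only: ${hadoop.version} is a placeholder property that the
     downstream build would define; it is not taken from the patch. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-cloud-storage</artifactId>
  <version>${hadoop.version}</version>
</dependency>
{code}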

{quote}
On the patch, from a build-system perspective, I think it makes more sense to 
create a hadoop-hcfs dir, move the various file systems out of tools, and put 
this code in there with them. This way there is a clear path of what is 
expected especially if/when more file systems get added. It's also an 
opportunity to pull these out of the tools dir in the distribution and actually 
make them separate components. That'd make life easier in lots of ways.
{quote}

There was some discussion a few months ago on the dev list about a separate 
sub-tree for the file systems, but the participants concluded that it wasn't 
valuable.  Can you describe in more detail what problems you see with the 
current structure and how separating the file systems out of hadoop-tools makes 
life easier?  Maybe we missed something.  hadoop-tools is getting bloated, but 
I figured Hadoop 3 shell profiles with classpath customization were sufficient 
to mitigate that.

I'm reluctant to take on a revamp of the tree in the scope of this JIRA, but 
maybe we can lay some groundwork.

> Provide a unified dependency artifact that transitively includes the 
> Hadoop-compatible file systems shipped with Hadoop.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13687
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13687
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13687-branch-2.001.patch, 
> HADOOP-13687-branch-2.002.patch, HADOOP-13687-trunk.001.patch, 
> HADOOP-13687-trunk.002.patch
>
>
> Currently, downstream projects that want to integrate with different 
> Hadoop-compatible file systems like WASB and S3A need to list dependencies on 
> each one.  This creates an ongoing maintenance burden for those projects, 
> because they need to update their build whenever a new Hadoop-compatible file 
> system is introduced.  This issue proposes adding a new artifact that 
> transitively includes all Hadoop-compatible file systems.  Similar to 
> hadoop-client, this new artifact will consist of just a pom.xml listing the 
> individual dependencies.  Downstream users can depend on this artifact to 
> sweep in everything, and picking up a new file system in a future version 
> will be just a matter of updating the Hadoop dependency version.



