[ https://issues.apache.org/jira/browse/HADOOP-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth updated HADOOP-13687:
-----------------------------------
Attachment: HADOOP-13687-trunk.002.patch
HADOOP-13687-branch-2.002.patch
I'm attaching revision 002 patches for trunk and branch-2, with the following
changes:
# [[email protected]] suggested that the artifact name {{hadoop-hcfs-client}}
might still be misleading, because it could make people think the artifact
really sweeps in every HCFS in the world. I switched the name to
{{hadoop-cloud-storage}}.
# The Jackson dependency problem that I mentioned earlier is really a bug in
the hadoop-aws build, and it's just a coincidence that I spotted it while
testing this new artifact. I have filed HADOOP-13692 with its own patch to
track that separately.
I repeated the same testing mentioned earlier with a custom application that
depends on the new artifact.
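For illustration, a downstream application's dependency on the new artifact might look like the fragment below. This is a minimal sketch, not taken from the patch: it assumes the usual {{org.apache.hadoop}} group ID and a {{hadoop.version}} property that the downstream build defines itself.

```xml
<!-- Sketch of a downstream pom.xml fragment; not copied from the patch. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-cloud-storage</artifactId>
    <!-- Assumption: hadoop.version is a property defined by the downstream project. -->
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```

With a declaration like this, the individual file system modules would arrive transitively, so the application would not need to list each one separately.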
{quote}
On the patch, from a build-system perspective, I think it makes more sense to
create a hadoop-hcfs dir, move the various file systems out of tools, and put
this code in there with them. This way there is a clear path of what is
expected especially if/when more file systems get added. It's also an
opportunity to pull these out of the tools dir in the distribution and actually
make them separate components. That'd make life easier in lots of ways.
{quote}
There was some discussion a few months ago on the dev list about a separate
sub-tree for the file systems, but the participants concluded that it wasn't
valuable. Can you describe in more detail what problems you see with the
current structure and how separating the file systems out of hadoop-tools makes
life easier? Maybe we missed something. hadoop-tools is getting bloated, but
I figured Hadoop 3 shell profiles with classpath customization were sufficient
to mitigate that.
I'm reluctant to take on a revamp of the tree within the scope of this JIRA,
but maybe we can lay some groundwork.
> Provide a unified dependency artifact that transitively includes the
> Hadoop-compatible file systems shipped with Hadoop.
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-13687
> URL: https://issues.apache.org/jira/browse/HADOOP-13687
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HADOOP-13687-branch-2.001.patch,
> HADOOP-13687-branch-2.002.patch, HADOOP-13687-trunk.001.patch,
> HADOOP-13687-trunk.002.patch
>
>
> Currently, downstream projects that want to integrate with different
> Hadoop-compatible file systems like WASB and S3A need to list dependencies on
> each one. This creates an ongoing maintenance burden for those projects,
> because they need to update their build whenever a new Hadoop-compatible file
> system is introduced. This issue proposes adding a new artifact that
> transitively includes all Hadoop-compatible file systems. Similar to
> hadoop-client, this new artifact will consist of just a pom.xml listing the
> individual dependencies. Downstream users can depend on this artifact to
> sweep in everything, and picking up a new file system in a future version
> will be just a matter of updating the Hadoop dependency version.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)