[
https://issues.apache.org/jira/browse/HADOOP-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HADOOP-13687:
-----------------------------------
Attachment: HADOOP-13687-trunk.003.patch
HADOOP-13687-branch-2.003.patch
I'm attaching revision 003 patches for trunk and branch-2, showing the
structure Steve suggested in his last comment.
{code}
hadoop-cloud-storage-project
|- hadoop-azure-datalake
`- hadoop-cloud-storage
{code}
The trunk patch now looks huge because of {{git mv
hadoop-tools/hadoop-azure-datalake hadoop-cloud-storage-project}}. The
branch-2 patch is still small, because {{hadoop-azure-datalake}} doesn't exist
there.
One thing that wasn't clear to me is whether people are suggesting a change to just
the source layout or also the distro layout. Would we move the jars out of
share/hadoop/tools and into a new share/hadoop/cloud-storage directory? It
would be a backward-incompatible change, and I don't think it would add much
value, so I haven't made that change in this revision. If anyone wants to
lobby hard for a change in the distro layout, then we'll need additional
changes to introduce a {{hadoop-cloud-storage-dist}} module, with
{{hadoop-project-dist}} as its parent, the {{hadoop.component}} property set to
{{cloud-storage}}, and a new {{cloud-storage.xml}} descriptor file under
{{hadoop-assemblies}}.
bq. I think you could be more aggressive about the dependencies of the
openstack stuff; I suspect there is stuff there which could/should be tagged as
scope=provided, so tuning down the transitiveness more.
I haven't gone any further yet with this. Right now, the only additional
dependency that clients of {{hadoop-cloud-storage}} sweep in transitively is
commons-httpclient 3.1, which is required until we break that dependency
(tracked elsewhere in another JIRA). I really wanted to get rid of that
test-jar dependency though.
bq. Allen Wittenauer there's no chance of Yetus doing a mvn dependencies >
target/dependencies.txt operation on any patch which does poms? Or perhaps we
add the policy: all patches which update dependencies must attach the changed
dependency graph
I think this could potentially become a feature request for Yetus pre-commit to
run {{mvn dependency:list}} before and after the patch and diff the results.
If anything changes, it could render a -0 in the report (not blocking the
patch, but flagging that the dependency changes are worth further review).
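To make the idea concrete, here is a minimal sketch of that before/after comparison. It is not an existing Yetus feature. It assumes the output of {{mvn dependency:list}} has been captured as text for both the unpatched and patched trees, and that each resolved dependency appears on an {{[INFO]}} line as a colon-separated coordinate. The function names, the sample coordinates, and the {{+1}} result for "no change" are all made up for illustration.

```python
import re

# A resolved dependency looks like group:artifact:type:version:scope,
# optionally with a classifier between the type and the version.
_COORD = re.compile(r"^[\w.\-]+(:[\w.\-]+){4,5}$")


def parse_dependency_list(text):
    """Collect dependency coordinates from captured `mvn dependency:list` output."""
    deps = set()
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[INFO]"):
            line = line[len("[INFO]"):].strip()
        parts = line.split()
        # Only the first token is inspected, so any trailing text after
        # the coordinate is ignored.
        if parts and _COORD.match(parts[0]):
            deps.add(parts[0])
    return deps


def diff_dependencies(before_text, after_text):
    """Return (added, removed) coordinates between the two captured runs."""
    before = parse_dependency_list(before_text)
    after = parse_dependency_list(after_text)
    return sorted(after - before), sorted(before - after)


def vote(added, removed):
    """Hypothetical report value: -0 flags a change without blocking the patch.

    Returning +1 when nothing changed is an assumption of this sketch.
    """
    return "-0" if added or removed else "+1"


if __name__ == "__main__":
    # Illustrative input only; these are not real build logs.
    before = "[INFO]    org.example:fs-a:jar:1.0:compile\n"
    after = before + "[INFO]    commons-httpclient:commons-httpclient:jar:3.1:compile\n"
    added, removed = diff_dependencies(before, after)
    print(vote(added, removed), "added:", added, "removed:", removed)
```

Comparing sets of full coordinates means a version bump shows up as one removal plus one addition, which seems like the right thing to surface for review.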
> Provide a unified dependency artifact that transitively includes the
> Hadoop-compatible file systems shipped with Hadoop.
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-13687
> URL: https://issues.apache.org/jira/browse/HADOOP-13687
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HADOOP-13687-branch-2.001.patch,
> HADOOP-13687-branch-2.002.patch, HADOOP-13687-branch-2.003.patch,
> HADOOP-13687-trunk.001.patch, HADOOP-13687-trunk.002.patch,
> HADOOP-13687-trunk.003.patch
>
>
> Currently, downstream projects that want to integrate with different
> Hadoop-compatible file systems like WASB and S3A need to list dependencies on
> each one. This creates an ongoing maintenance burden for those projects,
> because they need to update their build whenever a new Hadoop-compatible file
> system is introduced. This issue proposes adding a new artifact that
> transitively includes all Hadoop-compatible file systems. Similar to
> hadoop-client, this new artifact will consist of just a pom.xml listing the
> individual dependencies. Downstream users can depend on this artifact to
> sweep in everything, and picking up a new file system in a future version
> will be just a matter of updating the Hadoop dependency version.