Re: [DISCUSS] new repos and project migration

Sam Corbett Thu, 26 Nov 2015 05:13:50 -0800

People may consider monoliths bad smells but that doesn't mean they're right. Facebook's git repository was reported to be 54 GB in April last year [1]. Google store *everything* in one repo [2]. We are obviously different to these mega organisations but I'd like to see more justification than fashion.

My hunch says the separation is reasonable, despite introducing its own set of annoyances. It would be interesting to confirm this statistically. What files/modules are generally changed together?
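
One rough way to answer the "changed together" question, as a minimal sketch rather than a finished analysis: count how often pairs of modules show up in the same commit. It assumes the history is dumped with `git log --name-only --pretty=format:--commit--` (the `--commit--` marker is just a separator chosen for this sketch) and that a "module" is the first `depth` path components, which is a simplification.

```python
from collections import Counter
from itertools import combinations

# Hypothetical separator line, produced by --pretty=format:--commit--
COMMIT_MARKER = "--commit--"


def module_of(path, depth=1):
    """Treat the first `depth` path components as the module (an assumption)."""
    return "/".join(path.split("/")[:depth])


def co_change_counts(log_text, depth=1):
    """Count how often each pair of modules is touched by the same commit."""
    counts = Counter()
    modules = set()
    # The extra trailing marker flushes the final commit.
    for line in log_text.splitlines() + [COMMIT_MARKER]:
        line = line.strip()
        if line == COMMIT_MARKER:
            counts.update(combinations(sorted(modules), 2))
            modules = set()
        elif line:
            modules.add(module_of(line, depth))
    return counts
```

Pairs with high counts that would land in different repos are the ones likely to need cross-repo PR sets; `co_change_counts(text, depth=2).most_common(20)` would be a reasonable first look.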


Sam

1. https://twitter.com/feross/status/459259593630433280

2. http://www.engadget.com/2015/09/18/googles-codebase-is-ludicrously-huge-for-good-reason/


On 26/11/2015 11:33, Alex Heneveld wrote:

Richard-

In case you missed it:

> It would be possible to put it into *brooklyn* but that
> would be rather confusing for people who land on that project and see a
> bunch of code but nothing useful; if anything of substance were in
> *brooklyn* people would probably expect it to be *brooklyn-server* which is
> a possibility but the consensus has been that it is better to keep it
> extremely light so as not to mask the other projects, any of which might be
> what someone visiting would be interested in.
To be blunt, putting the dist folders into `apache/brooklyn` feels like putting dirty laundry in a shop window.
Also worth noting that in recent communities, like Go, and Chef, and NodeJS, projects will routinely check out LOTS of other git projects. The monolith codebase is a bad smell to many people.
To expand and address your point:
> All of these [FILES] except pom.xml would also be present in every *other*
> repository, with minor modifications. This repo is not pulling its
> weight.
The READMEs will be dramatically different and if we moved `dist` to `brooklyn` then the README -- and the project -- would refer to how to build the dist. Which I think is NOT what anyone coming to http://github.com/apache/brooklyn would expect to see. That in itself causes me to favour the separate project for dist.
There has also been discussion -- generally positive -- for adding git submodules to apache/brooklyn. Whatever form it takes, we want `apache/brooklyn` to facilitate getting ALL the projects. This is a different intention to the `brooklyn-dist` repo -- which for most of its purposes won't even require that the other projects be checked out. And it would be confused by combining the projects.
In short, in order to encourage contribution I want to go for the project structure that is the most welcoming, both to older big-codebase developers like me, and to the more recent more-projects=better school; simplifying our lives is a consideration but not an overwhelming one in this instance. If apache/brooklyn even as a very small project makes the welcome experience nicer then I think it does more than carry its weight.
Best
Alex


On 26/11/2015 11:12, Richard Downer wrote:
Sorry, I missed the [DISCUSS] thread and posted on the [VOTE] thread.

But to repeat:
* brooklyn - all files in the root (no subdirs)
The files in the root are:

LICENSE
NOTICE
README.md
.gitignore
.gitattributes
pom.xml

All of these except pom.xml would also be present in every *other*
repository, with minor modifications. This repo is not pulling its
weight.
* brooklyn-dist
     usage/all
     usage/dist
     usage/scripts
     usage/downstream-parent
     usage/archetypes
My recommendation: drop `brooklyn-dist` and put all of this stuff into
`brooklyn`.

I'm with Mike on this one; 6 repositories - now 7 if there's a new one
for the Go CLI - risks overcomplicating things (and not just for
newcomers).

A top-level project for the distribution and odds and ends, plus
projects for "server", "web UI", "CLI" and "docs" is IMO acceptable -
to most observers it's a logical split and people would know where
they need to work. I'd accept "library" on the basis that we're trying
to obsolete it by having a catalog of pure-YAML blueprints, but I
suspect that it is going to hang around for quite a long time.

Richard.

On 26 November 2015 at 01:44, Alex Heneveld
<[email protected]> wrote:
// migrating Mike's +0 comments to this thread
I think the proposed repo breakdown+organization is very logical, but my
concern is that it spreads everything out amongst too many repos,
potentially making it more difficult for newcomers to get a handle on
things and creating too many scenarios with groups of dependent PRs across
several repos.
I'm glad you raised this. I grew up with big codebase projects and have a
soft spot for them for the reasons you bring up.  But there is a trend
towards multiple smaller projects, and I've had a fair amount of feedback
that the brooklyn codebase is big and hard to get to grips with.  While
sub-projects will make it a little harder to get to grips with all of it,
it should simplify a lot where someone wants to get to grips with a part of
it. And a potential contributor would be looking to contribute to just a
part in the first instance. You don't need to worry about the JS or Go
projects if you're working on server; and (even simpler for a new start)
someone working on the JS GUI or the Go client doesn't need to ever look at
the server.

The worst part I agree is likely to be the cross-project-PR-set but it
should only hit someone making a complex change (esp the REST API and the
JS or CLI client) -- thus the pain is reserved for us and (we hope!)
alleviated for everyone else.
The best part I think will be encouraging independent JS UI development --
where someone can run it with `grunt` pointing at a binary-downloaded
brooklyn and not have to rebuild server or touch java -- and the same for
BOM yaml files in library.

Best
Alex
On 25 November 2015 at 23:03, Alex Heneveld <[email protected]>
wrote:
Hi All-

Here is a summary of the recent "git repos" thread and online chat and
what we are currently proposing. I will follow up with a [VOTE] thread for
the creation of the repos and project migration.  Please reply to this
thread if you want to discuss; leave that thread for casting votes.

We are planning to request the following git repos, mirrored at
github.com/apache/* :

* brooklyn - just pointers / summaries / scripts for working with the
other projects
* brooklyn-server - everything to run the Brooklyn server (w REST API and
Karaf)
* brooklyn-client - CLI for accessing REST server (in go; subject of other
vote)
* brooklyn-ui - the JS GUI
* brooklyn-library - blueprints and tools which run on brooklyn-server
* brooklyn-docs - documentation (in markdown/ruby)
* brooklyn-dist - for building distros, incl source + binary tgz w all the
above

More detail of the content of these repos is below.

The motivation for this is so that people can check out smaller projects
and focus more easily on the pieces relevant to them. IE someone working
on the UI or on the docs should not even need to look at server or dist.
In addition languages are consistent within projects. There will be times
when a change necessitates PR's to multiple projects (e.g. new UI feature
with support in REST API) but we believe the above split will minimise that.

The addition of the *brooklyn-dist* project is the only change which has
not been discussed at some length but its need was obvious when we
discussed it. (It would be possible to put it into *brooklyn* but that
would be rather confusing for people who land on that project and see a
bunch of code but nothing useful; if anything of substance were in
*brooklyn* people would probably expect it to be *brooklyn-server* which is
a possibility but the consensus has been that it is better to keep it
extremely light so as not to mask the other projects, any of which might be
what someone visiting would be interested in.)

There was also some discussion about the *brooklyn-server* project being
called *brooklyn-commons* instead. The idea of a grassy commons is nice
but *server* is a more descriptive and accurate name (as most of the other
projects don't depend on it per se).

Other key points:
* The releases we make will continue to look the same: the dist project
will make a single big source and big binary, and maven artifacts for all
maven projects. Automation in the brooklyn project or the brooklyn-dist
project will build the projects in all the other repos to minimise the
impact of the multiple repositories.
* If we include a submodules setup in *brooklyn* it will be optional;
people who don't like submodules won't have to use them. We may instead
find that it is more convenient to provide scripts for working with the
other git modules.
* When we transfer the code to these repos we will exclude the offensive
big files in the history. The incubator brooklyn repo will continue to
exist but we will mark it as deprecated forwarding to the new location.

IMPORTANT:  When we do this there will obviously be an impact on pull
requests and branch development. We will endeavor to work through the PR
queue and we will give some notice before the changes. If you miss this it
won't be hard to take diffs and apply them to the different structure but
it will be tedious!
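
To make the "optional submodules or helper scripts" point above concrete, here is a minimal sketch of what such a script in *brooklyn* might generate. It only builds the git command lines, it does not run them. The repo names come from the list in this mail; the HTTPS URL form and the `.git` suffix are assumptions (the proposal only says the repos are mirrored at github.com/apache/*), and `checkout_commands` is a hypothetical helper name.

```python
# Sub-projects named in the proposal (the root "brooklyn" repo hosts the script).
REPOS = [
    "brooklyn-server",
    "brooklyn-client",
    "brooklyn-ui",
    "brooklyn-library",
    "brooklyn-docs",
    "brooklyn-dist",
]

# Assumption: HTTPS mirror URLs; the thread only says github.com/apache/*.
BASE_URL = "https://github.com/apache"


def checkout_commands(use_submodules=False, base_url=BASE_URL, repos=REPOS):
    """Return git commands (as argv lists) that fetch every sub-project."""
    verb = ["git", "submodule", "add"] if use_submodules else ["git", "clone"]
    return [verb + [f"{base_url}/{name}.git", name] for name in repos]


if __name__ == "__main__":
    # Print plain clone commands; pass use_submodules=True for the other style.
    for argv in checkout_commands():
        print(" ".join(argv))
```

Keeping both styles behind one flag reflects the intent that submodules stay optional: people who dislike them get plain clones of the same set of repos.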

Best
Alex


* brooklyn
     empty for now, contains a README and instructions/scripts for
subprojects;
     e.g. git submodules

* brooklyn-dist
     usage/all -> all
     usage/dist -> dist
     usage/scripts -> scripts
     usage/downstream-parent
     usage/archetypes
     karaf (distro + other features - e.g. jsgui)
     ** new project brooklyn-server-cli which injects the JSGUI

* brooklyn-server
     parent
     api, core, policy
     util/*
     usage/rest-*
     camp/*
     usage/camp
     usage/cli -> usage/server-cli-abstract
         (assumes jsgui war is on classpath; injects into launcher)
     usage/logback-*
     usage/launcher (refactor so that WAR gets injected by server CLI)
     storage/hazelcast
     usage/qa
     usage/test-*
     karaf (code + itest + features needed for itest)

* brooklyn-ui
     jsgui

* brooklyn-client [subject of a separate vote]
     (new project, in go)

* brooklyn-library
     software/*
     examples/*
     sandbox/* (?!)

* brooklyn-docs
     docs
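
The layout above can also be written down as data, which may help when re-applying an old diff to the new structure. This is a sketch only: the patterns are copied from the listing as written, `target_repo` is a hypothetical helper, and `karaf` is deliberately left out because the listing splits it between brooklyn-dist and brooklyn-server.

```python
from fnmatch import fnmatchcase

# (path pattern in the current repo, proposed target repo), in listing order.
LAYOUT = [
    ("usage/all", "brooklyn-dist"),
    ("usage/dist", "brooklyn-dist"),
    ("usage/scripts", "brooklyn-dist"),
    ("usage/downstream-parent", "brooklyn-dist"),
    ("usage/archetypes", "brooklyn-dist"),
    ("parent", "brooklyn-server"),
    ("api", "brooklyn-server"),
    ("core", "brooklyn-server"),
    ("policy", "brooklyn-server"),
    ("util/*", "brooklyn-server"),
    ("usage/rest-*", "brooklyn-server"),
    ("camp/*", "brooklyn-server"),
    ("usage/camp", "brooklyn-server"),
    ("usage/cli", "brooklyn-server"),
    ("usage/logback-*", "brooklyn-server"),
    ("usage/launcher", "brooklyn-server"),
    ("storage/hazelcast", "brooklyn-server"),
    ("usage/qa", "brooklyn-server"),
    ("usage/test-*", "brooklyn-server"),
    ("jsgui", "brooklyn-ui"),
    ("software/*", "brooklyn-library"),
    ("examples/*", "brooklyn-library"),
    ("sandbox/*", "brooklyn-library"),
    ("docs", "brooklyn-docs"),
]


def target_repo(path):
    """Return the proposed repo for a file path, or None if not listed."""
    for pattern, repo in LAYOUT:
        # Match the directory itself or anything underneath it.
        if fnmatchcase(path, pattern) or fnmatchcase(path, pattern + "/*"):
            return repo
    return None
```

For example, `target_repo("usage/rest-api/pom.xml")` gives `"brooklyn-server"`, while anything under `karaf/` returns `None` and needs a manual decision.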

END
