People may consider monoliths bad smells but that doesn't mean they're right. Facebook's git repository was reported to be 54 GB in April last year [1]. Google store *everything* in one repo [2]. We are obviously different to these mega organisations but I'd like to see more justification than fashion.

My hunch says the separation is reasonable, despite introducing its own set of annoyances. It would be interesting to confirm this statistically. What files/modules are generally changed together?
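One rough way to look at this is to count how often two modules are touched in the same commit. The sketch below is only an illustration of that idea, not a tested analysis of the Brooklyn history: the `--commit--` marker, the function names and the choice of "first N path components" as the module name are all my own assumptions.

```python
import subprocess
from collections import Counter
from itertools import combinations

# Arbitrary marker printed before each commit's file list (an assumption of
# this sketch, not anything git defines).
MARKER = "--commit--"


def git_log_names(repo="."):
    """Return `git log --name-only` output for `repo`, one marker per commit."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:" + MARKER],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_commits(log_text):
    """Split the log text into one set of changed file paths per commit."""
    commits, current = [], None
    for line in log_text.splitlines():
        line = line.strip()
        if line == MARKER:
            current = set()
            commits.append(current)
        elif line and current is not None:
            current.add(line)
    return commits


def module_of(path, depth=1):
    """Treat the first `depth` path components as the module name."""
    return "/".join(path.split("/")[:depth])


def co_change_counts(commits, depth=1):
    """Count how often each pair of modules appears in the same commit."""
    pairs = Counter()
    for files in commits:
        modules = sorted({module_of(f, depth) for f in files})
        pairs.update(combinations(modules, 2))
    return pairs
```

Running `co_change_counts(parse_commits(git_log_names()), depth=2).most_common(20)` in a checkout would list the most frequently co-changed module pairs. Files in the repository root show up as their own "modules", and merge commits contribute nothing because `git log` prints no file list for them by default.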

Sam

1. https://twitter.com/feross/status/459259593630433280
2. http://www.engadget.com/2015/09/18/googles-codebase-is-ludicrously-huge-for-good-reason/

On 26/11/2015 11:33, Alex Heneveld wrote:

Richard-

In case you missed it:

> It would be possible to put it into *brooklyn* but that
> would be rather confusing for people who land on that project and see a
> bunch of code but nothing useful; if anything of substance were in
> *brooklyn* people would probably expect it to be *brooklyn-server* which is
> a possibility but the consensus has been that it is better to keep it
> extremely light so as not to mask the other projects, any of which might be
> what someone visiting would be interested in.

To be blunt, putting the dist folders into `apache/brooklyn` feels like putting dirty laundry in a shop window.

Also worth noting that in recent communities, like Go, and Chef, and NodeJS, projects will routinely check out LOTS of other git projects. The monolith codebase is a bad smell to many people.

To expand and address your point:

> All of these [FILES] except pom.xml would also be present in every *other*
> repository, with minor modifications. This repo is not pulling its
> weight.

The READMEs will be dramatically different and if we moved `dist` to `brooklyn` then the README -- and the project -- would refer to how to build the dist. Which I think is NOT what anyone coming to http://github.com/apache/brooklyn would expect to see. That in itself causes me to favour the separate project for dist.

There has also been discussion -- generally positive -- for adding git submodules to apache/brooklyn. Whatever form it takes, we want `apache/brooklyn` to facilitate getting ALL the projects. This is a different intention to the `brooklyn-dist` repo -- which for most of its purposes won't even require that the other projects be checked out. And it would be confused by combining the projects.

In short, in order to encourage contribution I want to go for the project structure that is the most welcoming, both to older big-codebase developers like me, and to the more recent more-projects=better school; simplifying our lives is a consideration but not an overwhelming one in this instance. If apache/brooklyn even as a very small project makes the welcome experience nicer then I think it does more than carry its weight.

Best
Alex


On 26/11/2015 11:12, Richard Downer wrote:
Sorry, I missed the [DISCUSS] thread and posted on the [VOTE] thread.

But to repeat:

* brooklyn - all files in the root (no subdirs)
The files in the root are:

LICENSE
NOTICE
README.md
.gitignore
.gitattributes
pom.xml

All of these except pom.xml would also be present in every *other*
repository, with minor modifications. This repo is not pulling its
weight.

* brooklyn-dist
     usage/all
     usage/dist
     usage/scripts
     usage/downstream-parent
     usage/archetypes
My recommendation: drop `brooklyn-dist` and put all of this stuff into
`brooklyn`.

I'm with Mike on this one; 6 repositories - now 7 if there's a new one
for the Go CLI - risks overcomplicating things (and not just for
newcomers).

A top-level project for the distribution and odds and ends, plus
projects for "server", "web UI", "CLI" and "docs" is IMO acceptable -
to most observers it's a logical split and people would know where
they need to work. I'd accept "library" on the basis that we're trying
to obsolete it by having a catalog of pure-YAML blueprints, but I
suspect that it is going to hang around for quite a long time.

Richard.

On 26 November 2015 at 01:44, Alex Heneveld
<[email protected]> wrote:
// migrating Mike's +0 comments to this thread

> I think the proposed repo breakdown+organization is very logical, but my
> concern is that it spreads everything out amongst too many repos,
> potentially making it more difficult for newcomers to get a handle on
> things and creating too many scenarios with groups of dependent PRs across
> several repos.

I'm glad you raised this. I grew up with big codebase projects and have a
soft spot for them for the reasons you bring up.  But there is a trend
towards multiple smaller projects, and I've had a fair amount of feedback
that the brooklyn codebase is big and hard to get to grips with.  While
sub-projects will make it a little harder to get to grips with all of it, it
should simplify a lot where someone wants to get to grips with a part of it.
And a potential contributor would be looking to contribute to just a part in
the first instance. You don't need to worry about the JS or Go projects
if you're working on server; and (even simpler for a new start) someone
working on the JS GUI or the Go client doesn't need to ever look at the
server.

The worst part I agree is likely to be the cross-project-PR-set but it
should only hit someone making a complex change (esp the REST API and the
JS or CLI client) -- thus the pain is reserved for us and (we hope!)
alleviated for everyone else.

The best part I think will be encouraging independent JS UI development --
where someone can run it with `grunt` pointing at a binary-downloaded brooklyn
and not have to rebuild server or touch java -- and the same for BOM yaml
files in library.

Best
Alex


On 25 November 2015 at 23:03, Alex Heneveld <[email protected]>
wrote:
Hi All-

Here is a summary of the recent "git repos" thread and online chat and
what we are currently proposing. I will follow-up with a [VOTE] thread for
the creation of the repos and project migration.  Please reply to this
thread if you want to discuss; leave that thread for casting votes.

We are planning to request the following git repos, mirrored at
github.com/apache/* :

* brooklyn - just pointers / summaries / scripts for working with the
other projects
* brooklyn-server - everything to run the Brooklyn server (w REST API and
Karaf)
* brooklyn-client - CLI for accessing REST server (in go; subject of other
vote)
* brooklyn-ui - the JS GUI
* brooklyn-library - blueprints and tools which run on brooklyn-server
* brooklyn-docs - documentation (in markdown/ruby)
* brooklyn-dist - for building distros, incl source + binary tgz w all the
above

More detail of the content of these repos is below.

The motivation for this is so that people can check out smaller projects and focus more easily on the pieces relevant to them, i.e. someone working on the UI or on the docs should not even need to look at server or dist. In addition, languages are consistent within projects. There will be times when a change necessitates PRs to multiple projects (e.g. new UI feature with support in REST API) but we believe the above split will minimise that.

The addition of the *brooklyn-dist* project is the only change which has
not been discussed at some length but its need was obvious when we
discussed it. (It would be possible to put it into *brooklyn* but that would be rather confusing for people who land on that project and see a
bunch of code but nothing useful; if anything of substance were in
*brooklyn* people would probably expect it to be *brooklyn-server* which is
a possibility but the consensus has been that it is better to keep it
extremely light so as not to mask the other projects, any of which might be
what someone visiting would be interested in.)

There was also some discussion about the *brooklyn-server* project being called *brooklyn-commons* instead. The idea of a grassy commons is nice but *server* is a more descriptive and accurate name (as most of the other
projects don't depend on it per se).

Other key points:

* The releases we make will continue to look the same: the dist project will make a single big source and big binary, and maven artifacts for all maven projects. Automation in the brooklyn project or the brooklyn-dist
project will build the projects in all the other repos to minimise the
impact of the multiple repositories.
* If we include a submodules setup in *brooklyn* it will be optional;
people who don't like submodules won't have to use them. We may instead find that it is more convenient to provide scripts for working with the
other git modules.
* When we transfer the code to these repos we will exclude the offensive big files in the history. The incubator brooklyn repo will continue to exist but we will mark it as deprecated, forwarding to the new location.
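
To make the "scripts for working with the other git modules" option a bit more concrete, here is a minimal sketch of what such a helper might look like. It is an assumption-laden illustration, not the script the project will ship: the function names are invented, and the clone URLs assume the `github.com/apache/*` mirrors named above are used with the usual `.git` suffix.

```python
import os
import subprocess

# The six sibling repos from the proposal; `brooklyn` itself is the umbrella.
REPOS = [
    "brooklyn-server",
    "brooklyn-client",
    "brooklyn-ui",
    "brooklyn-library",
    "brooklyn-docs",
    "brooklyn-dist",
]

# Assumption: clone from the GitHub mirror rather than the ASF git server.
BASE_URL = "https://github.com/apache"


def clone_commands(target_dir=".", base_url=BASE_URL, repos=REPOS):
    """Plain `git clone` of every sibling repo into `target_dir`."""
    return [
        ["git", "clone", f"{base_url}/{name}.git", os.path.join(target_dir, name)]
        for name in repos
    ]


def submodule_commands(base_url=BASE_URL, repos=REPOS):
    """The optional alternative: register each repo as a git submodule."""
    return [
        ["git", "submodule", "add", f"{base_url}/{name}.git", name]
        for name in repos
    ]


def run_all(commands, cwd="."):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

Someone who dislikes submodules could call `run_all(clone_commands("work"))`, while `run_all(submodule_commands())` inside a checkout of `brooklyn` would set up the submodule variant. Building the command lists separately from running them keeps the two options easy to compare.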

IMPORTANT:  When we do this there will obviously be an impact on pull
requests and branch development. We will endeavor to work through the PR queue and we will give some notice before the changes. If you miss this it won't be hard to take diffs and apply them to the different structure but
it will be tedious!

Best
Alex


* brooklyn
     empty for now, contains a README and instructions/scripts for
     subprojects;
     e.g. git submodules

* brooklyn-dist
     usage/all -> all
     usage/dist -> dist
     usage/scripts -> scripts
     usage/downstream-parent
     usage/archetypes
     karaf (distro + other features - e.g. jsgui)
     ** new project brooklyn-server-cli which injects the JSGUI

* brooklyn-server
     parent
     api, core, policy
     util/*
     usage/rest-*
     camp/*
     usage/camp
     usage/cli -> usage/server-cli-abstract
         (assumes jsgui war is on classpath; injects into launcher)
     usage/logback-*
     usage/launcher (refactor so that WAR gets injected by server CLI)
     storage/hazelcast
     usage/qa
     usage/test-*
     karaf (code + itest + features needed for itest)

* brooklyn-ui
     jsgui

* brooklyn-client [subject of a separate vote]
     (new project, in go)

* brooklyn-library
     software/*
     examples/*
     sandbox/* (?!)

* brooklyn-docs
     docs
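
For anyone who needs to work out where the files in an open PR will end up, the breakdown above can be written down as a small lookup. This is a sketch under assumptions: the patterns are transcribed from the list as written (so `jsgui` is taken at face value), `karaf` is left out because it is split between two repos, and the names `LAYOUT`, `target_repo` and `group_by_repo` are made up for illustration.

```python
from fnmatch import fnmatchcase

# Old path patterns -> proposed repo. fnmatch's `*` also matches `/`, so
# "usage/rest-*" covers every file under usage/rest-api, usage/rest-server, etc.
LAYOUT = {
    "brooklyn-dist": [
        "usage/all/*", "usage/dist/*", "usage/scripts/*",
        "usage/downstream-parent/*", "usage/archetypes/*",
    ],
    "brooklyn-server": [
        "parent/*", "api/*", "core/*", "policy/*", "util/*",
        "usage/rest-*", "camp/*", "usage/camp/*", "usage/cli/*",
        "usage/logback-*", "usage/launcher/*", "storage/hazelcast/*",
        "usage/qa/*", "usage/test-*",
    ],
    "brooklyn-ui": ["jsgui/*"],
    "brooklyn-library": ["software/*", "examples/*", "sandbox/*"],
    "brooklyn-docs": ["docs/*"],
}


def target_repo(path):
    """Return the proposed repo for an old path, or None if it is not listed."""
    for repo, patterns in LAYOUT.items():
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return repo
    return None


def group_by_repo(paths):
    """Group changed paths by destination repo; unlisted paths go under None."""
    groups = {}
    for path in paths:
        groups.setdefault(target_repo(path), []).append(path)
    return groups
```

Feeding the file list of a PR to `group_by_repo` shows at a glance whether it will turn into a cross-project PR set. Paths that come back under `None` (root files, `karaf`) need a manual decision.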

END



