Hi,

I just found out about this discussion, so I realize I'm stepping in rather late with my feedback. Still, for what it's worth, here it is :-).

In general I'm against this proposal, as I believe it can cause more harm than good. The way I (and many others) see it, Lucene is a separate effort from Solr. I'm a *big* fan of Solr and (as some of you may know) I use it daily and promote it where and when I can. That said, I'm also a big fan of Lucene, and I believe Solr has its value and use cases while Lucene has its own.

Joining Solr with Lucene has the potential to create a "virtual" monopoly over Solr-like solutions built on top of Lucene. That is not community friendly and, more importantly, it puts the competition for Solr in jeopardy. IMO competition is a key advantage for products/projects. Yes, there is competition that will always come from the commercial vendors, but competition and challenges must also come from the open source community. This is a big part of what drives innovation. Furthermore, the community and the users of Lucene should have the power/ability to decide which solutions they want to go for - this is the true community-driven way of development.

I fully agree that there is a lot of duplication in the work that is currently being done in Solr. But it mainly originates in Solr, not in Lucene, and the Lucene community should not be bothered by that. Such duplicate work should be addressed in the Solr project. Take the analysis code, for example... if all the work that has gone into the analyzers in Solr had been committed to Lucene from the start, there wouldn't have been any duplication. The same goes for the spatial support and other duplicate work. Solr development has certainly proven to push Lucene development in many ways, and the best way to handle it is to contribute all this goodness back to Lucene. And yes, it means that Solr releases will need to wait for official Lucene releases, or in the meantime ship their own custom Lucene distributions, but this is the fair play that all Lucene-based solutions (be it Katta, ElasticSearch, Sensei, or any other) have to deal with.

Merging committers.
I believe this will create a proliferation of committers on these projects, which can bring a lot of mess. Let Lucene committers focus on what they do and know best - which is Lucene - and let Solr committers focus on Solr. If a Solr committer can bring a lot of value to Lucene, then yes, sure, make him/her a Lucene committer, but IMO being a Solr committer doesn't automatically give anyone the credentials or the skills to be a Lucene committer... mainly because the work done in Solr is often at a higher level and often not related to Lucene at all.

Single source for all the code dup we now have across the
    projects (my original reason, specifically on analyzers, for
    starting this).
As mentioned above, this can easily be done by contributing the changes to the analyzers back to Lucene.

Whenever a new feature is added to Lucene, we'd work through what
    the impact is to Solr.  This can still mean we separately develop
    exposure in Solr, but it'd get us to at least more immediately
    think about it.
This is something that Solr committers need to be responsible for, not Lucene committers. Lucene committers need to make sure that Lucene works and is bug free. I don't think it makes sense to push Solr responsibilities onto Lucene committers.

Solr is Lucene's biggest direct user -- most people who use Lucene
    use it through Solr -- so having it more closely integrated means
    we know sooner if we broke something.
I disagree here. I believe Lucene still has a larger install base than Solr. Think of Jackrabbit, which uses Lucene directly, and all the CMSs that use Jackrabbit. Think of frameworks like Compass and Hibernate Search (which use Lucene directly) that are used in a lot of JEE deployments around the world. And there are certainly a lot of large infrastructures that use Lucene directly as well (LinkedIn, for example). Solr is great at what it does, but it is certainly not everything when it comes to open source search or Lucene.

Right now I could test whether flex breaks anything in Solr.  I
    can't do that now since Solr isn't upgraded to 3.1.
True, but again, this is an issue Solr committers will have to deal with. And yes, it means that Solr will almost always be one step behind Lucene, but that's how it works with every dependency on every library you use. If you want to test the flex stuff and it's currently being developed on a separate Lucene branch, then you can create a separate Solr branch to see how it works and what future changes might need to go into Solr. Again, Lucene committers shouldn't have to bother with this problem, and the development of Lucene shouldn't be affected by Solr-related issues.

Also take into account the huge difference in release cycles between the projects. Lucene has quite a steady release cycle (last year it was fairly constant, with a release every 3 months or so). Solr, on the other hand, has longer release cycles that can span more than a year. A lot of the issues that stall Solr releases have nothing to do with Lucene, and the Lucene release cycle shouldn't suffer from that. Furthermore, users/projects/products that use Lucene directly should not suffer from that either. All the goodness that is developed in Lucene and all the bug fixes should be available for Lucene users to download as soon as they're ready - they shouldn't have to suffer from any Solr-related issues.

Please rest assured that my goal here is not to step on anyone's toes. I'm not a committer on either project, but I certainly want to see these two projects go in the right direction (at least the direction I believe is right). So I just wanted to express my concerns here.

Keep up the good work!

Cheers,
Uri
