I'm not a committer here (or on SOLR), so I can't vote, but I'm generally against this, even though I've been using SOLR for quite a while.

Firstly, SOLR is not the only application that uses Lucene as a webservice.

Waiting for SOLR developers to implement refactorings and changes made to the core will hamper Lucene development, and things like Katta, Elasticsearch, Neo4j, and Zoie will be treated like second-class citizens and suffer.

It will also hamper innovative new developments, since objections like 'oh.. this will break SOLR' or 'SOLR can't use that easily' will now stop them. I'm curious how the NRT enhancements and payload changes would have gone if they had to wait for SOLR to change stuff to make them work. And most of the SOLR devs are on the Lucene dev list anyway.

SOLR should just be treated like any other API user of Lucene, and Lucene should not be limited by SOLR.


As for the original reason: I support breaking out the analyzers and making them more generic, or pushing the changes SOLR (and Nutch and whoever)
have made back down into the core.

As for the assertion that SOLR is the largest user of Lucene, I don't even know how you could back that up, and even if it is today, that might change tomorrow.
The web is a fickle place.

So.. I'm pretty happy with how things are going today. Lucene is a library that other things can include. SOLR is a webservice using Lucene.


On 2/28/10 5:57 AM, Michael McCandless wrote:
To make this more concrete, I think this is roughly what's being
proposed:

   * Merging the dev lists into a single list.

   * Merging committers.

   * When a change is committed to Lucene, it must pass all Solr
     tests.

   * Release both at once.

These things would not change:

   * Most importantly, the source code would remain factored into
     separate dirs/modules.

   * User's lists should remain separate.

   * Web sites would remain separate.

   * Solr & Lucene are still separate downloads, separate JARs,
     separate subdirs in the source tree, etc.

The outside world still sees Solr & Lucene as separate entities.  It's
only that they would now be developed/released in synchrony.

There are some important gains by doing this:

   * Single source for all the code dup we now have across the
     projects (my original reason, specifically on analyzers, for
     starting this).

   * Whenever a new feature is added to Lucene, we'd work through what
     the impact is to Solr.  This can still mean we separately develop
     exposure in Solr, but it'd get us to at least more immediately
     think about it.

   * Solr is Lucene's biggest direct user -- most people who use Lucene
     use it through Solr -- so having it more closely integrated means
     we know sooner if we broke something.

   * I could test right away whether flex breaks anything in Solr.  I
     can't do that now since Solr isn't upgraded to 3.1.

Recent big changes (eg segment based searching, Version, attr based
tokenstream api) caused a lot of work in Solr that could've been much
smoother had Solr "been there" as we were working through them.

Recent new features, eg near-real-time search, which are unavailable
in Solr still, would have at least had some discussion about how to
expose this in Solr.

Over time (and we don't have to do this right on day 1) we can make
core capabilities available to pure Lucene.  EG core Lucene users
should be able to use faceting, use a schema, etc.

I think this idea makes a lot of sense and I think now is a good time
to do it.  Yes, this is a big change, but I think the gains are sizable.
As Lucene & Solr diverge more, it'll only become harder and harder to
merge.

Robert's massive patch on SOLR-1657, upgrading most of Solr's analyzers
to 3.0, is aging... while other changes to analyzers are being
proposed (SOLR-1799).  If we were integrated (or at least single
source for analyzers), Robert would already have committed it.

Mike

On Fri, Feb 26, 2010 at 5:20 PM, Yonik Seeley
<[email protected]>  wrote:
On Fri, Feb 26, 2010 at 5:15 PM, Steven A Rowe<[email protected]>  wrote:
On 02/24/2010 at 2:20 PM, Yonik Seeley wrote:
I've started to think that a merge of Solr and Lucene would be in the
best interest of both projects.
The Sorlucene :) merger could be achieved virtually, i.e. via policy, rather 
than physically merging:
Everything is virtual here anyway :-)
I agree with Mike that a single dev list is highly desirable.  There
would still be separate downloads.  What to do with some of the other
stuff is unspecified.

Committers would need to be merged though - that's the only way to
make a change across projects w/o breaking stuff.

-Yonik

