Lucene?

Grant Ingersoll Wed, 24 Feb 2010 13:05:28 -0800

On Feb 24, 2010, at 2:09 PM, Doug Cutting wrote:

> Michael McCandless wrote:
>> I think, in order to stop duplicating our analysis code across
>> Nutch/Solr/Lucene, we should separate out the analyzers into a
>> standalone package, and maybe as its own sub-project under the Lucene
>> tlp?
> 
> Is the goal to release these on a separate schedule from Lucene Java? If so, 
> then this makes sense, if not, then perhaps this could be simply a separate 
> source code tree in Lucene Java built as separate jars.
> 
> Where would the analyzer APIs live, in the core or in the analyzer tree?  My 
> guess is that they'd live in the core, and that the analyzer tree would 
> depend on the core, but one might do it the other way around if one felt 
> there were non-Lucene uses for analyzers.
> 
> Note that subprojects with different committer lists are an anti-pattern at 
> Apache.  We've long done this in Lucene, but have recently been asked by the 
> board to consider breaking most subprojects into their own TLPs.


Yeah, I've seen rumblings of this, but not sure why it is a big deal here.  
Many of Lucene's projects are related and interoperate with some committer 
overlap, but not all.  For instance, Lucene.NET and PyLucene don't have a lot 
of committer overlap, but it would be silly for them to be TLPs.  To me, 
Lucene has spun off subprojects when it makes sense, e.g. Hadoop and 
potentially Mahout in the near future, but otherwise, "if it ain't broke, don't 
fix it". 

> Would analyzers someday make sense as an independent TLP?  If not, then a 
> subproject with disjoint committers might not be the right pattern.
> 

In my mind, I think all current committers for Lucene/Nutch/Solr would be 
committers on this new project.

-Grant

Re: Factor out a standalone, shared analysis package for Nutch/Solr/Lucene?
