Michael McCandless wrote:
I think, in order to stop duplicating our analysis code across Nutch/Solr/Lucene, we should separate out the analyzers into a standalone package, and maybe as its own sub-project under the Lucene tlp?
Is the goal to release these on a separate schedule from Lucene Java? If so, then this makes sense. If not, then perhaps this could simply be a separate source tree in Lucene Java, built as separate jars.
Where would the analyzer APIs live, in the core or in the analyzer tree? My guess is that they'd live in the core, and that the analyzer tree would depend on the core, but one might do it the other way around if one felt there were non-Lucene uses for analyzers.
Note that subprojects with different committer lists are an anti-pattern at Apache. We've long done this in Lucene, but have recently been asked by the board to consider breaking most subprojects into their own TLPs. Would analyzers someday make sense as an independent TLP? If not, then a subproject with disjoint committers might not be the right pattern.
Doug
