Hey Mike,

> This looks great!

Thanks!

> But, the goal is to make a standalone toolkit exposing GIS functions,
> right?

Yep, you got it!

> My original question (integrating this into Lucene/Solr) remains.

Sure, I think the goal would be to provide only the Spatial aspects
required by Search (e.g., filters for documents, field types, etc.) as
small classes in Lucene/Solr-land, and do the heavy lifting in the SIS
project.

> EG there's a lot of good work happening now in Solr to make spatial
> search available. How will that find its way back to Lucene? Lucene
> has its own (now duplicate) spatial package that was already
> developed. Users will now be confused about the two, each have
> different bugs/features, etc.

I think as we move towards having an official SIS/spatial project and
start to have releases/libraries, etc., it could partially help, but not
totally alleviate, this issue.

Cheers,
Chris

> On Mon, Mar 1, 2010 at 1:28 PM, Mattmann, Chris A (388J)
> <[email protected]> wrote:
>> I'm glad that you brought that up! :)
>>
>> Check out:
>>
>> http://incubator.apache.org/projects/sis.html
>>
>> We're just starting to tackle that very issue right
>> now...patches/ideas/contributions welcome.
>>
>> Cheers,
>> Chris
>>
>> On 3/1/10 11:25 AM, "Michael McCandless" <[email protected]> wrote:
>>
>> Because the code dup with analyzers is only one of the problems to
>> solve. In fact, it's the easiest of the problems to solve (that's why
>> I proposed it, only, first).
>>
>> A more differentiating example is a much less mature module....
>>
>> EG take spatial -- if Solr were its own TLP, how could spatial be
>> built out in a way that we don't waste effort, and so that both direct
>> Lucene and Solr users could use it when it's released?
>>
>> Mike
>>
>> On Mon, Mar 1, 2010 at 1:07 PM, Mattmann, Chris A (388J)
>> <[email protected]> wrote:
>>> Hi Mike,
>>>
>>> I'm not sure I follow this line of thinking: how would Solr being a TLP
>>> affect the creation of a separate project/module for Analyzers any more so
>>> than it not being a TLP? Both Lucene-java and Solr (as a TLP) could depend
>>> on the newly created refactored Analysis project.
>>>
>>> Chris
>>>
>>> On 3/1/10 10:44 AM, "Michael McCandless" <[email protected]> wrote:
>>>
>>> If we don't somehow first address the code duplication across the 2
>>> projects, making Solr a TLP will make things worse.
>>>
>>> I started here with analysis because I think that's the biggest pain
>>> point: it seemed like an obvious first step to fixing the code
>>> duplication and thus the most likely to reach some consensus. And
>>> it's also very timely: Robert is right now making all kinds of great
>>> fixes to our collective analyzers (in between bouts of fuzzy DFA
>>> debugging).
>>>
>>> But it goes beyond analyzers: I'd like to see other modules, now in
>>> Solr, eventually moved to Lucene, because they really are "core"
>>> functionality (eg facets, function (and other?) queries, spatial,
>>> maybe improvements to spellchecker/highlighter). How can we do this?
>>>
>>> And how can we do this so that it "lasts" over time? If new cool
>>> "core" things are born in Solr-land (which of course happens a lot --
>>> lots of good healthy usage), how will they find their way back to
>>> Lucene?
>>>
>>> Yonik's proposal (merging development of Solr/Lucene, but keeping all
>>> else separate) would achieve this.
>>>
>>> If we do the opposite (Solr -> TLP), how could we possibly achieve
>>> this?
>>>
>>> I guess one possibility is to just suck it up and duplicate the code.
>>> Meaning, each project will have to manually merge fixes in from the
>>> other project (so long as there's someone around with the itch to do
>>> so).
>>> Lucene would copy in all of Solr's analysis, and vice-versa (and
>>> likewise other dup'd functionality). I really dislike this
>>> solution... it will confuse the daylights out of users, it's
>>> error-prone, it's a waste of dev effort, there will always be little
>>> differences... but maybe it is in fact the lesser evil?
>>>
>>> I would much prefer merging Solr/Lucene development...
>>>
>>> Mike
>>>
>>> On Mon, Mar 1, 2010 at 12:01 PM, Mattmann, Chris A (388J)
>>> <[email protected]> wrote:
>>>> Hi Grant,
>>>>
>>>>> On Mar 1, 2010, at 8:20 AM, Mattmann, Chris A (388J) wrote:
>>>>>
>>>>>> Hi Robert,
>>>>>>
>>>>>> I think my proposal (Solr->TLP) is sort of orthogonal to the whole
>>>>>> analyzers issue - I was in favor, at the very least, of having a separate
>>>>>> module/project/whatever that both Solr/Lucene (and whatever project) can
>>>>>> depend on for the shared analyzer code...
>>>>>
>>>>> Not really. They are intimately linked.
>>>>
>>>> Ummm, how so? Making project A called "Apache Super Analyzers" and then
>>>> making Lucene(-java) and Solr depend on Apache Super Analyzers is separate
>>>> from whether or not Lucene(-java) and Solr are TLPs...
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>>>> Cheers,
>>>>>> Chris
>>>>>>
>>>>>> On 3/1/10 9:12 AM, "Robert Muir" <[email protected]> wrote:
>>>>>>
>>>>>> this will make the analyzers duplication problem even worse
>>>>>>
>>>>>> On Mon, Mar 1, 2010 at 11:06 AM, Mattmann, Chris A (388J) <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Mark,
>>>>>>>
>>>>>>> Thanks for your message. I respect your viewpoint, but I respectfully
>>>>>>> disagree. It just seems (to me at least based on the discussion) like a
>>>>>>> TLP for Solr is the way to go.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/1/10 8:54 AM, "Mark Miller" <[email protected]> wrote:
>>>>>>>
>>>>>>> On 03/01/2010 10:40 AM, Mattmann, Chris A (388J) wrote:
>>>>>>>> Hi Mark,
>>>>>>>>
>>>>>>>>> That would really be no real world change from how things work today.
>>>>>>>>> The fact is, today, Solr already operates essentially as an independent
>>>>>>>>> project.
>>>>>>>>
>>>>>>>> Well if that's the case, then it would lead me to think that it's more
>>>>>>>> of a TLP than anything else per best practices.
>>>>>>>
>>>>>>> That depends. It could be argued it should be a top level project or
>>>>>>> that it should be closer to the Lucene project. Some people are arguing
>>>>>>> for both approaches right now. There are two directions we could move
>>>>>>> in.
>>>>>>>>
>>>>>>>>> The only real difference is that it shares the same PMC with Lucene
>>>>>>>>> now and wouldn't with this change. This would address none of the issues
>>>>>>>>> that triggered the idea for a possible merge.
>>>>>>>>
>>>>>>>> I don't agree -- you're looking to bring together two communities that
>>>>>>>> are "fairly separate" as you put it. The separation likely didn't spring
>>>>>>>> up overnight and has been this way for a while (at least to my knowledge).
>>>>>>>> This is exactly the type of situation that typically leads to TLP creation
>>>>>>>> from what I've seen.
>>>>>>>
>>>>>>> It also causes negatives between Solr/Lucene that some are looking to
>>>>>>> address. Hence the birth of this proposal. Going TLP with Solr will only
>>>>>>> aggravate those negatives, not help them.
>>>>>>>
>>>>>>> While the communities operate fairly separately at the moment, the
>>>>>>> people in the communities are not so separate. The committer list has
>>>>>>> huge overlap.
>>>>>>> Many committers on one project but not the other do a lot
>>>>>>> of work on both projects.
>>>>>>>
>>>>>>> There is already a strong link with the personnel - merging the
>>>>>>> management of the projects addresses many of the concerns that have
>>>>>>> prompted this discussion. TLP'ing Solr only makes those concerns
>>>>>>> multiply. They would diverge further, and incompatible overlap between
>>>>>>> them would increase.
>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>> On 03/01/2010 10:04 AM, Mattmann, Chris A (388J) wrote:
>>>>>>>>>
>>>>>>>>>> Hey Grant,
>>>>>>>>>>
>>>>>>>>>> I'd like to explore this: does this imply that the Lucene sub-projects
>>>>>>>>>> will go away and Lucene will turn into Lucene-java and maintain its
>>>>>>>>>> Apache TLP, and then you'd have say, solr.apache.org, tika.apache.org,
>>>>>>>>>> mahout.apache.org (already started), etc. etc.? If so, that may be the
>>>>>>>>>> best of all worlds, allowing project independence, but also not
>>>>>>>>>> following the Apache "antipattern" as Doug put it...
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/1/10 7:28 AM, "Grant Ingersoll"<[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Also, as Doug alluded to, the Board is likely to ask us to consider
>>>>>>>>>>> fewer subprojects in the future, so we may be consolidating and
>>>>>>>>>>> spinning off anyway.
>>>>>>>>>>>
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> Chris Mattmann, Ph.D.
>>>>>>>>>> Senior Computer Scientist
>>>>>>>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>>>>>>>> Office: 171-266B, Mailstop: 171-246
>>>>>>>>>> Email: [email protected]
>>>>>>>>>> Phone: +1 (818) 354-8810
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> Adjunct Assistant Professor, Computer Science Department
>>>>>>>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> - Mark
>>>>>>>>>
>>>>>>>>> http://www.lucidimagination.com
>>>>>>>>>
>>>>>> --
>>>>>> Robert Muir
>>>>>> [email protected]
>>>>>>

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
