On Jun 17, 2010, at 7:44 PM, Mark Diggory wrote:
When I saw what was done with the "templating" of the Maven POM
work that was originally donated to Solr, I just cringed at it.
Most of us Solr committers are fairly anti-Maven or ambivalent about
it at best, so it hasn't gotten much TLC, admittedly. But rather than
just cringe, help us fix it if it is still broken. I know Ryan, and
sometimes Grant, care about the POM stuff being right, so I'm sure you
can count on some committer eyes on whatever you have to contribute.
Ideally, source directories should have a 1 to 1 relationship to
artifacts that are produced.
You get complete agreement from me on that. Directory structure is
very important. I haven't spent much, if any, time on Solr's build
myself, alas, and I hear some complaints about it. It certainly could
use a bit more attention.
This SOLR-236 is a poster child of an unclear practice or convention
for how to package customizations to Solr. Really, isn't SOLR-236
"wanted" enough to warrant that it actually reside in svn where
it could be developed properly, rather than as a task that's been
open for "how many years"?! I'd highly recommend the Field Collapsing
prototype cease to be managed as patches in a JIRA task and
actually get some revision control behind it and interim release
builds made available.
This is where the new craze of git comes in, I think. These types of
big feature additions to Solr or any Apache project can be developed
in a personal branch, maintained there, version controlled, etc. And
then patched and committed to Solr when ready.
Because of Apache's tighter control on committers, it's not really
feasible to have Apache svn branches for these sorts of things where
non-committers are collaborating.
I cringe at working with patches in JIRA myself - it's difficult and
clunky, for me at least.
I'll even confess that my "patch kludge" in my Maven project to
apply SOLR-236 to the Solr source is not at all a best practice in
terms of supporting add-ons to Solr. It was simply an attempt to
compensate.
Pragmatism at its finest. +1
Ideally, Field Collapsing should have been a separately maintained
codebase in a separate Maven project that did not interfere with the
Solr core, request handler, or configuration implementations and
simply depended on them. Then it could be dropped into a lib
dir of any Solr 1.4.0 install (and, conversely, added to my webapp
POMs as a Maven dependency when they are assembled in our own build
processes).
I don't know the details of SOLR-236 myself, but I believe it includes
necessary core changes too, so it can't simply be a drop-in lib.
Ideally, a Maven Archetype could be created that would allow one to
rapidly produce a Solr webapp and fire it up in Jetty in mere
seconds.
How's that any different than cd example; java -jar start.jar? Or
do you mean a Solr client webapp?
mvn package jetty:run
Oh, and Solr's build has this too:
ant run-example
with optional -D switches to set example.solr.home,
example.data.dir, example.jetty.port, and some others, like running
the JVM with debugging enabled.
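For example, something along these lines should work (the paths here
are just placeholders):

ant run-example -Dexample.jetty.port=8984 -Dexample.solr.home=/path/to/your/solr/home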
Finally, with projects such as Bobo, integration with Spring would
make configuration more consistent and require significantly less
Java coding just to add new capabilities every time someone authors
a new RequestHandler.
It's one line of config to add a new request handler. How many
ridiculously ugly confusing lines of Spring XML would it take?
But if I have my own configuration for that Request Handler, how
many lines of Java do I need to add/alter to get that configuration
to parse in Solr config and be available? Even if it's just a few,
it's still, IMO, the wrong way to cut the cake.
Zero lines of additional Java code. Make your configuration available
as a separate file pointed to by the args available in Solr's config,
or however you want to wire your own configuration in. Maybe I'm
misunderstanding exactly what you want, but request handlers have init
params.
Though, to be technical, most extensions these days are going to be
search components, not request handlers - but the same discussion
applies.
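For instance, here's a custom search component sketched against the
1.4-era plugin API (the class and param names are hypothetical); it
picks up its init params without any extra config-parsing code:

  // Registered in solrconfig.xml with init params, roughly:
  //   <searchComponent name="example" class="com.example.ExampleComponent">
  //     <str name="configFile">example-settings.xml</str>
  //   </searchComponent>
  package com.example;

  import java.io.IOException;

  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.handler.component.SearchComponent;

  public class ExampleComponent extends SearchComponent {
    private String configFile;

    @Override
    public void init(NamedList args) {
      // Init params from solrconfig.xml arrive here as a NamedList.
      configFile = (String) args.get("configFile");
    }

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
      // Add whatever this component computes to the response.
      rb.rsp.add("example", "configured from " + configFile);
    }

    @Override
    public String getDescription() { return "example component"; }

    @Override
    public String getSourceId() { return "example"; }

    @Override
    public String getSource() { return "example"; }

    @Override
    public String getVersion() { return "1.0"; }
  }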
I personally find it silly that we customize SolrJ for all these
request handlers anyway. You get a decent, navigable data structure
back from general SolrJ query requests as it is; there's no need to
build in all these convenience methods specific to all the Solr
componentry. Sure, it's "convenient", but it's a maintenance
headache and, as you say, not generic.
It's an example of what I'd coin a "policing bottleneck", where the
core code introduces a pattern for convenience that restricts the
ability to add features to the application without "approval", i.e.
consensus that the code contribution be part of the central API.
Thus, as long as the patch alters core code, you can't make the
whole solution easily available to your users without complete
approval.
Well, there's nothing that restricts the ability to add new features,
is there? Make a request to Solr via SolrJ generically, and navigate
the response structure. Seems like folks get hung up on expecting
there to always be a convenient getSomeSpecialResponseData() method
for every request handler / component out there, and all that does is
clutter up SolrJ, IMO.
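For example, with plain SolrJ (the "example" key here is just a
stand-in for whatever section a custom component adds to the
response):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.util.NamedList;

  public class GenericResponseExample {
    public static void main(String[] args) throws Exception {
      SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      SolrQuery query = new SolrQuery("*:*");
      QueryResponse rsp = server.query(query);

      // No component-specific getter needed: walk the raw response structure.
      NamedList<Object> response = rsp.getResponse();
      Object customSection = response.get("example");
      System.out.println("custom section: " + customSection);
    }
  }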
You're welcome to your opinions; I have some pretty strong ones in
favor of Spring as well. Hacking a configuration file is one thing,
but having to alter code to support new configuration properties
creates a bit of a bottleneck in getting changes into the codebase.
Can you lay out an example where you've had to alter code to support
new configuration properties? And perhaps illustrate how Spring would
make it better?
1.) Having simple configuration is not really of benefit if it takes
coding for your users to customize it, which they will, as is seen
in many of the patches provided to Solr. Ultimately, this burdens
your own efforts to maintain the application and delays the
processing of tasks that have good functional add-ons in them.
I'm not following this point. What's an example where custom code was
needed?
2.) Adopting Spring (or any other IoC/DI framework) forces your
application's code instantiation and binding to be evaluated and
refactored; this frees up how your application is assembled, allows
your application's functional areas to be more cleanly separated, and
hardens your application's interface contracts. Doing so, in turn,
allows your application to be assembled and reused inside other
frameworks and environments, increasing your target user base and
participation, ultimately making your project more active and more
successful.
Don't get me wrong... I'm pro IoC/DI. Solr does need more of that in
order to have its componentry better factored for unit testing,
extensibility, and pluggability. +1
So, I think the goal of any development activity around adopting a
DI framework would be to free up the application from being
hard-bound to the configuration file, so those of us who choose to
use other configuration tools can do so. I even suspect that it
might already be possible if the instantiation and binding of the
objects that form a Solr application are sufficiently separate from
those configuration classes in the first place. A good area to
explore.
No doubt. +1 again.
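To sketch the kind of separation you're describing (the class names
below are invented purely for illustration, not Solr's actual API),
a component would take its collaborators through its constructor and
leave the wiring to whatever container assembles it:

  // Invented types, just to illustrate constructor injection.
  interface CollapseStrategy {
    String describe();
  }

  class AdjacentCollapseStrategy implements CollapseStrategy {
    public String describe() { return "adjacent"; }
  }

  class CollapsingComponent {
    private final CollapseStrategy strategy;

    // The component never parses solrconfig.xml (or Spring XML) itself.
    // Whatever assembles the application (Solr's plugin loader, a Spring
    // context, or a unit test) constructs the collaborators and passes
    // them in.
    CollapsingComponent(CollapseStrategy strategy) {
      this.strategy = strategy;
    }

    String describe() { return "collapsing via " + strategy.describe(); }
  }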
Oh, and Hi Mark! :)
Your Solr Blackbelt at code4lib was excellent, as were our
conversations afterward. :-)
Why thank you!
Erik