On Jun 17, 2010, at 7:44 PM, Mark Diggory wrote:
When I saw what was done with the "templating" of the Maven POM
work that was originally donated to Solr, I just cringed at it.
Most of us Solr committers are fairly anti-Maven or ambivalent about
it at best, so it hasn't gotten much TLC, admittedly. But rather than
just cringe, help us fix it if it is still broken. I know Ryan, and
sometimes Grant, care about the POM stuff being right, so I'm sure you
can count on some committer eyes on whatever you have to contribute.
Ideally, source directories should have a 1 to 1 relationship to
artifacts that are produced.
You get complete agreement from me on that. Directory structure is
very important. I haven't spent much, if any, time on Solr's build
myself, alas, and I hear some complaints about it. It certainly could
use a bit more attention.
This SOLR-236 is a poster child of an unclear practice or convention
for how to package customizations to Solr. Really, isn't SOLR-236
"wanted" enough to warrant that it actually reside in svn where
it could be developed properly, rather than as a task that's been
open for "how many years"?! I'd highly recommend the Field Collapsing
prototype cease to be managed as patches in a JIRA task and
actually get some revision control behind it and interim release
builds made available.
This is where the new craze of git comes in, I think. These types of
big feature additions to Solr or any Apache project can be developed
in a personal branch, maintained there, version controlled, etc. And
then patched and committed to Solr when ready.
Because of Apache's tighter control on committers, it's not really
feasible to have Apache svn branches for these sorts of things where
non-committers are collaborating.
I cringe at working with patches in JIRA myself - it's difficult and
clunky, for me at least.
I'll even confess that my "patch kludge" in my Maven project to
apply SOLR-236 to the Solr source is not at all a best practice in
terms of supporting add-ons to Solr. It was simply an attempt to
compensate.
Pragmatism at its finest. +1
Ideally, Field Collapsing should have been a separately maintained
codebase in a separate Maven project that did not interfere with the
Solr core, request handler, or configuration implementations and
simply depended on them. Then it could be dropped into a lib
dir of any Solr 1.4.0 install (and, conversely, added to my webapp
POMs as a Maven dependency when they are assembled in our own build
processes).
I don't know the details of SOLR-236 myself, but I believe it includes
necessary core changes too, so it can't simply be a drop-in lib.
Ideally, a Maven Archetype could be created that would allow one to
rapidly produce a Solr webapp and fire it up in Jetty in mere
seconds.
How's that any different than cd example; java -jar start.jar? Or
do you mean a Solr client webapp?
mvn package jetty:run
Oh, and Solr's build has this too:
ant run-example
with optional -D switches to set example.solr.home,
example.data.dir, example.jetty.port, and some others, like running
the JVM with debugging enabled.
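For example, something along these lines should work (the paths here
are just placeholders):

ant run-example -Dexample.jetty.port=8984 -Dexample.solr.home=/path/to/your/solr/home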
Finally, with projects such as Bobo, integration with Spring would
make configuration more consistent and require significantly less
Java coding just to add new capabilities every time someone authors
a new RequestHandler.
It's one line of config to add a new request handler. How many
ridiculously ugly confusing lines of Spring XML would it take?
But if I have my own configuration for that Request Handler, how
many lines of Java do I need to add/alter to get that configuration
to parse in Solr config and be available? Even if it's just a few,
it's still, IMO, the wrong way to cut the cake.
Zero lines of additional Java code. Make your configuration available
as a separate file pointed to by the args available in Solr's config,
or however you want to wire your own configuration in. Maybe I'm
misunderstanding exactly what you want, but request handlers have init
params.
Though, to be technical, most extensions these days are going to be
search components, not request handlers - but the same discussion
applies.
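For instance, here's a custom search component sketched against the
1.4-era plugin API (the class and param names are hypothetical); it
picks up its init params without any extra config-parsing code:

  // Registered in solrconfig.xml with init params, roughly:
  //   <searchComponent name="example" class="com.example.ExampleComponent">
  //     <str name="configFile">example-settings.xml</str>
  //   </searchComponent>
  package com.example;

  import java.io.IOException;

  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.handler.component.SearchComponent;

  public class ExampleComponent extends SearchComponent {
    private String configFile;

    @Override
    public void init(NamedList args) {
      // Init params from solrconfig.xml arrive here as a NamedList.
      configFile = (String) args.get("configFile");
    }

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
      // Add whatever this component computes to the response.
      rb.rsp.add("example", "configured from " + configFile);
    }

    @Override
    public String getDescription() { return "example component"; }

    @Override
    public String getSourceId() { return "example"; }

    @Override
    public String getSource() { return "example"; }

    @Override
    public String getVersion() { return "1.0"; }
  }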
I personally find it silly that we customize SolrJ for all these
request handlers anyway. You get a decent, navigable data structure
back from general SolrJ query requests as it is; there's no need to
build in all these convenience methods specific to all the Solr
componentry. Sure, it's "convenient", but it's a maintenance
headache and, as you say, not generic.
It's an example of what I'd coin a "policing bottleneck", where the
core code introduces a pattern for convenience that restricts the
ability to add features to the application without "approval", i.e.
consensus that the code contribution be part of the central API.
Thus, as long as the patch alters core code, you can't make the
whole solution easily available to your users without complete
approval.
Well, there's nothing that restricts the ability to add new features,
is there? Make a request to Solr via SolrJ generically, and navigate
the response structure. Seems like folks get hung up on expecting
there to always be a convenient getSomeSpecialResponseData() method
for every request handler / component out there, and all that does is
clutter up SolrJ, IMO.
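For example, with plain SolrJ (the "example" key here is just a
stand-in for whatever section a custom component adds to the
response):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.util.NamedList;

  public class GenericResponseExample {
    public static void main(String[] args) throws Exception {
      SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      SolrQuery query = new SolrQuery("*:*");
      QueryResponse rsp = server.query(query);

      // No component-specific getter needed: walk the raw response structure.
      NamedList<Object> response = rsp.getResponse();
      Object customSection = response.get("example");
      System.out.println("custom section: " + customSection);
    }
  }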
You're welcome to your opinions; I have some pretty strong ones in
favor of Spring as well. Hacking a configuration file is one thing,
but having to alter code to support new configuration properties
creates a bit of a bottleneck in getting changes into the codebase.
Can you lay out an example where you've had to alter code to support
new configuration properties? And perhaps illustrate how Spring would
make it better?
1.) Having simple configuration is not really of benefit if it takes
coding for your users to customize it, which they will, as is seen
in many of the patches provided to Solr. Ultimately, this burdens
your own efforts to maintain the application and delays the
processing of tasks that have good functional add-ons in them.
I'm not following this point. What's an example where custom code was
needed?
2.) Adopting Spring (or any other IoC/DI framework) forces your
application's code instantiation and binding to be evaluated and
refactored; this frees up how your application is assembled, allows
your application's functional areas to be more cleanly separated, and
hardens your application's interface contracts. Doing so, in turn,
allows your application to be assembled and reused inside other
frameworks and environments, increasing your target user base and
participation, ultimately making your project more active and more
successful.
Don't get me wrong... I'm pro IoC/DI. Solr does need more of that in
order to have its componentry better factored for unit testing,
extensibility, and pluggability. +1
So, I think the goal of any development activity around adopting a
DI framework would be to free up the application from being
hard-bound to the configuration file, so those of us who choose to
use other configuration tools can do so. I even suspect that it
might already be possible if the instantiation and binding of the
objects that form a Solr application are sufficiently separate from
those configuration classes in the first place. A good area to
explore.
No doubt. +1 again.
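To sketch the kind of separation you're describing (the class names
below are invented purely for illustration, not Solr's actual API),
a component would take its collaborators through its constructor and
leave the wiring to whatever container assembles it:

  // Invented types, just to illustrate constructor injection.
  interface CollapseStrategy {
    String describe();
  }

  class AdjacentCollapseStrategy implements CollapseStrategy {
    public String describe() { return "adjacent"; }
  }

  class CollapsingComponent {
    private final CollapseStrategy strategy;

    // The component never parses solrconfig.xml (or Spring XML) itself.
    // Whatever assembles the application (Solr's plugin loader, a Spring
    // context, or a unit test) constructs the collaborators and passes
    // them in.
    CollapsingComponent(CollapseStrategy strategy) {
      this.strategy = strategy;
    }

    String describe() { return "collapsing via " + strategy.describe(); }
  }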
Oh, and Hi Mark! :)
Your Solr Blackbelt at code4lib was excellent, as were our
conversations afterward. :-)
Why thank you!
Erik