Re: Some new SOLR features

Jason Rutherglen Thu, 18 Sep 2008 05:38:29 -0700

Yes, so it's probably best to make the changes through a remote
interface so that the app will be able to make the appropriate
internal changes.  File-based system changes are less than ideal,
agreed; however, I suppose with an open source project such as SOLR the
kitchen sink effect happens and it will find its way in there
anyway.  The hard part is organizing the project such that it does
not get too bloated with everyone's features and allows features to be
pluggable outside of the core releases.  There are many things that
may be best as contrib modules that could be OSGi-based add-ons
rather than placed into the standard releases (though I don't have
any examples offhand).  The standard for contribs for SOLR can be OSGi.
This will greatly assist in SOLR becoming grid computing friendly.  Ideally
SOLR 2.0 would be cleaner, standardized, and have most of its features
pluggable.  This will allow for consistent release cycles and make grid
computing simpler to implement.  SOLR seems like it could be going in
the direction of bloat, which could increasingly confuse new users.
Instead, users could either implement their own modules and upload them
to the contrib section, or implement their own proprietary ones.


I am curious what the recommended place is to put the query
expansion code (such as adding boosting, adding phrase queries and
such).  Is it now best to use a SearchComponent?  Is it possible in
the future to make SearchComponents OSGi enabled?
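
To make the "remote interface plus internal change" idea concrete, here is a
rough sketch of the immutable-schema swap Yonik describes in the quoted text
below.  It is only an illustration under assumptions: SchemaSnapshot and
SchemaHolder are hypothetical names, not existing SOLR classes, and a real
remote call (RMI, HTTP, etc.) would simply end by invoking swap().

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable schema snapshot (not a SOLR class).
// Once built, it never changes, so it needs no internal locking.
final class SchemaSnapshot {
    private final int version;
    private final Map<String, String> fieldTypes;

    SchemaSnapshot(int version, Map<String, String> fieldTypes) {
        this.version = version;
        // Defensive copy so later changes to the caller's map are not visible here.
        this.fieldTypes = Collections.unmodifiableMap(new HashMap<>(fieldTypes));
    }

    int version() { return version; }

    String typeOf(String field) { return fieldTypes.get(field); }
}

// Hypothetical holder: each request reads the reference once and keeps using
// that snapshot, while an admin call can publish a replacement at any time.
final class SchemaHolder {
    private final AtomicReference<SchemaSnapshot> current;

    SchemaHolder(SchemaSnapshot initial) {
        this.current = new AtomicReference<>(initial);
    }

    SchemaSnapshot current() { return current.get(); }

    // Called by whatever remote interface receives the new configuration.
    void swap(SchemaSnapshot next) { current.set(next); }
}
```

A request that grabbed holder.current() before a swap keeps its stable view
until it finishes; only requests that start afterwards see the new schema.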

On Thu, Sep 18, 2008 at 7:56 AM, Mark Miller <[EMAIL PROTECTED]> wrote:
> Dynamic changes are not what I'm against...I'm against dynamic changes that
> are triggered by the app noticing that the config has changed.
>
> Jason Rutherglen wrote:
>>
>> Servlets is one thing.  For SOLR the situation is different.  There
>> are always small changes people want to make, a new stop word, a small
>> tweak to an analyzer.  Rebooting the server for these should not be
>> necessary.  Ideally this is handled via a centralized console and
>> deployed over the network (using RMI or XML) so that files do not need
>> to be deployed.
>>
>> On Thu, Sep 18, 2008 at 7:41 AM, Mark Miller <[EMAIL PROTECTED]>
>> wrote:
>>
>>>
>>> Isn't this done in servlet containers for debugging type work? Maybe an
>>> option, but I disagree that this should drive anything in solr. It should
>>> really be turned off in production in servlet containers imo as well.
>>>
>>> This can really be such a pain in the ass on a live site...someone
>>> touches web.xml and the app server reboots....*shudder*. Seen it, don't
>>> dig it.
>>>
>>> Jason Rutherglen wrote:
>>>
>>>>
>>>> This should be done.  Great idea.
>>>>
>>>> On Wed, Sep 17, 2008 at 3:41 PM, Lance Norskog <[EMAIL PROTECTED]>
>>>> wrote:
>>>>
>>>>
>>>>>
>>>>> My vote is for dynamically scanning a directory of configuration files.
>>>>> When a new one appears, or an existing file is touched, load it. When a
>>>>> configuration disappears, unload it.  This model works very well for
>>>>> servlet containers.
>>>>>
>>>>> Lance
>>>>>
>>>>> -----Original Message-----
>>>>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
>>>>> Seeley
>>>>> Sent: Wednesday, September 17, 2008 11:21 AM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Some new SOLR features
>>>>>
>>>>> On Wed, Sep 17, 2008 at 1:27 PM, Jason Rutherglen
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> If the configuration code is going to be rewritten then I would like
>>>>>> to see the ability to dynamically update the configuration and schema
>>>>>> without needing to reboot the server.
>>>>>>
>>>>>>
>>>>>
>>>>> Exactly.  Actually, multi-core allows you to instantiate a completely
>>>>> new core and swap it for the old one, but it's a bit of a heavyweight
>>>>> approach.
>>>>>
>>>>> The key is finding the right granularity of change.
>>>>> My current thought is that a schema object would not be mutable, but
>>>>> that one could easily swap in a new schema object for an index at any
>>>>> time.  That would allow a single request to see a stable view of the
>>>>> schema, while preventing having to make every aspect of the schema
>>>>> thread-safe.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Also I would like the
>>>>>> configuration classes to just contain data and not have so many
>>>>>> methods that operate on the filesystem.
>>>>>>
>>>>>>
>>>>>
>>>>> That's the plan... completely separate the serialized and in memory
>>>>> representations.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> This way the configuration
>>>>>> object can be serialized, and loaded by the server dynamically.  It
>>>>>> would be great for the schema to work the same way.
>>>>>>
>>>>>>
>>>>>
>>>>> Nothing will stop one from using java serialization for config
>>>>> persistence, however I am a fan of human readable for config files...
>>>>> so much easier to debug and support.  Right now, people can cut-n-paste
>>>>> relevant parts of their config in email for support, or to a wiki to
>>>>> explain things, etc.
>>>>>
>>>>> Of course, if you are talking about being able to have custom filters
>>>>> or analyzers (new classes that don't even exist on the server yet),
>>>>> then it does start to get interesting.  This intersects with deployment
>>>>> in general... and I'm not sure what the right answer is.
>>>>> What if Lucene or Solr needs an upgrade?  It would be nice if that
>>>>> could also automatically be handled in a large cluster... what are the
>>>>> options for handling that?  Is there a role here for OSGi to play?
>>>>> It sounds like at least some of that is outside of the Solr domain.
>>>>>
>>>>> An alternative to serializing everything would be to ship a new schema
>>>>> along with a new jar file containing the custom components.
>>>>>
>>>>> -Yonik
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
