Data driven mode is different from managed schema. It is unfortunate that in our example configurations we implemented them together.
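For anyone who wants to see how the two are wired together, here is a trimmed sketch of the schemaless plumbing in the stock data_driven_schema_configs solrconfig.xml (the chain and class names come from the shipped config, but the exact processor list and type mappings vary by Solr version):

```xml
<!-- Guesses field types and adds unknown fields to the managed schema -->
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
  <processor class="solr.ParseDateFieldUpdateProcessorFactory"/>
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <str name="defaultFieldType">strings</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Long</str>
      <str name="fieldType">tlongs</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<!-- Routes all /update requests through the chain above -->
<initParams path="/update/**">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>
```

Removing (or not referencing) this chain disables the data-driven behavior while leaving the managed schema machinery in place.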
Managed schema is about using APIs to read and write schema changes. Not
requiring people to hand-edit schema.xml is a good thing, IMO. Data-driven
schema uses the managed schema infrastructure internally and adds update
request processors that create or modify the schema depending on what data
you throw at Solr. It is a nice mode for playing around with Solr, but I
would only use it for PoCs.

I hope that clarifies things.

On Fri, Mar 11, 2016 at 10:36 PM, Nick Vasilyev <nick.vasily...@gmail.com> wrote:
> Got it.
>
> Thank you for clarifying this; I was under the impression that I would
> only be able to make changes via the API. I will look into this some more.
>
> On Fri, Mar 11, 2016 at 11:51 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 3/11/2016 9:28 AM, Nick Vasilyev wrote:
>> > Maybe I am missing something; if that is the case, what is the
>> > difference between data_driven_schema_configs and basic_configs? I
>> > thought that the only difference was that data_driven_schema_configs
>> > comes with the managed schema and basic_configs comes with the
>> > regular one.
>> >
>> > Also, I haven't really dived into schemaless mode so far. I know
>> > Elasticsearch uses it and it has been kind of a turn-off for me. Can
>> > you provide some guidance around best practices on how to use it?
>>
>> Schemaless mode is implemented with an update processor chain. If you
>> look in the data_driven_schema_configs solrconfig.xml file, you will
>> find an updateRequestProcessorChain named
>> "add-unknown-fields-to-the-schema". This update chain is then enabled
>> with an initParams config.
>>
>> I personally would not recommend using it. It would be fine to use
>> during prototyping, but I would definitely turn it off for production.
>>
>> > For example, right now I have all of my configuration files in
>> > version control. If I need to make a change, I upload a new schema to
>> > version control, then the server pulls it down, uploads it to
>> > ZooKeeper, and reloads the collections. This is almost fully
>> > automated, and since all configuration is in a single file it is easy
>> > to review and track previous changes. I like this process and it
>> > works well; if I have to start using managed schemas, I would like
>> > some feedback on how to implement it with minimal disruption.
>>
>> There's no reason you can't continue to use this method, even with the
>> managed schema. Editing the managed-schema file is discouraged if you
>> actually intend to use the Schema API, but there's nothing in place to
>> prevent you from doing it that way.
>>
>> > If I am sending all schema changes via the API, I would still need
>> > some file with the schema configuration; it would just be in a
>> > different format. I would then need some code to read it and send
>> > specific items to Solr, right? When I need to make a change, do I
>> > then have to make that change individually and include it as part of
>> > the config file? Or should I be able to just send the entire schema
>> > in again?
>>
>> Using the Schema API changes the managed-schema file in place. You
>> wouldn't need to upload anything to ZooKeeper; the change would already
>> be there -- but you'd have to take an extra step (retrieving it from
>> ZooKeeper) to make sure it's in version control.
>>
>> My recommendation is to just keep using version control as you have
>> been, which you can do with either the classic or the managed schema.
>> The filename for the schema would change with the managed version, but
>> nothing else.
>>
>> Thanks,
>> Shawn
>>

--
Regards,
Shalin Shekhar Mangar.
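To make the Schema API discussion above concrete: a schema change is just an HTTP POST to the collection's /schema endpoint. A minimal add-field payload looks like this (the field name "title" and type "text_general" are illustrative, not from the thread):

```json
{
  "add-field": {
    "name": "title",
    "type": "text_general",
    "stored": true
  }
}
```

Sent, for example, with curl against a hypothetical collection named mycollection: `curl -X POST -H 'Content-Type: application/json' --data-binary @add-field.json http://localhost:8983/solr/mycollection/schema`. As Shawn notes, this edits managed-schema in place in ZooKeeper, so to keep version control in sync you would pull the config back down afterwards (e.g. with zkcli.sh's downconfig command) and commit the updated file.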