Data driven mode is different from managed schema. It is unfortunate that in our example configurations we implemented them together.
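For anyone who wants to see how the two are wired together, here is a trimmed sketch of the schemaless plumbing in the stock data_driven_schema_configs solrconfig.xml (the chain and class names come from the shipped config, but the exact processor list and type mappings vary by Solr version):

```xml
<!-- Guesses field types and adds unknown fields to the managed schema -->
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
  <processor class="solr.ParseDateFieldUpdateProcessorFactory"/>
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <str name="defaultFieldType">strings</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Long</str>
      <str name="fieldType">tlongs</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<!-- Routes all /update requests through the chain above -->
<initParams path="/update/**">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>
```

Removing (or not referencing) this chain disables the data-driven behavior while leaving the managed schema machinery in place.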
Managed schema is about using APIs to read and write schema changes. Not
requiring people to hand-edit schema.xml is a good thing, IMO. Data-driven
schema uses the managed schema infrastructure internally and adds update
request processors that create or modify the schema depending on what data
you throw at Solr. It is a nice mode for playing around with Solr, but I
would only use it for PoCs.

I hope that clarifies things.

On Fri, Mar 11, 2016 at 10:36 PM, Nick Vasilyev <nick.vasily...@gmail.com> wrote:
> Got it.
>
> Thank you for clarifying this; I was under the impression that I would
> only be able to make changes via the API. I will look into this some more.
>
> On Fri, Mar 11, 2016 at 11:51 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 3/11/2016 9:28 AM, Nick Vasilyev wrote:
>> > Maybe I am missing something; if that is the case, what is the
>> > difference between data_driven_schema_configs and basic_configs? I
>> > thought that the only difference was that data_driven_schema_configs
>> > comes with the managed schema and basic_configs comes with the
>> > regular one.
>> >
>> > Also, I haven't really dived into schemaless mode so far. I know
>> > Elasticsearch uses it and it has been kind of a turn-off for me. Can
>> > you provide some guidance around best practices on how to use it?
>>
>> Schemaless mode is implemented with an update processor chain. If you
>> look in the data_driven_schema_configs solrconfig.xml file, you will
>> find an updateRequestProcessorChain named
>> "add-unknown-fields-to-the-schema". This update chain is then enabled
>> with an initParams config.
>>
>> I personally would not recommend using it. It would be fine to use
>> during prototyping, but I would definitely turn it off for production.
>>
>> > For example, right now I have all of my configuration files in
>> > version control. If I need to make a change, I upload a new schema to
>> > version control, then the server pulls it down, uploads it to
>> > ZooKeeper, and reloads the collections. This is almost fully
>> > automated, and since all configuration is in a single file it is easy
>> > to review and track previous changes. I like this process and it
>> > works well; if I have to start using managed schemas, I would like
>> > some feedback on how to implement it with minimal disruption.
>>
>> There's no reason you can't continue to use this method, even with the
>> managed schema. Editing the managed-schema file is discouraged if you
>> actually intend to use the Schema API, but there's nothing in place to
>> prevent you from doing it that way.
>>
>> > If I am sending all schema changes via the API, I would still need
>> > some file with the schema configuration; it would just be in a
>> > different format. I would then need some code to read it and send
>> > specific items to Solr, right? When I need to make a change, do I
>> > then have to make that change individually and include it as part of
>> > the config file? Or should I be able to just send the entire schema
>> > in again?
>>
>> Using the Schema API changes the managed-schema file in place. You
>> wouldn't need to upload anything to ZooKeeper; the change would already
>> be there -- but you'd have to take an extra step (retrieving it from
>> ZooKeeper) to make sure it's in version control.
>>
>> My recommendation is to just keep using version control as you have
>> been, which you can do with either the classic or the managed schema.
>> The filename for the schema would change with the managed version, but
>> nothing else.
>>
>> Thanks,
>> Shawn
>>

--
Regards,
Shalin Shekhar Mangar.
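To make the Schema API discussion above concrete: a schema change is just an HTTP POST to the collection's /schema endpoint. A minimal add-field payload looks like this (the field name "title" and type "text_general" are illustrative, not from the thread):

```json
{
  "add-field": {
    "name": "title",
    "type": "text_general",
    "stored": true
  }
}
```

Sent, for example, with curl against a hypothetical collection named mycollection: `curl -X POST -H 'Content-Type: application/json' --data-binary @add-field.json http://localhost:8983/solr/mycollection/schema`. As Shawn notes, this edits managed-schema in place in ZooKeeper, so to keep version control in sync you would pull the config back down afterwards (e.g. with zkcli.sh's downconfig command) and commit the updated file.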