On 12/3/2015 8:09 AM, Kelly, Frank wrote:
> Just wondering if folks have any suggestions on using Schema.xml vs. Managed 
> Schema going forward.
> 
> Our deployment will be
>> 3 Zk, 3 Shards, 3 replicas
>> Copies of each collection in 5 AWS regions (EBS-backed EC2 instances)
>> Planning at least 1 Billion objects indexed (currently < 100 million)
> 
> I'm sure our schema.xml will have changes and fixes and just wondering which 
> approach (schema.xml vs. managed)
> will be easier to deploy / maintain?

In production, you probably want a schema that cannot change.  The
managed schema that you find in the data-driven configuration will
automatically add new fields to the schema if unknown fields are
encountered in your data ... which means that if somehow a typo makes it
through your indexing process, you may not know about the problem until
later.

With a static schema, an indexing request that has an error in a field
name will be rejected and you will receive an error, which is how I
would want Solr to behave.
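
As a rough illustration of the same idea on the client side, a check
like the Python sketch below would catch a misspelled field name before
the document is even sent to Solr.  The field names and function names
here are made up for the example, and it ignores dynamicField patterns,
so treat it as a sketch under those assumptions and not as a
replacement for Solr's own validation.

```python
# Hypothetical field names; in practice these would mirror the <field>
# entries in your own schema.xml.
KNOWN_FIELDS = {"id", "title", "body", "created_at"}


def unknown_fields(doc, known_fields=KNOWN_FIELDS):
    """Return the sorted field names in doc that are not in known_fields."""
    return sorted(name for name in doc if name not in known_fields)


def check_document(doc, known_fields=KNOWN_FIELDS):
    """Raise ValueError if doc uses a field name that is not in known_fields."""
    bad = unknown_fields(doc, known_fields)
    if bad:
        raise ValueError("unknown field(s): " + ", ".join(bad))
    return doc
```

With this, a document containing "titel" instead of "title" raises an
error right away instead of quietly creating a new field.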

The data-driven schema is good for prototyping, but because the field
definitions that get added are just a guess by Solr, I would manually
edit the schema before going into production.  Once in production I
would want to be in complete manual control of the schema.

Thanks,
Shawn
