Hi Shawn,
Yes, I have managed schema enabled like so:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">cp-schema.xml</str>
</schemaFactory>
I enabled it so that I can customize the schema dynamically, adding
fields based on what's in the DB.
I didn't know about the field "guessing" part. Now that I do, I see this
in my solrconfig.xml file:
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
default="${update.autoCreateFields:true}"
processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.DistributedUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
If I remove this block, what will happen?
I guess a better question, to meet my need, is this: how do I tell Solr, in
schema-less mode, to use *my* defined field-type whenever it needs to
create a new field?
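My guess is that the answer lies in the add-schema-fields processor named
in that chain. Below is a sketch of what I think I would change, assuming
my solrconfig.xml defines that processor the way the stock _default
configset does. Only the String mapping is shown, "my_text" is a made-up
name standing in for a field type I would define in cp-schema.xml, and I
have not tested any of this:

```xml
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
  <lst name="typeMapping">
    <str name="valueClass">java.lang.String</str>
    <!-- hypothetical: "my_text" stands in for the stock text_general -->
    <str name="fieldType">my_text</str>
    <!-- stock copyField rule; likely where Company_str with maxChars=256 comes from -->
    <lst name="copyField">
      <str name="dest">*_str</str>
      <int name="maxChars">256</int>
    </lst>
    <!-- use this mapping when no other typeMapping matches the value -->
    <bool name="default">true</bool>
  </lst>
</updateProcessor>
```

If that is the right place, then dropping the copyField entry would
presumably also stop the extra *_str field from being created.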
I'm on Solr 8.6.1 and the link at
https://lucene.apache.org/solr/guide/8_6/schema-factory-definition-in-solrconfig.html#schema-factory-definition-in-solrconfig
doesn't offer much help.
Thanks
Steven
On Mon, Feb 15, 2021 at 11:09 AM Shawn Heisey wrote:
> On 2/15/2021 6:52 AM, Steven White wrote:
> > It looks to me that SolrInputDocument.addField() is either misnamed or
> > isn't well implemented.
> >
> > When it is called on a field that doesn't exist in the schema, it will
> > create that field and give it a type based on the data. Not only that,
> > it will set default values. For example, this call
> >
> > SolrInputDocument doc = new SolrInputDocument();
> > doc.addField("Company", "ACM company");
> >
> > Will create the following:
> >
> > <field name="Company" type="text_general"/>
> > <copyField source="Company" dest="Company_str" maxChars="256"/>
>
> That SolrJ code does not make those changes to your schema. At least
> not in the way you're thinking.
>
> It sounds to me like your solrconfig.xml includes what we call
> "schemaless mode" -- an update processor that adds unknown fields when
> they are indexed. You should disable it. We strongly recommend never
> using it in production, because it can make the wrong guess about which
> fieldType is required. The fieldType chosen has very little to do with
> the SolrJ code. It is controlled by what's in solrconfig.xml.
>
> Thanks,
> Shawn
>