Re: Solr complains about unknown field during atomic indexing

Andreas Hubold Mon, 22 Mar 2021 01:27:49 -0700

Hi,

You could then add the following to take care of any and all unknown fields:
<dynamicField name="*" type="ignored" multiValued="true" />
Or you could name individual fields like that, which I think would be a better option than the wildcard dynamic field.

Just a small addition, in case you're also using nested documents: You should really prefer individual field names instead of the "*" wildcard then. Otherwise you may run into the following bug, which causes nested child documents to disappear: https://issues.apache.org/jira/browse/SOLR-15018


Cheers,
Andreas

Shawn Heisey wrote on 20.03.21 11:52:

On 3/19/2021 3:36 PM, gnandre wrote:
While performing atomic indexing, I run into an error which says 'unknown
field X' where X is not a field specified in the schema. It is a
discontinued field. After deleting that field from the schema, I have
restarted Solr but I have not re-indexed the content back, so the deleted
field data still might be there in Solr index.

The way I understand how atomic indexing works, it tries to index all
stored values again, but why is it trying to index stored value of a field
that does not exist in the schema?

Solr's Atomic Update feature works by grabbing the existing document, all of it, performing the atomic update instructions on that document, and then indexing the results as a new document. If the uniqueKey feature is enabled (which would be required for Atomic Updates to work properly), the old document is deleted as the new document is added.
I haven't looked at the code, but the existing fields are likely added to the document that is being built all at once and without consulting the schema. So if field X is in the document that's already in the index, it will be in the new document too. If X is deleted from the schema, you'll get the error you're getting.
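
As a rough illustration of the above, here is a minimal sketch of an atomic update request in Solr's XML update format. The field names "id" and "title" and their values are made up for illustration and are not from the original question; only the update="set" modifier is standard syntax. The request names a single field, but because the whole document is rebuilt from what is already in the index, any old value of X would be carried into the new document as well.

```xml
<!-- Hypothetical atomic update: only "title" is sent with a modifier. -->
<add>
  <doc>
    <!-- uniqueKey of the existing document (field name and value are assumed) -->
    <field name="id">doc-1</field>
    <!-- update="set" replaces the value of this one field -->
    <field name="title" update="set">New title</field>
  </doc>
</add>
```
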
It would be a fair amount of work to have Solr take the schema into account for atomic updates. Not impossible, just slightly time-consuming. I think we (the Solr developers) would want it to still fail indexing in this situation; the failure would just happen at a different place in the code than it does now, during atomic document assembly. Fail earlier and faster.
What you'll need to do for your circumstances is leave X in the schema, but change it to a type that will be completely ignored on indexing.
Something like this:

<fieldType
  name="ignored"
  indexed="false"
  stored="false"
  docValues="false"
  multiValued="true"
  class="solr.StrField" />
You could then add the following to take care of any and all unknown fields:
<dynamicField name="*" type="ignored" multiValued="true" />
Or you could name individual fields like that, which I think would be a better option than the wildcard dynamic field.
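
For the individual-field variant, a minimal sketch might look like the following. It assumes the discontinued field is literally named X, as in the original question, and that the "ignored" fieldType shown above is defined in the schema.

```xml
<!-- Assumes the "ignored" fieldType is defined; X stands in for the discontinued field's real name. -->
<field name="X" type="ignored" multiValued="true" />
```

An explicit field like this only swallows values for X, so any other unknown field would still be reported as an error.
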
My source for the config snippets: https://stackoverflow.com/questions/46509259/solr-7-managed-schema-how-to-ignore-unnamed-fields
Thanks,
Shawn