Hi,
You could then add the following to take care of any and all unknown
fields:
<dynamicField name="*" type="ignored" multiValued="true" />
Or you could name individual fields like that, which I think would be
a better option than the wildcard dynamic field.
Just a small addition, in case you're also using nested documents: You
should really prefer individual field names instead of the "*" wildcard
then.
Otherwise you may run into the following bug, which causes nested child
documents to disappear: https://issues.apache.org/jira/browse/SOLR-15018
Cheers,
Andreas
Shawn Heisey wrote on 20.03.21 11:52:
On 3/19/2021 3:36 PM, gnandre wrote:
While performing atomic indexing, I run into an error which says 'unknown
field X' where X is not a field specified in the schema. It is a
discontinued field. After deleting that field from the schema, I have
restarted Solr but I have not re-indexed the content back, so the deleted
field data still might be there in Solr index.
The way I understand how atomic indexing works, it tries to index all
stored values again, but why is it trying to index stored value of a field
that does not exist in the schema?
Solr's Atomic Update feature works by grabbing the existing document,
all of it, performing the atomic update instructions on that document,
and then indexing the results as a new document. If the uniqueKey
feature is enabled (which would be required for Atomic Updates to work
properly), the old document is deleted as the new document is added.
I haven't looked at the code, but the existing fields are likely added
to the document that is being built all at once and without consulting
the schema. So if field X is in the document that's already in the
index, it will be in the new document too. If X is deleted from the
schema, you'll get the error you're getting.
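For illustration only, an atomic update request in Solr's XML update format might look like the sketch below. The document id "doc1" and the field "title" are made-up names, not taken from the original question. Even though the request only touches "title", the rebuilt document would still carry every stored field of the old one, including X.

```xml
<!-- Hypothetical atomic update: "doc1" and "title" are assumed names. -->
<add>
  <doc>
    <!-- uniqueKey value identifying the existing document -->
    <field name="id">doc1</field>
    <!-- update="set" replaces only this field's value -->
    <field name="title" update="set">New title</field>
  </doc>
</add>
```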
It would be a fair amount of work to have Solr take the schema into
account for atomic updates. Not impossible, just slightly
time-consuming. I think we (the Solr developers) would want it to
still fail indexing in this situation; the failure would just happen
at a different place in the code than it does now, during atomic
document assembly. Fail earlier and faster.
What you'll need to do for your circumstances is leave X in the schema,
but change it to a type that will be completely ignored on indexing.
Something like this:
<fieldType
name="ignored"
indexed="false"
stored="false"
docValues="false"
multiValued="true"
class="solr.StrField" />
You could then add the following to take care of any and all unknown
fields:
<dynamicField name="*" type="ignored" multiValued="true" />
Or you could name individual fields like that, which I think would be
a better option than the wildcard dynamic field.
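As a rough sketch of the individual-field variant, assuming the "ignored" field type above is defined and that "X" stands in for the real name of the discontinued field, the declaration might look like this:

```xml
<!-- Assumption: replace "X" with the actual discontinued field name. -->
<!-- One such line would be needed per discontinued field. -->
<field name="X" type="ignored" multiValued="true" />
```

Unlike the "*" wildcard, this only swallows the fields you list, so a genuinely misspelled field name in an update would still be reported as an error.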
My source for the config snippets:
https://stackoverflow.com/questions/46509259/solr-7-managed-schema-how-to-ignore-unnamed-fields
Thanks,
Shawn