On 8/17/2018 6:15 PM, Zimmermann, Thomas wrote:
I’m trying to track down an odd issue I’m seeing when using the
SolrEntityProcessor to seed some test data from a solr 4.x cluster to a solr
7.x cluster. It seems like strings are being interpreted as multivalued when
passed from a string field to a text field via the copyTo directive. Any clever
ideas how to resolve this?
What's happening is deceptively simple.
In the source system, you're copying from author to authorText, and both
fields are stored. So if you have "Jeff Hartley" in author, you also
have "Jeff Hartley" stored in authorText. When the destination system
imports from the source system, it receives "Jeff Hartley" in both
fields. Then the copyField on the destination says "put a copy of what's
in author into authorText" ... and suddenly there are two copies of
"Jeff Hartley" in authorText, which is why the field looks multivalued.
There are two ways to deal with this:
1) In the query you're doing with SolrEntityProcessor, add an "fl"
parameter and list all the fields *except* authorText and any other
field where this same problem is happening.
2) Remove the copyField from the schema until after the import from the
source server is done.
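
As a rough sketch of option 1, the DIH entity on the destination side
might look something like the following. The url, the entity name, and
the "id" field are placeholders I made up for illustration; the real fl
list would need to name every stored field you want to bring over, minus
authorText and any other copyField destination.

```xml
<dataConfig>
  <document>
    <!-- url and entity name are hypothetical; point url at the 4.x source core -->
    <!-- fl lists the fields to fetch; authorText is deliberately left out so
         that the copyField on the destination is the only thing that fills it -->
    <entity name="sourceDocs"
            processor="SolrEntityProcessor"
            url="http://source-host:8983/solr/collection1"
            query="*:*"
            fl="id,author"/>
  </document>
</dataConfig>
```

With authorText left out of fl, each imported document arrives with only
author populated, and the destination's copyField then produces a single
value in authorText.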
Thanks,
Shawn