I am indexing some data using DataImportHandler (DIH) configuration files in Solr 3.6.1. I am using a nested entity in my handler file. I noticed a scenario in which, instead of only the records that should be fetched for a document, all the records present in the table are indexed.
The following is how the data should ideally be indexed. For a document A, I am trying to index the two values B and C as a multivalued field:

    <id>A</id>
    <related_id>
      <str>B</str>
      <str>C</str>
    </related_id>

This is how the output should look. I have used the same DIH file with Solr 1.4 and 3.5, and in both versions the data was indexed correctly, as shown above. But in Solr 3.6.1 the data is indexed differently. In my table there are 4 values (B, C, D, E) in the related_id field. This is how the data is indexed in 3.6.1:

    <id>A</id>
    <related_id>
      <str>B</str>
      <str>C</str>
      <str>D</str>
      <str>E</str>
    </related_id>

Ideally, the values D and E should not get indexed under id "A". The same happens for the other id records.

The following is the content of the DIH file:

    <entity name="ent1"
            query="select sid as id from Table1 a"
            transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer">
      <field column="id" name="id" boost="0.5"/>
      <entity name="ent2"
              query="select id1,rid from Table2"
              processor="CachedSqlEntityProcessor"
              cacheKey="id1"
              cacheLookup="ent1.uid"
              transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer">
        <field column="rid" name="related_id"/>
      </entity>
    </entity>

I tried changing CachedSqlEntityProcessor to SqlEntityProcessor and reindexed, but I still faced the same issue.

When I googled a bit, I found this URL: https://issues.apache.org/jira/browse/SOLR-3360

I am not sure whether SOLR-3360 describes the same scenario as the one I have mentioned above. Please guide me.

Thanks.
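
For reference, here is a minimal sketch of a non-cached child entity that restricts its rows in the query itself rather than through cacheKey/cacheLookup. It assumes that Table2.id1 corresponds to the id column selected by ent1, which is not confirmed by the configuration above, and it has not been tested against 3.6.1. The transformers and boost are left out only to keep the sketch short.

```xml
<!-- Sketch only: assumes Table2.id1 matches the "id" column selected by ent1. -->
<entity name="ent1" query="select sid as id from Table1 a">
  <field column="id" name="id"/>
  <!-- No processor attribute, so the default SqlEntityProcessor is used.
       The child query runs once per parent row, and the where clause
       limits it to the rows that belong to that parent. -->
  <entity name="ent2"
          query="select rid from Table2 where id1 = '${ent1.id}'">
    <field column="rid" name="related_id"/>
  </entity>
</entity>
```

With the plain SqlEntityProcessor, a child query that has no where clause would be expected to return every row of Table2 for each parent document, which may be why switching processors alone made no difference. The quotes around ${ent1.id} assume a string key; they can be dropped if id1 is numeric.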