You can cache the sub-entity; DIH will then retrieve all the data for that entity in a single query instead of issuing one query per parent row.
See http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor for more information. That section focuses on caching data from SqlEntityProcessor, but it is now also possible to cache data from other entity types. You can also plug in a different cache implementation if the default in-memory cache does not scale for you. See https://issues.apache.org/jira/browse/SOLR-2382 .

James Dyer
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: harpax [mailto:a.psczo...@pan-sonic.de]
Sent: Monday, March 04, 2013 8:49 AM
To: solr-user@lucene.apache.org
Subject: solr-dih does multiple queries for sub-entities

Hi,

I am trying to use the DIH to crawl over some XML files, apply XPath expressions to them, and then access a database with the filename as a key. That works, but reading ~30,000 docs would take almost 3 hours. When I looked at the DIH debug console, it showed that way too many DB calls were made: 1 for the first doc, then 2, 3, 4, ...

I tried different attribute combinations (e.g. stripping the config down to the minimum), but the result is still the same. This problem has been asked about before:
http://lucene.472066.n3.nabble.com/DIH-multiple-queries-per-sub-entity-tt701038.html

Thanks a lot!
Regards,
Arne

--
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource name="cr-db" jndiName="xyz" type="JdbcDataSource" />
  <dataSource name="cr-xml" type="FileDataSource" encoding="utf-8" />
  <document name="doc">
    <entity dataSource="cr-xml" name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/xml" filename="*.xml" recursive="true"
            rootEntity="true" onError="skip">
      <entity name="xml-data" dataSource="cr-xml" processor="XPathEntityProcessor"
              forEach="/root" url="${f.fileAbsolutePath}"
              transformer="DateFormatTransformer" onError="skip">
        <field column="id" xpath="/root/id" />
        <field column="A" xpath="/root/a" />
      </entity>
      <entity name="db-data" dataSource="cr-db"
              query="SELECT id, b FROM a_table WHERE id = '${f.file}'">
        <field column="B" name="b" />
      </entity>
    </entity>
  </document>
</dataConfig>
--

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-dih-does-multiple-queries-for-sub-entities-tp4044522.html
Sent from the Solr - User mailing list archive at Nabble.com.
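
For readers following the thread, the caching suggested in the reply might be applied to the db-data entity from the posted config roughly as sketched below. This is only an illustration, not a tested fix: it assumes the `where="column=parent.field"` lookup syntax described on the CachedSqlEntityProcessor wiki page, that the `id` column in a_table matches the `f.file` value exactly (including type), and that the whole of a_table fits in the default in-memory cache. The unfiltered query is an assumption introduced here so that the table is read once and later lookups are served from the cache.

```xml
<!-- Sketch only: replaces the per-file SELECT with one cached query.
     Assumes a_table fits in memory and that id equals f.file exactly. -->
<entity name="db-data"
        dataSource="cr-db"
        processor="CachedSqlEntityProcessor"
        query="SELECT id, b FROM a_table"
        where="id=f.file">
  <field column="B" name="b" />
</entity>
```

If the table is too large to hold in memory, the pluggable cache implementations discussed in SOLR-2382 would be the place to look instead.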