Hi,

You are correct about not wanting to index everything every day. However, for this PoC I need a 'bootstrap' mechanism which basically does what Endeca does.
The 'defaultRowPrefetch' in the solrconfig.xml does not seem to take effect; I'll have a closer look. As for the long run time, it turned out that one of the views I was reading was also by far the biggest, with over 4 million entries. Other views should take much less time.

With regard to the parallel processing, I have the 2 classes you mention and have packaged them. The documentation in the patch was not clear on exactly how to do that. My assumption is that:

* for every entity you have to define a DIH in the solrconfig and refer to a specific data-config-<entity>.xml
* you define 1 import handler for the join in the solrconfig
* what isn't clear is what a data-config-<entity>.xml should look like (for example, I see no reference in the documentation to a cacheName)
* and what the data-config-join.xml should look like

My first attempt:

The data-config-products.xml (parent):

    <dataSource name="jdbc1"
                driver="oracle.jdbc.driver.OracleDriver"
                url="jdbc:oracle:thin:@//<host>:1521/ENDDEV"
                user="un"
                password="pw"/>
    <document>
      <entity name="END_FRG_PRODUCTS_VW"
              processor="SqlEntityProcessor"
              cacheImpl="org.apache.solr.handler.dataimport.BerkleyBackedCache"
              writerImpl="org.apache.solr.handler.dataimport.DIHCacheWriter"
              dataSource="jdbc1"
              rootEntity="true"
              persistCacheName="PRODUCTS"
              persistCacheBaseDir="d:\cacheloc"
              berkleyInternalCacheSize="1000000"
              persistCacheFieldNames="PDT_ID,SEARCH_TITLE,PDT_GLOBAL_ID,PDT_EAN_CODE,PDT_TYP_CODE,PDT_AVAILABILITY,AVAIL_CODE_OFF_STOCK,AVAIL_CODE_ON_STOCK,OFFER_TYPE"
              persistCacheFieldTypes="STRING,STRING,STRING,STRING,STRING,STRING,STRING,STRING"
              query="select PDT_ID,SEARCH_TITLE,PDT_GLOBAL_ID,PDT_EAN_CODE,PDT_TYP_CODE,PDT_AVAILABILITY,AVAIL_CODE_OFF_STOCK,AVAIL_CODE_ON_STOCK,OFFER_TYPE from END_FRG_PRODUCTS_VW">
      </entity>
    </document>

The data-config-features.xml (child):

    <dataSource name="jdbc1"
                driver="oracle.jdbc.driver.OracleDriver"
                url="jdbc:oracle:thin:@//<host>:1521/ENDDEV"
                user="un"
                password="pw"
                batchSize="20000"/>
    <document>
      <entity name="END_FRG_FEATURES_VW"
              processor="SqlEntityProcessor"
              cacheImpl="org.apache.solr.handler.dataimport.BerkleyBackedCache"
              writerImpl="org.apache.solr.handler.dataimport.DIHCacheWriter"
              persistCacheName="FEATURE"
              persistCacheBaseDir="d:\cacheloc"
              berkleyInternalCacheSize="1000000"
              persistCacheFieldNames="PDT_ID,PDT_FEATURES"
              persistCacheFieldTypes="STRING,STRING"
              berkleyInternalShared="true"
              cacheKey="PDT_ID"
              cacheLookup="END_FRG_PRODUCTS_VW.PDT_ID"
              dataSource="jdbc1"
              query="select PDT_ID, PDT_FEATURES from END_FRG_FEATURES_VW"/>
    </document>

The data-config-join.xml:

    <entity name="END_FRG_PRODUCTS_VW"
            processor="org.apache.solr.handler.dataimport.DIHCacheProcessor"
            rootEntity="true"
            name="PARENT"
            persistCacheFieldNames="PDT_ID,SEARCH_TITLE,PDT_GLOBAL_ID,PDT_EAN_CODE,PDT_TYP_CODE,PDT_AVAILABILITY,AVAIL_CODE_OFF_STOCK,AVAIL_CODE_ON_STOCK,OFFER_TYPE"
            persistCacheFieldTypes="STRING,STRING,STRING,STRING,STRING,STRING,STRING,STRING"
      <entity name="END_FRG_FEATURES_VW"
              processor="org.apache.solr.handler.dataimport.DIHCacheProcessor"
              cacheImpl="org.apache.solr.handler.dataimport.BerkleyBackedCache"
              persistCacheName="FEATURE"
              persistCacheBaseDir="d:\cacheloc"
              berkleyInternalCacheSize="1000000"
              persistCacheFieldNames="PDT_ID,PDT_FEATURES"
              persistCacheFieldTypes="STRING,STRING"
              berkleyInternalShared="true"
              cacheKey="PDT_ID"
              cacheLookup="END_FRG_PRODUCTS_VW.PDT_ID"/>

Is this a correct setup? I hope you can give some pointers.

Thanks,
Maarten

--
View this message in context: http://lucene.472066.n3.nabble.com/DIH-nested-entities-don-t-work-tp4015514p4020727.html
Sent from the Solr - User mailing list archive at Nabble.com.