What's the relationship between the items and item_descriptions tables? I.e., is there only one item_descriptions record for every item id?
If it is 1-1, you can merge all your data into a single database and use the following query:

<entity name="item"
        dataSource="single_datasource"
        query="select * from items inner join item_descriptions on item_descriptions.id=items.id">
</entity>

HTH,
Alex

On Thu, Jun 3, 2010 at 6:34 AM, Blargy <zman...@hotmail.com> wrote:
>
> Erik Hatcher-4 wrote:
>>
>> One thing that might help indexing speed - create a *single* SQL query
>> to grab all the data you need without using DIH's sub-entities, at
>> least the non-cached ones.
>>
>> Erik
>>
>> On Jun 2, 2010, at 12:21 PM, Blargy wrote:
>>
>>> As a data point, I routinely see clients index 5M items on normal
>>> hardware in approx. 1 hour (give or take 30 minutes).
>>>
>>> Also wanted to add that our main entity (item) consists of 5
>>> sub-entities (i.e., joins). 2 of those 5 are fairly small so I am using
>>> CachedSqlEntityProcessor for them but the other 3 (which includes
>>> item_description) are normal.
>>>
>>> All the entities minus the item_description connect to datasource1.
>>> They currently point to one physical machine although we do have a pool
>>> of 3 DB's that could be used if it helps. The other entity,
>>> item_description, uses a datasource2 which has a pool of 2 DB's that
>>> could potentially be used. Not sure if that would help or not.
>>>
>>> I might as well add that the item description will have indexed, stored
>>> and term vectors set to true.
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
> I can't find any example of creating a massive sql query. Any out there?
> Will batching still work with this massive query?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866506.html
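
On the quoted question about what a single large query looks like in practice: here is a rough sketch of a whole data-config.xml built around that join. It assumes the 1-1 case and a MySQL database; the driver, JDBC URL, credentials, and the title and description columns are placeholders I made up, so adjust them to your schema. As for batching, batchSize is set on the dataSource rather than on the entity, so as far as I know it should still apply to one big query. With the MySQL driver, batchSize="-1" is the usual way to have rows streamed instead of loaded into memory all at once.

```xml
<dataConfig>
  <!-- Placeholder connection details (assumption: MySQL).
       batchSize="-1" asks the MySQL driver to stream rows one at a time. -->
  <dataSource name="single_datasource"
              type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost/dbname"
              user="db_user"
              password="db_password"
              batchSize="-1"/>
  <document>
    <!-- One flat entity with no sub-entities: the join runs once in the
         database instead of once per item row.
         title and description are hypothetical column names. -->
    <entity name="item"
            dataSource="single_datasource"
            query="select items.id, items.title, item_descriptions.description
                   from items
                   inner join item_descriptions on item_descriptions.id = items.id">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>
```

If the relation turns out to be 1-many, the join returns one row per description, so you would get several rows for the same item id and would need to aggregate them in SQL or keep a sub-entity for that table.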