2010/12/15 Robert Gründler <rob...@dubture.com>:
> The data-config.xml looks like this (only 1 entity):
>
> <entity name="track" query="select t.id as id, t.title as title, l.title
> as label from track t left join label l on (l.id = t.label_id) where
> t.deleted = 0" transformer="TemplateTransformer">
>   <field column="title" name="title_t" />
>   <field column="label" name="label_t" />
>   <field column="id" name="sf_meta_id" />
>   <field column="metaclass" template="Track" name="sf_meta_class"/>
>   <field column="metaid" template="${track.id}" name="sf_meta_id"/>
>   <field column="uniqueid" template="Track_${track.id}"
> name="sf_unique_id"/>
>
>   <entity name="artists" query="select a.name as artist from artist a
> left join track_artist ta on (ta.artist_id = a.id) where
> ta.track_id=${track.id}">
>     <field column="artist" name="artists_t" />
>   </entity>
>
> </entity>

So there's one track entity with an artist sub-entity. My (admittedly rather limited) experience has been that sub-entities, where you have to run a separate query for every row in the parent entity, really slow down data import.

For my own purposes, I wrote a custom data import using SolrJ to improve the performance (from 3 hours to 10 minutes).

Just as a test, how long does it take if you comment out the artists entity?
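
For that test, the sub-entity can be wrapped in an XML comment so the rest of the config stays exactly as posted. This is only a sketch based on the config quoted above; it assumes nothing else in your setup requires the artists_t field to be populated during the test run:

```xml
<entity name="track" query="select t.id as id, t.title as title, l.title
as label from track t left join label l on (l.id = t.label_id) where
t.deleted = 0" transformer="TemplateTransformer">
  <field column="title" name="title_t" />
  <field column="label" name="label_t" />
  <field column="id" name="sf_meta_id" />
  <field column="metaclass" template="Track" name="sf_meta_class"/>
  <field column="metaid" template="${track.id}" name="sf_meta_id"/>
  <field column="uniqueid" template="Track_${track.id}"
name="sf_unique_id"/>

  <!-- Temporarily disabled for timing: this sub-entity runs one extra
       query per track row.
  <entity name="artists" query="select a.name as artist from artist a
left join track_artist ta on (ta.artist_id = a.id) where
ta.track_id=${track.id}">
    <field column="artist" name="artists_t" />
  </entity>
  -->

</entity>
```

If a full import with this version finishes much faster than the original, most of the time is probably going into the per-track artist queries rather than into the track query itself.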