Delta imports are likely to be far slower than full imports because a delta import makes one DB call per changed row. If you can write the "query" in such a way that it returns only the changed rows, then write a separate entity (directly under <document>) and just run a full-import with that entity only.
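
A rough sketch of what such an entity might look like, built from the pieces in your config. The entity name "id_changed" is made up, and I am assuming the main query selects from publicagency (the table your deltaQuery uses); the '...' still stands for the parts you cut, so fill those in from your real query.

```xml
<document>
  <!-- Hypothetical second root entity: same select as the main one, plus the
       modifiedtime filter taken from the original deltaQuery. The "..." marks
       the columns elided in the original post; the FROM table is an assumption. -->
  <entity name="id_changed" transformer="ClobTransformer" pk="oid"
          query="select archwaypublic.getSolrIdentifier(oid, 'agency') as oid,
                        oid as realoid,
                        archwaypublic.getSolrIdentifier(oid, 'agency') as id,
                        code, name, ...
                 from publicagency with (nolock)
                 where modifiedtime > '${dataimporter.last_index_time}'">
    ...
  </entity>
</document>
```

You would then run it with something like command=full-import&entity=id_changed&clean=false against your dataimport handler (the handler path depends on your solrconfig.xml). clean=false matters here, since otherwise the full-import would wipe the index first. As far as I know deletedPkQuery is only used by delta-import, so deletions would still need to be handled separately.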

On Tue, Aug 18, 2009 at 6:32 AM, Matthew Painter<matthew.pain...@archives.govt.nz> wrote:
> Hi,
>
> We are using Solr's DataImportHandler to populate the Solr index from a
> SQL Server database of nearly 4,000,000 rows. Whereas the population
> itself is very fast (around 1000 rows per second), the delta import is
> only processing around one row a second.
>
> Is this a known performance issue? We are using Solr 1.3.
>
> For reference, the abridged entity configuration (cuts indicated by
> '...') is below:
>
> <entity name="id" transformer="ClobTransformer" pk="oid"
>     query="select archwaypublic.getSolrIdentifier(oid, 'agency')
>         as oid, oid as realoid, archwaypublic.getSolrIdentifier(oid, 'agency')
>         as id, code, name, ..."
>     deltaQuery="select oid from publicagency with (nolock) where
>         modifiedtime > '${dataimporter.last_index_time}'"
>     deletedPkQuery="select archwaypublic.getSolrIdentifier(entityoid,
>         'agency') as oid from pendingsolrdeletions with (nolock) where
>         entitytype='agency'">
>
> ...
> </entity>
>
> Thanks,
> Matt

--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com