The example in the wiki (http://wiki.apache.org/solr/DataImportHandler) shows the following fragment:

<dataConfig>
  <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa" />
  <document name="products">
    <entity name="item" query="select * from item">
      <field column="ID" name="id" />
      <field column="NAME" name="name" />
      ......................
    </entity>
  </document>
</dataConfig>

My scaled-down application looks very similar, except that my result set is so big that it cannot possibly fit in main memory. So I was planning to split this single query into multiple subqueries, each with an additional condition on the id (say, id >= 0 and id < 100).

Is there any way to specify such a condition declaratively, for example something like <splitData column="id" batch="10000" />, where the column is expected to hold integer values? Internally, the implementation could then generate the subqueries itself:

i) get the min and max of the numeric column, and send queries to the database based on the batch size
ii) add documents for each batch and close the result set

This might put more load on the database, but at least each batch would fit in main memory.

Let me know if anyone else has run into similar issues and how you handled them.
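
To make the batching idea above concrete, here is a rough Python sketch of how the subqueries could be generated once the min and max of the id column are known. It is only an illustration under my own assumptions: splitData is my proposed element, not something I know to exist in DataImportHandler, the function names batch_ranges and batch_queries are made up, and the base query is assumed to have no WHERE clause of its own.

```python
from typing import Iterator, Tuple


def batch_ranges(min_id: int, max_id: int, batch: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [low, high) ranges that together cover min_id..max_id inclusive."""
    if batch <= 0:
        raise ValueError("batch must be positive")
    low = min_id
    while low <= max_id:
        # The last range is clipped so that it ends just past max_id.
        high = min(low + batch, max_id + 1)
        yield low, high
        low = high


def batch_queries(base_query: str, column: str,
                  min_id: int, max_id: int, batch: int) -> Iterator[str]:
    """Yield one subquery per range (hypothetical helper, not a Solr API).

    Assumes base_query has no WHERE clause and that column is an integer column.
    """
    for low, high in batch_ranges(min_id, max_id, batch):
        yield f"{base_query} where {column} >= {low} and {column} < {high}"


if __name__ == "__main__":
    # In practice min_id and max_id would come from
    # "select min(id), max(id) from item"; the values here are invented.
    for query in batch_queries("select * from item", "id", 0, 25000, 10000):
        print(query)
```

Each generated query would be run on its own, its rows added as documents, and its result set closed before the next one is sent, so only one batch is held in memory at a time. If the ids are sparse, some batches will be much smaller than the batch size, or even empty.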