Simple add this line to your crontab with crontab -e command: 0,30 * * * * /usr/bin/wget http://<solr_host>:8983/solr/<core_name>/dataimport?command=full-import
This will full import every 30 minutes. Replace <solr_host> and <core_name> with your configuration *Using delta-import command* Delta Import operation can be started by hitting the URL http://localhost:8983/solr/dataimport?command=delta-import. This operation will be started in a new thread and the status attribute in the response should be shown busy now. Depending on the size of your data set, this operation may take some time. At any time, you can hit http://localhost:8983/solr/dataimport to see the status flag. When delta-import command is executed, it reads the start time stored in conf/dataimport.properties. It uses that timestamp to run delta queries and after completion, updates the timestamp in conf/dataimport.properties. Note: there is an alternative approach for updating documents in Solr, which is in many cases more efficient and also requires less configuration explained on DataImportHandlerDeltaQueryViaFullImport. *Delta-Import Example* We will use the same example database used in the full import example. Note that the database schema has been updated and each table contains an additional column last_modified of timestamp type. You may want to download the database again since it has been updated recently. We use this timestamp field to determine what rows in each table have changed since the last indexed time. Take a look at the following data-config.xml <dataConfig> <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa" /> <document name="products"> <entity name="item" pk="ID" query="select * from item" deltaImportQuery="select * from item where ID='${dih.delta.id}'" deltaQuery="select id from item where last_modified > '${dih.last_index_time}'"> <entity name="feature" pk="ITEM_ID" query="select description as features from feature where item_id='${item.ID}'"> </entity> <entity name="item_category" pk="ITEM_ID, CATEGORY_ID" query="select CATEGORY_ID from item_category where ITEM_ID='${item.ID}'"> <entity name="category" pk="ID" query="select description as cat from category where id = '${item_category.CATEGORY_ID}'"> </entity> </entity> </entity> </document> </dataConfig> Pay attention to the deltaQuery attribute which has an SQL statement capable of detecting changes in the item table. Note the variable ${dataimporter.last_index_time} The DataImportHandler exposes a variable called last_index_time which is a timestamp value denoting the last time full-import 'or' delta-import was run. You can use this variable anywhere in the SQL you write in data-config.xml and it will be replaced by the value during processing. -- View this message in context: http://lucene.472066.n3.nabble.com/Automating-Solr-tp4166696p4166707.html Sent from the Solr - User mailing list archive at Nabble.com.