On 8/1/2013 1:25 AM, deniz wrote:
> So lets say that i have some kinda large data to index and it takes around 2
> hours to finish the full import.
>
> When I start full import at 1pm, what happens if some data in db is updated
> at 1:15 or 2pm while full import is still going on? will they be lost on
> solr side or they will be added to solr index? or it all depends if that
> updated data was indexed before the update or not?

For the most part, you'll have to do another import in order to pick up those added or updated records.

Most SQL databases have a concept of transactions such that a long-running query will always return data as it was at the start of the transaction, regardless of what happens to the database in the meantime. If your DIH entities are not nested, then you can be sure that this applies to you. In your example, an import that started at 1pm would not see the changes made at 1:15 or 2pm.

When you have nested entities and don't use Solr's entity caching, the situation can be more complex, because the sub-entity queries run once per parent row, later in the import. I won't go into the intricacies here.

The way that I solve this problem is by using a SolrJ/JDBC program (which I wrote myself) to do all my minute-to-minute indexing. I only use DIH when I need to completely rebuild my index.

Thanks,
Shawn
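
As a rough illustration of the follow-up import described above, the sketch below finds the rows that changed after a full import began. It is not the SolrJ/JDBC program mentioned in the reply. The `items` table, its `updated_at` column, and the `rows_changed_since` helper are all assumptions made up for the example, and an in-memory SQLite database stands in for the real one. Sending the rows to Solr is left out.

```python
import sqlite3


def rows_changed_since(conn, since):
    """Return (id, title) for rows whose updated_at is at or after `since`.

    Hypothetical helper: assumes an `items` table with an `updated_at`
    column holding 'YYYY-MM-DD HH:MM:SS' text, which sorts chronologically.
    """
    cur = conn.execute(
        "SELECT id, title FROM items WHERE updated_at >= ? ORDER BY id",
        (since,),
    )
    return cur.fetchall()


# In-memory stand-in for the real database (assumed schema, sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "unchanged", "2013-08-01 12:00:00"),
        (2, "edited during import", "2013-08-01 13:15:00"),
        (3, "added during import", "2013-08-01 14:00:00"),
    ],
)

# Timestamp recorded just before the full import was started.
import_started = "2013-08-01 13:00:00"

# Rows a follow-up pass would need to send to Solr.
catch_up = rows_changed_since(conn, import_started)
```

The key design choice is recording the timestamp before the full import starts rather than when it finishes, so that nothing changed during the two-hour window is skipped. This only works if the database reliably updates a modification timestamp on every insert and update.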