My DIH full-import log ends with a line saying that 1500 documents were added. That is what I expect: I have 16 sources, each is supposed to give me 100 results, and one of them was down:

(1500 adds)],optimize=} 0 0
But when I check my document count, I get only 1384 results:

INFO: [rss] webapp=/solr path=/select params={start=0&q=*:*&rows=0} hits=1384 status=0 QTime=0

1) I think I may have duplicates on the primary key in the incoming data. Is there any other explanation than that?

2) Is there some way to get a log of how many documents were deleted? Because an update does a delete and then an add, this would let me confirm what is going on.

The sources I have are URL based. Sometimes they appear to be down, I suppose because the request gets denied:

SEVERE: Exception thrown while getting data
java.io.FileNotFoundException: http://www.amazon.com/rss/tag/anime/popular/ref=tag_tdp_rss_pop_man?length=100
Caused by: java.io.FileNotFoundException: http://www.amazon.com/rss/tag/anime/popular/ref=tag_tdp_rss_pop_man?length=100
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)

3) Is there some way to configure the datasource to retry 3 times or something like that? I have increased the values for connectionTimeout and readTimeout, but that doesn't help when the server simply denies the request due to heavy load. I need to be able to retry at those times. onError has only the abort, skip, and continue options, none of which lets me retry anything.

Thank you.

- Pulkit
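
P.S. To make question 3 concrete, here is a minimal sketch of the retry behaviour I mean. It is plain Java and not a DIH feature: the class name RetryingFetch, the methods retry and open, and the attempt count and delay are all made up for illustration. It retries only on IOException, which covers the FileNotFoundException in the log above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Solr or DIH.
public class RetryingFetch {

    /**
     * Runs the action up to maxAttempts times. An IOException triggers a
     * retry after sleeping delayMillis; any other exception propagates
     * immediately. The last IOException is rethrown if every attempt fails.
     */
    public static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // back off before the next try
                }
            }
        }
        throw last;
    }

    /** Opens the URL with explicit connect and read timeouts (milliseconds). */
    public static InputStream open(String url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        return conn.getInputStream();
    }
}
```

Something like RetryingFetch.retry(() -> RetryingFetch.open(feedUrl, 5000, 10000), 3, 2000L) is the effect I am after, where feedUrl stands for one of my source URLs. Whether this can be hooked into DIH, or has to run as a separate pre-fetch step, is exactly what I don't know.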