For (2), look at your admin/stats page. The difference between maxDoc and
numDocs is the number of documents that have been deleted from your
index...
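
As a rough sketch of that arithmetic (Python, with the values typed in by hand; the function name is made up for illustration and is not part of Solr):

```python
def deleted_docs(num_docs: int, max_doc: int) -> int:
    """Documents marked deleted but still present in the index segments."""
    if max_doc < num_docs:
        raise ValueError("maxDoc should never be smaller than numDocs")
    return max_doc - num_docs


# Numbers from this thread: DIH reported 1500 adds, q=*:* returned 1384 hits.
adds_reported = 1500
hits = 1384

# Adds that did not result in a new document, e.g. because they replaced
# an existing document with the same unique key.
overwritten = adds_reported - hits
print(overwritten)  # 116
```

If the stats page shows maxDoc minus numDocs close to that 116, duplicates on the unique key are the likely cause. An optimize purges deleted documents, so after one the two values will match again.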

For (3), I don't have a clue.
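
One possible workaround outside DIH, sketched under assumptions: fetch each feed yourself with a small retry loop and hand DIH the result (for example as a local file). The function names, the default of 3 attempts, and the delay are all hypothetical, and this is not a DIH feature:

```python
import time
import urllib.request


def http_get(url: str) -> bytes:
    """Default fetcher: plain HTTP GET with a 30 second timeout."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_with_retry(url, attempts=3, delay_seconds=5.0,
                     fetch=http_get, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, pausing between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError as error:
            # urllib's URLError and HTTPError are subclasses of OSError,
            # as are socket timeouts.
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a network; in normal use you would call `fetch_with_retry(url)` and write the returned bytes wherever your import expects them.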

Best
Erick

On Sat, Sep 17, 2011 at 7:20 PM, Pulkit Singhal <pulkitsing...@gmail.com> wrote:
> My DIH's full-import logs end with trailing output saying that 1500
> documents were added, which is correct because I have 16 sources,
> one of them was down, and each source is supposed to give me 100
> results:
> (1500 adds)],optimize=} 0 0
>
> But when I check my document count I get only 1384 results:
> INFO: [rss] webapp=/solr path=/select params={start=0&q=*:*&rows=0}
> hits=1384 status=0 QTime=0
>
> 1) I think I may have duplicates based on the primary key for the data
> coming in. Is there any other explanation than that?
> 2) Is there some way to get a log of how many documents were deleted?
> Because an update does a delete then add, this would allow me to make
> sure of what is going on.
>
> The sources I have are URL based; sometimes they appear to be down,
> I suppose because the request gets denied:
> SEVERE: Exception thrown while getting data
> java.io.FileNotFoundException:
> http://www.amazon.com/rss/tag/anime/popular/ref=tag_tdp_rss_pop_man?length=100
> Caused by: java.io.FileNotFoundException:
> http://www.amazon.com/rss/tag/anime/popular/ref=tag_tdp_rss_pop_man?length=100
>        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
>
> 3) Is there some way to configure the datasource to retry 3 times or
> something like that? I have increased the values for connectionTimeout
> and readTimeout but it doesn't help when sometimes the server simply
> denies the request due to heavy load. I need to be able to retry at
> those times. The onError attribute has only the abort, skip, and
> continue options, none of which really let me retry anything.
>
> Thank You.
> - Pulkit
>
