: We've successfully set up Solr 3.4.0 to parse and import multiple news 
: RSS feeds (based on the slashdot example on 
: http://wiki.apache.org/solr/DataImportHandler) using the HttpDataSource.

: The objective is for Solr to index ALL news items published on this feed 
: (ever) - not just the current contents of the feed. I've read that the 
: delta import is not supported for XML imports. I've therefore tried to 
: use "command=full-impor&clean=false". 

1) Note the typo in your command: it should be "full-import".

: But still the number of Documents Processed seems to be stuck at a fixed 
: number of items looking at the Stats and the 'numFound' result for a 
: generic '*:*' search. New items are being added to the feeds all the 
: time (and old ones dropping off).

"Documents Processed" after each full import should be whatever the number 
of items in the current feed is -- it's the number processed in that 
import, not the total number processed across all imports.
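
To illustrate the difference between the per-import count and the total in 
the index, here is a toy model in Python. It is only a sketch, not Solr 
code: the dict stands in for the index, and using "link" as the unique key 
is an assumption for the example. Items already indexed are overwritten 
under the same key, so only new items increase the total.

```python
def full_import(index, feed_items, clean=False):
    """Toy model of one import run; returns the per-run 'processed' count."""
    if clean:
        index.clear()  # clean=true would wipe the index before importing
    for item in feed_items:
        # Same key overwrites the existing entry, a new key adds one.
        index[item["link"]] = item
    return len(feed_items)


index = {}  # stands in for the Solr index (hypothetical, keyed by "link")
first_feed = [{"link": "a"}, {"link": "b"}, {"link": "c"}]
second_feed = [{"link": "b"}, {"link": "c"}, {"link": "d"}]  # "a" dropped off

processed_1 = full_import(index, first_feed)
processed_2 = full_import(index, second_feed)
print(processed_1, processed_2, len(index))  # prints: 3 3 4
```

Both runs report 3 processed, while the total grows to 4 because "a" is 
kept and "d" is added.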

If you specify clean=false, no documents should be deleted.  I just tested 
this using the slashdot example with Solr 3.4 and could not reproduce the 
problem you described.  I loaded the following URL...

http://localhost:8983/solr/rss/dataimport?clean=false&command=full-import

...then waited a while for the feed to change, and then loaded that URL 
again.  The number of documents (returned by a *:* query) increased after 
the second run.
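
The same check can be scripted. The following is a minimal sketch, not a 
tested tool: the helper names are made up for illustration, and it assumes 
the core answers *:* queries at /select and can return JSON via wt=json. 
Only the dataimport URL is taken from the test above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Core URL from the test above; adjust for your setup.
BASE = "http://localhost:8983/solr/rss"


def import_url(base=BASE):
    """URL that starts a full import without cleaning the index."""
    params = {"clean": "false", "command": "full-import"}
    return base + "/dataimport?" + urlencode(params)


def count_url(base=BASE):
    """URL for a *:* query returning no rows (assumes a /select handler)."""
    return base + "/select?" + urlencode({"q": "*:*", "rows": 0, "wt": "json"})


def parse_num_found(body):
    """Extract numFound from a JSON search response."""
    return json.loads(body)["response"]["numFound"]


def num_found(base=BASE):
    """Ask Solr how many documents are currently in the index."""
    with urlopen(count_url(base)) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


def trigger_import(base=BASE):
    """Request the import; the import itself may still be running afterwards."""
    with urlopen(import_url(base)) as resp:
        resp.read()
```

To use it, call num_found(), then trigger_import(), give the import time to 
finish, and call num_found() again once the feed has changed.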


-Hoss
