-user@lucene.apache.org
Subject: Re: Removing irrelevant URLS
You can always do a delete-by-query, but that presupposes you can form
a query that matches only the documents whose URLs you want removed.
Assuming you can, an optimize would then physically remove the
documents from your index (delete-by-query just marks the docs as
deleted).
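
A minimal sketch of that delete-then-optimize sequence, under some loud assumptions: the index has a `host` field holding each document's hostname, the core lives at the hypothetical URL below, and your Solr version accepts JSON update commands on the `/update` handler. None of these come from the original question, so adjust them to your schema and setup.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical core location; replace with your own.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update"


def build_delete_query(hosts, field="host"):
    """Build a query matching documents whose `field` equals any given host.

    `field` is an assumed schema field name, not something Solr defines.
    """
    clauses = ['{}:"{}"'.format(field, h.replace('"', '\\"')) for h in hosts]
    return " OR ".join(clauses)


def build_delete_payload(hosts, field="host"):
    """Wrap the query in a JSON delete-by-query command."""
    return json.dumps({"delete": {"query": build_delete_query(hosts, field)}})


def post_update(payload, params=None):
    """POST a JSON update command to Solr and return the response body."""
    url = SOLR_UPDATE_URL
    if params:
        url += "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def clean_index(hosts):
    """Mark matching docs as deleted, commit, then optimize to purge them."""
    post_update(build_delete_payload(hosts), {"commit": "true"})
    post_update(json.dumps({"optimize": {"waitSearcher": False}}))
```

Calling `clean_index(["twitter.com", "facebook.com"])` from a cron job would give the periodic cleaning asked about. It is worth running the same query as a normal search first to confirm it matches only the documents you expect, since a delete-by-query cannot be undone once committed.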
Solr h
Hi,
I have 100k URLs in my index. I specifically crawled sites relating to law.
However, during my initial crawls I didn't specify urlfilters, so I am stuck
with extraneous and often irrelevant URLs like twitter, etc.
Is there some way in Solr that I can run periodic URL cleanings to remov