You can always do a delete-by-query, but that presupposes you can form a query that matches only the documents whose URLs you want removed. Assuming you can, an optimize would then physically remove the documents from your index (delete-by-query just marks the docs as deleted).
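
A rough sketch of that sequence in Python, using Solr's XML update messages (`<delete><query>`, `<commit/>`, `<optimize/>`). The field name `host`, the example host names, and the localhost update URL are assumptions for illustration, not taken from your setup. Whether the query matches only the documents you intend depends on how the URL or host field is analyzed in your schema, so run it as a normal search first and check the hits before deleting.

```python
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape

# Assumption: a local Solr instance with the default XML update handler path.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def host_query(field, hosts):
    """OR together one quoted clause per unwanted host, e.g. host:("a" OR "b")."""
    quoted = " OR ".join('"%s"' % h.replace('"', '\\"') for h in hosts)
    return "%s:(%s)" % (field, quoted)


def delete_by_query_xml(query):
    """Build the XML update message that marks matching docs as deleted."""
    return "<delete><query>%s</query></delete>" % escape(query)


def post_update(xml, url=SOLR_UPDATE_URL):
    """POST one XML update message to Solr and return the response body."""
    req = Request(url, data=xml.encode("utf-8"),
                  headers={"Content-Type": "text/xml; charset=utf-8"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


def purge_hosts(field, hosts):
    """Delete by query, commit, then optimize so the docs are physically removed."""
    post_update(delete_by_query_xml(host_query(field, hosts)))
    post_update("<commit/>")
    post_update("<optimize/>")
```

Calling `purge_hosts("host", ["twitter.com"])` would send the three messages in order. Nothing is sent until you call it, so you can print the output of `delete_by_query_xml(...)` first and inspect the query.
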
Solr has nothing specifically for URLs; it's a search engine rather than a web crawling app.

Best
Erick

On Fri, Nov 5, 2010 at 4:33 PM, Eric Martin <e...@makethembite.com> wrote:

> Hi,
>
> I have 100k URLs in my index. I specifically crawled sites relating to law.
> However, during my initial crawls I didn't specify urlfilters, so I am stuck
> with extrinsic and often irrelevant URLs like twitter, etc.
>
> Is there some way in Solr that I can run periodic URL cleanings to remove
> URLs and search string results? Or should I just dump my index and rebuild
> using the filter?
>
> I have looked on the Solr wiki and came across some candidates that look
> like they are what I am trying to accomplish, but I am not sure. If anyone
> knows where I should be looking I would appreciate it.
>
> Eric