subject:"Solr indexing Duplicate URL's ending with /"

Re: Solr indexing Duplicate URL's ending with /

2018-08-29 Thread Jan Høydahl

Hi, you would have to direct this question to the community of the crawler you are using, since it is the crawler that decides the document ID to send to Solr. Most crawlers have configuration options to normalize the URL of each document. However, you could also try to clean the URL after it arrives in SO
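To illustrate the normalization idea above, here is a minimal sketch in Python of stripping a trailing slash before the URL is used as the document ID. It is not taken from any particular crawler or from Solr itself; the function names `normalize_url` and `dedupe_by_normalized_url` are hypothetical, and it assumes that a URL with and without the trailing slash refers to the same page, as in the example from the original question.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return the URL with trailing slashes removed from its path.

    Assumption: "/page.html" and "/page.html/" are the same document.
    An empty path is normalized to "/" so the site root stays valid.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


def dedupe_by_normalized_url(urls):
    """Keep one entry per normalized URL, preserving first-seen order."""
    seen = {}
    for url in urls:
        # The normalized form is what would be sent as the document ID.
        seen.setdefault(normalize_url(url), url)
    return list(seen)


if __name__ == "__main__":
    samples = [
        "https://www.abc.com/2018/test.html",
        "https://www.abc.com/2018/test.html/",
    ]
    for doc_id in dedupe_by_normalized_url(samples):
        print(doc_id)
```

If the normalized value is used as the unique key, the second document overwrites the first on indexing instead of creating a duplicate entry.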

Solr indexing Duplicate URL's ending with /

2018-08-29 Thread kunhu0...@gmail.com

Team, I need a suggestion on how to remove duplicate entries while indexing to Solr. Below are sample entries I see in the Solr collection; I need to remove the one that ends with /:
https://www.abc.com/2018/test.html
https://www.abc.com/2018/test.html/
Thank you -- Sent from: http