The URL Classify Update Processor can take a URL and split it into pieces,
including the host name.
http://lucene.apache.org/solr/4_3_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessorFactory.html
Unfortunately, the Javadoc is sparse and does not include even one example.
I have some examples in the book.
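To illustrate what "splitting a URL into pieces" means, here is a minimal Python sketch using only the standard library. It is not the processor's implementation; the split_url helper and the key names are made up for this example.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into a few named pieces (hypothetical helper, not Solr code)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # lowercased host name, without any port
        "path": parts.path,
        "query": parts.query,
    }

pieces = split_url("http://somepage.com/contact.html")
print(pieces["host"], pieces["path"])
```

Note that urlsplit only recognizes the host when the URL carries a scheme (or starts with "//"), so bare values like somepage.com/contact.html would need a scheme prepended first.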
You can also use a regular expression token filter to extract the host name.
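As a sketch of the regular expression idea, the Python below pulls the host name out of a URL, with or without a scheme. The pattern and the extract_host name are my own assumptions for illustration; you would still need to adapt the pattern to whichever pattern-based analysis component you use in your schema.

```python
import re

# Optional scheme, optional "user@" part, then capture the host up to the
# first '/', ':', '?' or '#'.
HOST_RE = re.compile(
    r"^(?:[a-z][a-z0-9+.-]*://)?(?:[^/@]*@)?([^/:?#]+)",
    re.IGNORECASE,
)

def extract_host(url):
    """Return the lowercased host name, or None if nothing matches."""
    match = HOST_RE.match(url.strip())
    return match.group(1).lower() if match else None

for url in ("somepage.com/contact.html", "http://otherpage.net/info.html"):
    print(extract_host(url))
```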
And you can use standard Solr "grouping" to group by the field containing the
host name.
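For example, a grouped request could be built roughly as follows. The group, group.field and group.limit parameters are the standard Solr grouping parameters; the field name "host", the base URL and the grouped_query_url helper are assumptions for this sketch.

```python
from urllib.parse import urlencode

def grouped_query_url(base_url, query, host_field="host"):
    """Build a Solr select URL that returns one document per host group.

    base_url and host_field are placeholders; use your own core URL and
    whatever field holds the extracted host name.
    """
    params = {
        "q": query,
        "group": "true",            # turn on result grouping
        "group.field": host_field,  # group by the host name field
        "group.limit": 1,           # keep only the top document per group
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)

print(grouped_query_url("http://localhost:8983/solr/select", "contact"))
```

With group.limit=1 each group carries only its top-ranked document, which gives one result per domain.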
-- Jack Krupansky
-----Original Message-----
From: Wojciech Kapelinski
Sent: Thursday, June 27, 2013 8:18 AM
To: solr-user@lucene.apache.org
Subject: displaying one result per domain
I'm looking for a neat solution to replace the default multiple results from
a single domain in the SERP:
somepage.com/contact.html
somepage.com/aboutus.html
otherpage.net/info.html
somepage.com/directions.html etc
with only one result per domain [main URL by default]:
somepage.com
otherpage.net
completelydifferentpage.org
I tried grouping with Carrot2, but it's not exactly what I'm looking for.
Thanks in advance.