Re: What kind of nutch documents does Solr index?

2015-09-30 Thread NutchDev
After Nutch fetches documents from the server, it passes them to the
parser. The parser detects each document's type and parses it
accordingly.

One possibility is that the parser failed to parse some documents, which
would explain the count mismatch you are seeing.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/What-kind-of-nutch-documents-does-Solr-index-tp4231646p4232034.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exclude documents having same data in two fields

2015-10-07 Thread NutchDev
One option is to create another boolean field, field1_equals_field2, and
set it to true at index time for documents whose two fields hold the same
data. Then use this field as a filter criterion when querying Solr.
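
As a rough illustration of the index-time idea, here is a minimal Python
sketch that computes the flag before the documents are sent to Solr. The
field names field1 and field2, the sample documents, and the helper name
add_equality_flag are made up for the example, and treating a document
with a missing field as "not equal" is an assumption.

```python
def add_equality_flag(doc, field_a="field1", field_b="field2"):
    """Return a copy of doc with a boolean field1_equals_field2 flag."""
    flagged = dict(doc)
    a = doc.get(field_a)
    b = doc.get(field_b)
    # Assumption: a document missing either field counts as "not equal".
    flagged["field1_equals_field2"] = a is not None and a == b
    return flagged


# Hypothetical sample documents, prepared before indexing.
docs = [
    {"id": "1", "field1": "solr", "field2": "solr"},
    {"id": "2", "field1": "solr", "field2": "nutch"},
    {"id": "3", "field1": "solr"},
]
flagged_docs = [add_equality_flag(d) for d in docs]

# Filter query to add at search time to drop documents with equal fields.
exclude_equal_fq = "field1_equals_field2:false"
```

The flag has to be declared as a boolean field in the schema, and existing
documents need to be reindexed before the filter is useful.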



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Exclude-documents-having-same-data-in-two-fields-tp4233408p4233411.html


Re: Fuzzy search for names and phrases

2015-10-07 Thread NutchDev
WordDelimiterFilterFactory can handle cases like,

wi-fi ==> wifi
SD500 ==> sd 500
PowerShot ==> Power Shot

You can find more information on the wiki page here:
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
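
To make the three cases above concrete, here is a small Python imitation
of the splitting behaviour. It is not Solr's implementation, only a sketch
of the idea: split on non-alphanumeric characters, on case changes, and on
letter/digit boundaries. The function names are invented for the example.
In Solr the joined form (wifi) depends on options such as catenateWords,
and the lowercasing comes from a separate lowercase filter in the analyzer
chain.

```python
import re


def split_word_parts(token):
    """Split on non-alphanumerics, case changes, and letter/digit boundaries."""
    parts = []
    for chunk in re.split(r"[^A-Za-z0-9]+", token):
        # Capitalized or lowercase word, run of capitals, or run of digits.
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+", chunk))
    return parts


def index_forms(token, catenate=False):
    """Lowercase the parts and optionally join them into a single term."""
    parts = [p.lower() for p in split_word_parts(token)]
    return ["".join(parts)] if catenate else parts


wifi = index_forms("wi-fi", catenate=True)
sd500 = index_forms("SD500")
powershot = index_forms("PowerShot")
```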



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Fuzzy-search-for-names-and-phrases-tp4233209p4233413.html


Re: Exclude documents having same data in two fields

2015-10-08 Thread NutchDev
Hi Aman,

Have a look at these; they also describe a query-time approach using a
Solr function query:

http://stackoverflow.com/questions/15927893/how-to-check-equality-of-two-solr-fields
http://stackoverflow.com/questions/16258605/query-for-document-that-two-fields-are-equal
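
A query-time filter along those lines could be built as in the Python
sketch below. It is only a sketch under assumptions: field1 and field2 are
placeholder names, both fields are assumed to be single-valued numeric
fields, and the helper name is made up. It uses the frange parser over
abs(sub(a,b)) and keeps only documents where the difference is strictly
greater than zero. String fields would need a different function.

```python
from urllib.parse import urlencode


def exclude_equal_numeric_fq(field_a, field_b):
    """Build an fq that keeps only documents whose two numeric fields differ."""
    # l=0 with incl=false means "function value strictly greater than 0".
    return "{!frange l=0 incl=false}abs(sub(%s,%s))" % (field_a, field_b)


# Placeholder field names; q=*:* simply matches everything before filtering.
params = {"q": "*:*", "fq": exclude_equal_numeric_fq("field1", "field2")}
query_string = urlencode(params)
```

The resulting query_string can be appended to the select URL of the core.
Note that a function query like this is evaluated for every candidate
document, so the index-time flag is usually cheaper on a large index.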



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Exclude-documents-having-same-data-in-two-fields-tp4233408p4233489.html


Re: How to show some documents ahead of others

2015-10-08 Thread NutchDev
Hi Christian,

You can take a look at Solr's QueryElevationComponent.

It lets you configure the top results for a given query regardless of the
normal Lucene scoring. You can also specify a list of documents to exclude
from the results for a particular query.
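
For illustration, the component reads its rules from an elevation file
(commonly named elevate.xml) with one query element per search text and
one doc element per document id. The Python sketch below generates such a
file body. The query text "ipod", the document ids, and the helper name
are invented for the example, and the layout reflects my understanding of
the file format, so check it against the documentation for your Solr
version.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(rules):
    """rules maps query text -> (ids to elevate, ids to exclude)."""
    root = ET.Element("elevate")
    for query_text, (elevated_ids, excluded_ids) in rules.items():
        query = ET.SubElement(root, "query", {"text": query_text})
        # Elevated documents appear at the top, in the order listed.
        for doc_id in elevated_ids:
            ET.SubElement(query, "doc", {"id": doc_id})
        # Excluded documents are removed from the results for this query.
        for doc_id in excluded_ids:
            ET.SubElement(query, "doc", {"id": doc_id, "exclude": "true"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical rule: for "ipod", pin doc1 and doc2, and hide doc9.
xml_text = build_elevate_xml({"ipod": (["doc1", "doc2"], ["doc9"])})
```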





--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-show-some-documents-ahead-of-others-tp4233481p4233490.html