I need to store 100 million documents in our Solr instance and be able to
retrieve them with simple term queries - keyword matches.  I'm NOT
implementing a search application where documents are scored and
ranked...they either match the keywords or not.  Also, I have an external
ranking system that I need to use to filter and order the search results.

I need retrieval to be very fast and reliable, so I'm trying to figure out
where to hook into or customize Solr/Lucene so that it does just the
simplest thing, reliably and fast.

1. A naive approach would be to implement a handler, let the query run
normally, then perform N lookups against my external scoring system, and
finally filter and sort the documents.  It seems I may be doing a lot of
extra work that way, especially when paging results, and who knows what
I'd be doing to the cache.

2. Create a custom FieldType that is virtual and calls out to my external
system?  Then queries could be written to return all docs whose rank is
above a given value.

3. Implement custom Query, Weight, Scorer (et al) implementations to
minimize the "Search Stuff" and just delegate calls to my external ranking
system.

4.  A filter of some kind?
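
To make approach 1 concrete, here is a minimal sketch in plain Java of the
filter-and-sort step.  It is not Solr or Lucene API code: the class name,
the rankPage method, and the rankLookup function (a stand-in for one
batched call to the external ranking system) are all hypothetical, and it
assumes the handler has already collected the matching doc ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ExternalRankPostFilter {

    /**
     * Keeps only the matched docs whose external rank is at least minRank,
     * orders them best rank first, and returns one page of doc ids.
     * rankLookup is a hypothetical batched call to the external system.
     */
    public static List<Integer> rankPage(
            List<Integer> matchedDocIds,
            Function<List<Integer>, Map<Integer, Double>> rankLookup,
            double minRank, int offset, int pageSize) {

        // One batched round trip rather than N individual lookups.
        Map<Integer, Double> ranks = rankLookup.apply(matchedDocIds);

        // Drop docs the external system does not know or ranks too low.
        List<Integer> kept = new ArrayList<>();
        for (Integer id : matchedDocIds) {
            Double rank = ranks.get(id);
            if (rank != null && rank >= minRank) {
                kept.add(id);
            }
        }

        // Highest rank first; ties broken by doc id so paging is stable.
        kept.sort((a, b) -> {
            int byRank = Double.compare(ranks.get(b), ranks.get(a));
            return byRank != 0 ? byRank : Integer.compare(a, b);
        });

        // Clamp the requested window to what is actually available.
        int from = Math.min(Math.max(offset, 0), kept.size());
        int to = Math.min(from + Math.max(pageSize, 0), kept.size());
        return new ArrayList<>(kept.subList(from, to));
    }

    public static void main(String[] args) {
        // Made-up ranks standing in for the external ranking system.
        Map<Integer, Double> fakeRanks = Map.of(1, 0.4, 2, 0.1, 3, 0.9, 4, 0.7);
        Function<List<Integer>, Map<Integer, Double>> lookup = ids -> fakeRanks;

        List<Integer> page = rankPage(List.of(1, 2, 3, 4), lookup, 0.3, 0, 2);
        System.out.println(page); // prints [3, 4]
    }
}
```

Even with the lookups batched, every page request has to rank the full
match set before it can return anything, which is the extra work I'm
worried about with this approach.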


I'd love to get a sanity check on any of these approaches or some
recommendations.

Thanks

Jim
