I need to store 100 million documents in our Solr instance and be able to retrieve them with simple term queries - keyword matches. I'm NOT implementing a search application where documents are scored and ranked...they either match the keywords or not. Also, I have an external ranking system that I need to use to filter and order the search results.
My requirements are for very fast and reliable retrieval, so I'm trying to find a place to hook into or customize Solr/Lucene so that it does just the simplest thing, reliably and fast.

1. A naive approach would be to implement a handler, let the query run normally, then perform N lookups against my external scoring system, then filter and sort the documents. It seems I may be doing a lot of extra work that way, especially when paging through results, and who knows what I'd be doing to the cache.
2. Create a custom FieldType that is virtual and calls out to my external system? Then queries could be written to return all docs > my rank.
3. Implement custom Query, Weight, Scorer (et al.) implementations to minimize the "Search Stuff" and just delegate calls to my external ranking system.
4. A filter of some kind?

I'd love to get a sanity check on any of these approaches, or some recommendations.

Thanks,
Jim
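
To make option 1 concrete, here is a rough, language-neutral sketch of the post-query step, written in Python rather than as an actual Solr handler. Everything in it is an assumption for illustration: `rank_filter_page`, `fetch_ranks`, `min_rank`, `offset` and `rows` are hypothetical names, and the external ranking system is stood in for by a plain dictionary. It only shows the filter, sort and page logic, not any Solr or Lucene API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def rank_filter_page(
    matched_ids: Iterable[str],
    fetch_ranks: Callable[[List[str]], Dict[str, float]],
    min_rank: float,
    offset: int,
    rows: int,
) -> List[Tuple[str, float]]:
    """Filter and order keyword matches by an external rank, then page.

    matched_ids: IDs of documents that matched the keyword query.
    fetch_ranks: hypothetical client for the external ranking system;
                 takes a list of IDs and returns {id: rank} for known IDs.
    """
    ids = list(matched_ids)
    # One batched call instead of N separate lookups (an assumption: the
    # external system may or may not support batching).
    ranks = fetch_ranks(ids)
    # Drop documents with no rank or a rank below the threshold.
    kept = [(doc_id, ranks[doc_id]) for doc_id in ids
            if doc_id in ranks and ranks[doc_id] >= min_rank]
    # Highest rank first; ties broken by ID so paging is stable.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[offset:offset + rows]


if __name__ == "__main__":
    # Stand-in for the external ranking system.
    fake_ranks = {"a": 0.9, "b": 0.2, "c": 0.5, "d": 0.7}

    def fetch(ids: List[str]) -> Dict[str, float]:
        return {i: fake_ranks[i] for i in ids if i in fake_ranks}

    print(rank_filter_page(["a", "b", "c", "d", "e"], fetch, 0.5, 0, 2))
```

Note that every page request has to rank and sort the full set of matches before slicing, which is exactly the extra work with paging that makes option 1 a concern.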