First of all, yes, you are right: we are trying to optimize querying, but not "just" that. In our company we have reached the limit of the resources (CPU and RAM) that we can allocate to our servers. To return to our example: fieldX=true covers all the documents that were indexed in the last week (like "news"; it could be first_indexed_time:[NOW/DAY-7DAY TO *]), and fieldX=false covers all the documents that were first inserted into the system more than 7 days ago (it could be first_indexed_time:[* TO NOW/DAY-7DAY]). We have also considered two collections (one for "news" and one for "old" items), but then we have a tf/idf problem between the two collections (the "news" collection is very small relative to the "old" collection), since we are using Solr 4 and there is no distributed IDF.
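To make the "news"/"old" split concrete, here is a rough Python sketch of how the NOW/DAY-7DAY boundary behaves. It only approximates Solr's date math on the client side (round down to the start of the UTC day, then subtract 7 days). The function names are made up for illustration; only the field name first_indexed_time and the two range queries come from the example above.

```python
from datetime import datetime, timedelta, timezone

# The two range queries from the example, usable as filter queries.
NEWS_FQ = "first_indexed_time:[NOW/DAY-7DAY TO *]"
OLD_FQ = "first_indexed_time:[* TO NOW/DAY-7DAY]"


def news_cutoff(now):
    """Approximate NOW/DAY-7DAY: start of the current UTC day, minus 7 days."""
    start_of_day = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return start_of_day - timedelta(days=7)


def is_news(first_indexed_time, now):
    """True if the document would fall on the "news" side of the boundary."""
    return first_indexed_time >= news_cutoff(now)
```

Note that both range queries use inclusive brackets, so a document stamped exactly at the boundary would match both; the sketch counts it as "news".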
Second, we have already measured the performance. We ran a naive experiment: we created two collections, a small one (all the new documents) and a big one (all the other documents), plus an alias that unites the two. We saw that this architecture improved performance by 30% (query time and throughput) compared to using a single collection.
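For reference, the alias in that experiment can be created with the Collections API CREATEALIAS action. The sketch below only builds the request URL and does not send anything. The host, the alias name all_docs, and the collection names news and old are placeholders, not our real setup.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API URL that points one alias at several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # CREATEALIAS takes a comma-separated list of collection names.
        "collections": ",".join(collections),
    }
    # safe="," keeps the comma readable instead of encoding it as %2C.
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params, safe=",")


# Placeholder host and names, for illustration only.
url = create_alias_url("http://localhost:8983/solr", "all_docs", ["news", "old"])
```

Queries sent to the alias are then searched across both collections, which is the setup we compared against the single-collection case.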