First of all, yes, you are right that we are trying to optimize querying, but
not "just" that. In our company we have reached the limit of the resources
(CPU and RAM) that we can allocate to our servers. To return to our example:
fieldX=true covers all the documents that were first indexed in the last week
(like "news"; it could be first_indexed_time:[NOW/DAY-7DAY TO *]), and
fieldX=false covers all the documents that were first inserted into the
system more than 7 days ago (it could be
first_indexed_time:[* TO NOW/DAY-7DAY]). We have also considered two
collections (one for "news" and one for "old" items), but then we have a
tf/idf problem between the two collections (the "news" collection is very
small relative to the "old" collection), since we are using Solr 4 and it has
no distributed IDF.
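
To make the tf/idf concern concrete, here is a rough sketch of how the same term can get a very different idf in a small and a large collection when each collection computes idf locally. It assumes the classic Lucene formula idf = 1 + ln(numDocs / (docFreq + 1)); the collection sizes and the document frequency are made-up numbers, not our real data.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """idf in the style of Lucene's classic TF-IDF similarity (assumed formula)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical sizes: a small "news" collection and a much larger "old" one.
news_docs = 10_000
old_docs = 10_000_000

# Hypothetical term that occurs in 100 documents of each collection.
idf_news = classic_idf(100, news_docs)
idf_old = classic_idf(100, old_docs)

# Without distributed IDF each collection scores with its own local value,
# so scores for the same term are not directly comparable when merged.
print(f"news idf={idf_news:.2f}, old idf={idf_old:.2f}")
```

With these assumed numbers the gap between the two idf values is ln(1000), which comes only from the difference in collection size.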

Second, we have already measured the performance. We did a naive experiment:
we created two collections, a small one (all the new documents) and a big one
(the other documents), and an alias that unites the two collections. We saw
that this architecture improved performance by 30% (query time and
throughput) compared to using a single collection.
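
For reference, an alias like the one in the experiment can be created with the Collections API CREATEALIAS action. The sketch below only builds the request URL; the host, the alias name "all_docs" and the collection names "news" and "old" are placeholders, not our actual setup.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: list[str]) -> str:
    """Build a Collections API CREATEALIAS URL for the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # Solr expects a comma-separated list of collection names.
        "collections": ",".join(collections),
    }
    # safe="," keeps the comma readable instead of encoding it as %2C.
    return f"{base_url}/admin/collections?{urlencode(params, safe=',')}"


# Placeholder host and names, for illustration only.
url = create_alias_url("http://localhost:8983/solr", "all_docs", ["news", "old"])
print(url)
```

Queries sent to the alias are then run against both collections, which is the setup we compared with the single-collection case.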


