I'm interested in using the new custom sharding features in the collections API to search a rolling window of event data. I'd appreciate a spot/sanity check of my plan and understanding.
Say I only care about the last 7 days of events, and I receive thousands per second (billions per week). Am I correct that I could create a new shard for each hour and index events that happen in that hour with an ID (uniqueKey) of `new_event_hour!event_id`, so that each hour block of events goes into one shard?

I *always* query these events by the time at which they occurred, which is a separate TrieInt field that I index with every document. So at query time I would calculate the range the user cared about and send something like `_route_=hour1&_route_=hour2` if I wanted to query only those two shards. (I *can* set multiple `_route_` arguments in one query, right? And Solr will handle merging results as it would with any other cores?) A scheduled task would then drop and delete shards once they were more than 7 days old.

Does all of that make sense? Do you see a smarter way to do large "time-oriented" search in SolrCloud? Thanks!
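To make the query-time step concrete, here's a small sketch of the "calculate the range the user cared about" part. The `hour_YYYYMMDD_HH` shard-naming scheme is my own invention for illustration, not anything Solr mandates:

```python
from datetime import datetime, timedelta

def hour_shard(ts: datetime) -> str:
    """Name of the hourly shard a timestamp falls in (assumed naming scheme)."""
    return ts.strftime("hour_%Y%m%d_%H")

def route_values(start: datetime, end: datetime) -> list[str]:
    """All hour-bucket shard names covering [start, end], to be sent as _route_ values."""
    cur = start.replace(minute=0, second=0, microsecond=0)  # floor to the hour
    shards = []
    while cur <= end:
        shards.append(hour_shard(cur))
        cur += timedelta(hours=1)
    return shards

# Example: a query window of 09:30-11:00 touches three hour buckets
print(route_values(datetime(2014, 1, 6, 9, 30), datetime(2014, 1, 6, 11, 0)))
# ['hour_20140106_09', 'hour_20140106_10', 'hour_20140106_11']
```

The resulting list would then be joined into the query string, however Solr expects multiple routes to be expressed (repeated `_route_` params or a comma-separated value; that's part of what I'm asking).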
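For reference, here's roughly how I picture the shard lifecycle via the Collections API. This is only a sketch: the host, collection name `events`, and hourly shard names are my assumptions, and it presumes the collection was created with `router.name=implicit`, which as I understand it is what allows shards to be added and removed by name:

```shell
# Ahead of each hour, create the shard that will receive its events
curl "http://localhost:8983/solr/admin/collections?action=CREATESHARD&collection=events&shard=hour_20140106_12"

# At query time, restrict the search to the hour buckets in the user's range
curl "http://localhost:8983/solr/events/select?q=*:*&_route_=hour_20140106_09&_route_=hour_20140106_10"

# Scheduled cleanup: drop a shard once it is more than 7 days old
curl "http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=events&shard=hour_20131230_09"
```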