Hi,

Yes, you can use Solr for this, but index partitioning should be done outside
of Solr.  That is, your app will need to know where to send each doc based on
its timestamp, when and where to create a new index (a new Solr core), and so
on.  Similarly, deleting docs older than N days is up to you, using a delete by
query with a date-based open-ended range query.  The Solr setup itself is the
same as usual, since all the partitioning-related logic lives outside of Solr.
Of course, you could come up with a "Solr Proxy" component that abstracts
some/all of this and pretends to be Solr.
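
To make the above a bit more concrete, here is a rough Python sketch of the
app-side logic.  It is only an illustration under assumptions of mine: one core
per UTC day, a core naming scheme of "events_YYYYMMDD", and a date field called
"timestamp" are all made up for the example, and timestamps are assumed to be
timezone-aware.  Creating the cores and actually posting to Solr is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import List
from xml.sax.saxutils import escape

# Hypothetical naming scheme: one Solr core per UTC day.
CORE_PREFIX = "events_"


def core_for(ts: datetime) -> str:
    """Name of the daily core a doc with (timezone-aware) timestamp ts goes to."""
    return CORE_PREFIX + ts.astimezone(timezone.utc).strftime("%Y%m%d")


def cores_for_range(start: datetime, end: datetime) -> List[str]:
    """Daily cores that may hold docs with start <= timestamp <= end."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    names = []
    while day <= last:
        names.append(CORE_PREFIX + day.strftime("%Y%m%d"))
        day += timedelta(days=1)
    return names


def purge_query(days: int, field: str = "timestamp") -> str:
    """Open-ended range query matching docs older than `days` days.

    The field name is an assumption; use whatever date field your schema has.
    """
    return f"{field}:[* TO NOW-{days}DAYS]"


def delete_by_query_xml(query: str) -> str:
    """Body of a Solr XML update message that deletes by query."""
    return f"<delete><query>{escape(query)}</query></delete>"
```

The app would index each doc into core_for(doc_timestamp), search only the
cores returned by cores_for_range(...) (one by one, or via distributed search
with the shards parameter), and periodically post
delete_by_query_xml(purge_query(180)) followed by a commit to each core's
update handler.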


Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: vivek sar <vivex...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 25, 2009 3:52:11 PM
> Subject: Partition index by time using Solr
> 
> Hi,
> 
>   I've used Lucene before, but I'm new to Solr. I've gone through the
> mailing list, but was unable to find any clear idea on how to partition
> Solr indexes. Here is what we want:
> 
>   1) Be able to partition indexes by timestamp - basically partition
> per day (create a new index directory every day)
> 
>   2) Be able to search partitions based on timestamp. All our queries
> are time based, so instead of looking into all the partitions I want
> to go directly to the partitions where the data might be.
> 
>   3) Be able to purge any data older than 6 months without bringing
> down the application. Since partitions would be marked by timestamp,
> we would just have to delete the old partitions.
> 
> 
>   This is going to be a distributed system with 2 boxes, each running
> an instance of Solr. I don't want to replicate data, but each box may
> have the same timestamp partition with different data. We would be
> indexing an average of 20 million documents (each document = 500 bytes)
> with an estimated 10g in index size - evenly distributed across
> machines
>   (each machine would get roughly 5g of index every day).
> 
>   My questions,
> 
>   1) Is this all possible using Solr? If not, should I just do this
> using Lucene or is there any other out-of-box alternative?
>   2) If it's possible in Solr, how do we do this (configuration, setup, etc.)?
>   3) How would I optimize the partitions - would it be required when using 
> Solr?
> 
>   Thanks,
>   -vivek
