We actually have a system that uses weekly shards, but it is all .NET 
(Lucene.NET) and needs a lot of code to manage adding new indexes.  We want 
to move to Solr for performance and maintenance reasons.  

So if we use some sort of weekly or daily sharding, there needs to be a 
mechanism in place to dynamically add a new shard when the current one fills 
up (ideally one that also knows which server each new shard should go on). 
Since Solr does not implement that, I was thinking of just using a static 
set of shards.  
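
For example, with a static set of weekly shards, queries can fan out over 
all of them using Solr's distributed search (the shards parameter). A rough 
sketch in Python, with invented host and core names:

    # Query a fixed set of 52 weekly shards via Solr distributed search.
    # Host and core names here are made up for illustration.
    import urllib.parse
    import urllib.request

    SHARDS = ",".join(
        "solr%d.example.com:8983/solr/week-%02d" % (i % 4 + 1, i)
        for i in range(52))

    params = urllib.parse.urlencode(
        {"q": "body:test", "rows": 10, "shards": SHARDS})
    url = "http://solr1.example.com:8983/solr/week-00/select?" + params
    print(urllib.request.urlopen(url).read())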


On Dec 16, 2011, at 10:54 AM, Otis Gospodnetic wrote:

> Hi,
> 
> We've done a fair number of such things over the years. :)
> If daily shards don't work for you, why not weekly or monthly?
> Have a look at Zoie's Hourglass concept/code.
> Some Solr alternatives are currently better suited to handle this sort of 
> setup...
> 
> Otis 
> ----
> Performance Monitoring SaaS for Solr - 
> http://sematext.com/spm/solr-performance-monitoring/index.html
> 
> 
> 
> ----- Original Message -----
>> From: Robert Stewart <bstewart...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Cc: 
>> Sent: Thursday, December 15, 2011 12:55 PM
>> Subject: Re: how to setup to archive expired documents?
>> 
>> I think managing 100 cores will be too much of a headache.  Query
>> performance across 100 cores will also suffer (each core must return
>> page_number*page_size results, which then have to be merged).
>> 
>> I am thinking of around 10 Solr instances, each holding about 10M docs.
>> Always search all 10 nodes.  Index using some hash(doc) to distribute
>> new docs among the nodes, and run a nightly/weekly job to delete old
>> docs and force merge (optimize) down to some min/max number of segments.
>> I think that will work OK, but I am not sure how to handle
>> replication/failover so that each node is redundant.  If we use Solr
>> replication, it will have problems replicating after an optimize on
>> large indexes: moving a 10M-doc index (around 100GB in our case) from
>> master to slave seems to take a long time.  Doing it once per week is
>> probably OK.
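>> 
>> Roughly what I have in mind for routing docs (untested sketch, node
>> names invented):
>> 
>>     # Pick one of 10 fixed Solr nodes by hashing the doc id, so the
>>     # same doc always goes to (and is deleted from) the same node.
>>     import hashlib
>> 
>>     NODES = ["http://solr%02d:8983/solr/docs" % i for i in range(10)]
>> 
>>     def node_for(doc_id):
>>         h = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
>>         return NODES[h % len(NODES)]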
>> 
>> 
>> 
>> 2011/12/15 Avni, Itamar <itamar.a...@verint.com>:
>>> What about managing a core for each day?
>>> 
>>> This way the deletion/archive is very simple. No "holes" in the
>>> index (which often happens when deleting document by document).
>>> Indexing is done against core [today-0].
>>> Querying is done against cores [today-0],[today-1]...[today-99]. Quite a
>>> headache.
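>>> 
>>> The query side might look something like this (sketch only; the core
>>> naming and single host are invented):
>>> 
>>>     # Build the shards list covering the last 100 daily cores,
>>>     # assuming cores named core-YYYYMMDD on a single host.
>>>     import datetime
>>> 
>>>     today = datetime.date.today()
>>>     shards = ",".join(
>>>         "localhost:8983/solr/core-%s" %
>>>         (today - datetime.timedelta(days=n)).strftime("%Y%m%d")
>>>         for n in range(100))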
>>> 
>>> Itamar
>>> 
>>> -----Original Message-----
>>> From: Robert Stewart [mailto:bstewart...@gmail.com]
>>> Sent: Thursday, 15 December 2011 16:54
>>> To: solr-user@lucene.apache.org
>>> Subject: how to setup to archive expired documents?
>>> 
>>> We have a large (100M doc) index to which we add about 1M new docs per
>>> day.  We want to keep the index at a constant size, so the oldest docs
>>> are removed and/or archived each day (the index then contains around
>>> 100 days of data).  What is the best way to do this?  We still want to
>>> keep older data in some archive index, not just delete it (so is it
>>> possible to export older segments, etc. into some other index?).  If we
>>> have a daily job to delete old data, I assume we'd need to optimize the
>>> index to actually remove the docs and free space, but that requires a
>>> very large (and slow) replication after the optimize, which will
>>> probably not work out well for so large an index.  Is there some way to
>>> shard the data, or some other best practice?
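>>> 
>>> For the daily job, something like this is what I had in mind (rough
>>> sketch; the timestamp field and URL are made up):
>>> 
>>>     # Delete docs older than 100 days, then commit and force-merge.
>>>     import urllib.request
>>> 
>>>     URL = "http://localhost:8983/solr/update"
>>> 
>>>     def post(xml):
>>>         req = urllib.request.Request(
>>>             URL, data=xml, headers={"Content-Type": "text/xml"})
>>>         urllib.request.urlopen(req)
>>> 
>>>     post(b"<delete><query>timestamp:[* TO NOW-100DAYS]</query></delete>")
>>>     post(b"<commit/>")
>>>     post(b"<optimize maxSegments='2'/>")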
>>> 
>>> Thanks
>>> Bob
