First of all, what evidence do you have that you even need to shard?
12 M documents is quite a small index by Solr standards; just test it
and see.
As for replication, 10 minutes is probably a good place to start, but
you can experiment with reducing it. I've often found that "real time" is
usu
Appreciate your reply. I have some more follow-up questions inline.
On Thu, Feb 2, 2012 at 12:35 AM, Emmanuel Espina
wrote:
>> 1. Adds : 20 docs/sec
>> 2. Searches : 100 searches/sec
>> 3. Deletes : (20*3600*24*7 ~ 12 mill ) docs/week ( basically a cron
>> job which deletes all documents more than
In addition to what Emmanuel mentioned, why not consider 7 shards? If
you use one shard/day, your delete problem becomes really easy:
just nuke the oldest shard
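
To make the one-shard-per-day idea concrete, here is a rough sketch of how documents might be routed to a daily slot. It is only an illustration: the names NUM_SHARDS, slot_for_day and slots_in_window are made up for this example, and nothing here talks to Solr. It assumes 7 slots reused round-robin, so the slot that receives today's documents is the same one that held the data from 7 days ago and is the one to wipe first.

```python
from datetime import date, timedelta

# Assumption: one slot (shard) per day of retention, reused in a weekly cycle.
NUM_SHARDS = 7


def slot_for_day(day: date) -> int:
    """Map a calendar day to a slot index in the range 0..NUM_SHARDS-1."""
    return day.toordinal() % NUM_SHARDS


def slots_in_window(today: date) -> list[int]:
    """Slots holding the last NUM_SHARDS days of data, newest first."""
    return [slot_for_day(today - timedelta(days=i)) for i in range(NUM_SHARDS)]


if __name__ == "__main__":
    today = date(2012, 2, 2)
    # Today's slot is the one that held data from exactly NUM_SHARDS days ago,
    # so it has to be cleared before today's documents are indexed into it.
    print("clear and reuse slot", slot_for_day(today))
    print("search across slots", slots_in_window(today))
```

With this layout the weekly cleanup is a single whole-shard wipe per day rather than a delete-by-query over millions of documents.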
Although beware that this solution may affect your TF/IDF calculations
on the new shard (i.e. the one you use for *today's* data) un
2012/2/1 prasenjit mukherjee :
> I have the following requirements :
>
> 1. Adds : 20 docs/sec
> 2. Searches : 100 searches/sec
> 3. Deletes : (20*3600*24*7 ~ 12 mill ) docs/week ( basically a cron
> job which deletes all documents more than 7 days old )
>
> I am thinking of having 6 shards ( with
I have the following requirements :
1. Adds : 20 docs/sec
2. Searches : 100 searches/sec
3. Deletes : (20*3600*24*7 ~ 12 mill ) docs/week ( basically a cron
job which deletes all documents more than 7 days old )
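
For reference, the numbers above and the weekly cleanup can be sketched as follows. This is a minimal illustration, not a tested setup: the field name "timestamp" is an assumption (use whatever date field your schema defines), and delete_older_than is a hypothetical helper that only builds the XML delete-by-query body you would POST to the update handler.

```python
ADDS_PER_SEC = 20
RETENTION_DAYS = 7

# Documents indexed per day and held in the 7-day window.
docs_per_day = ADDS_PER_SEC * 3600 * 24          # 1,728,000
docs_in_window = docs_per_day * RETENTION_DAYS   # 12,096,000, i.e. ~12 million


def delete_older_than(days: int, field: str = "timestamp") -> str:
    """Build a Solr XML delete-by-query body using date math.

    'field' is an assumed name for the document's date field.
    """
    return f"<delete><query>{field}:[* TO NOW-{days}DAYS]</query></delete>"


if __name__ == "__main__":
    print(docs_per_day, docs_in_window)
    print(delete_older_than(RETENTION_DAYS))
```

The cron job would send that body to the update handler and then issue a commit.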
I am thinking of having 6 shards ( with each having 2 million docs )
with 1 master an