Re: add shard to index

2012-10-15 Thread Radim Kolar
Can you share more please? I do not know exactly what the formula for calculating the ratio is. If you have something like (term count in shard 1 + term count in shard 2) / num documents in all shards, then just use shard size as a weight while computing this: (term count in shard 1 * shard1 keyspace

Re: add shard to index

2012-10-12 Thread Otis Gospodnetic
Hi, Can you share more please? Have you tried this? How well did it work for you? Thanks, Otis -- Search Analytics - http://sematext.com/search-analytics/index.html Performance Monitoring - http://sematext.com/spm/index.html On Fri, Oct 12, 2012 at 7:17 AM, Radim Kolar wrote: > On 11.10.201

Re: add shard to index

2012-10-12 Thread Radim Kolar
On 11.10.2012 1:12, Upayavira wrote: That is what is being discussed already. The thing is, at present, Solr requires an even distribution of documents across shards, so you can't just add another shard, assign it to a hash range, and be done with it. You can use shard size as part of scori

Re: add shard to index

2012-10-10 Thread Upayavira
That is what is being discussed already. The thing is, at present, Solr requires an even distribution of documents across shards, so you can't just add another shard, assign it to a hash range, and be done with it. The reason is down to the scoring mechanism used - TF/IDF (term frequency/inverse d
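To illustrate why per-shard term statistics matter, here is a minimal sketch. It assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1)); the shard sizes and document frequencies are made-up numbers, not measurements from any real index.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Classic Lucene-style IDF: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical figures: one large existing shard and one small, newly added shard.
big_docs, big_df = 1_000_000, 1_000    # term appears in 0.1% of documents
small_docs, small_df = 100, 10         # term appears in 10% of documents

idf_big = classic_idf(big_df, big_docs)        # IDF as seen by the large shard alone
idf_small = classic_idf(small_df, small_docs)  # IDF as seen by the small shard alone
# What a collection-wide (distributed) IDF would give for the same term.
idf_global = classic_idf(big_df + small_df, big_docs + small_docs)

print(f"large shard idf:  {idf_big:.3f}")
print(f"small shard idf:  {idf_small:.3f}")
print(f"global idf:       {idf_global:.3f}")
```

Because each shard scores with its own local value, the same term gets a different weight depending on which shard a document lives in, so scores from unevenly sized shards are not directly comparable when results are merged.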

Re: add shard to index

2012-10-08 Thread Michael Della Bitta
AKA Consistent Hashing: http://en.wikipedia.org/wiki/Consistent_hashing Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Mon, Oct 8, 2012 at 11:33 AM, Radim K

Re: add shard to index

2012-10-08 Thread Radim Kolar
Do it the way it is done in the Cassandra database. Adding a new node and redistributing data can be done on a live system without problems. It looks like this: every Cassandra node has a key range assigned. Instead of assigning keys to nodes like hash(key) mod nodes, every node has its portion of hash k
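The range-based assignment described here can be sketched roughly as follows. This is a simplified illustration, not Cassandra's or Solr's actual code: the `Ring` class, the node names, the token values, and the use of MD5 as the hash are all assumptions made for the example.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a 32-bit token (MD5 is only a stand-in hash here)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Each node owns the range (previous token, its own token]."""

    def __init__(self):
        self._tokens = []   # sorted upper bounds of the ranges
        self._owners = {}   # upper bound -> node name

    def add_node(self, name: str, upper_token: int) -> None:
        # Assumes upper_token is not already present on the ring.
        bisect.insort(self._tokens, upper_token)
        self._owners[upper_token] = name

    def owner(self, key: str) -> str:
        # First range whose upper bound is >= the key's token; wrap around if none.
        i = bisect.bisect_left(self._tokens, token(key))
        if i == len(self._tokens):
            i = 0
        return self._owners[self._tokens[i]]


ring = Ring()
ring.add_node("A", 2**31)
ring.add_node("B", 2**32 - 1)

keys = [f"doc-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

# Adding node C splits A's range; B's range is untouched.
ring.add_node("C", 2**30)
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner, all from A to C")
```

In this sketch only keys that fall into the part of A's range taken over by C change owner, which is why a node can be added without reshuffling every key.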

Re: add shard to index

2012-10-08 Thread Erick Erickson
Right, but even if that worked, you'd then get docs being assigned to the wrong shard. The shard assignment would be something like (hash(id)/3). So a document currently on shard 0 would be indexed next time, perhaps, on shard 2, leaving two "live" docs in your system with the same ID. Bad Things w
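The reassignment problem can be illustrated with a small sketch. It assumes a simple hash-modulo assignment and uses MD5 as a stand-in hash; `shard_for` and `moved_ids` are hypothetical helpers, not Solr functions, and Solr's real routing differs.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Assign a document to a shard by hash(id) mod num_shards (illustrative only)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_ids(doc_ids, old_shards: int, new_shards: int):
    """Return the ids whose shard changes when the shard count changes."""
    return [d for d in doc_ids
            if shard_for(d, old_shards) != shard_for(d, new_shards)]


doc_ids = [f"doc-{i}" for i in range(1000)]
moved = moved_ids(doc_ids, 2, 3)
print(f"{len(moved)} of {len(doc_ids)} ids map to a different shard "
      "after going from 2 to 3 shards")
```

Any id in `moved` that is re-indexed after the change would land on a new shard while its old copy stays on the original one, which is the duplicate-ID situation described above.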

Re: add shard to index

2012-10-08 Thread Rafał Kuć
Hello! Radim, there is a JIRA issue - https://issues.apache.org/jira/browse/SOLR-3755. It is a work in progress, but once finished, Solr will let you add shards to a live collection and split the ones that were already created. -- Regards, Rafał Kuć Sematext :: http://sematext.com

Re: add shard to index

2012-10-08 Thread Upayavira
Given that Solr does not support distributed IDF, adding a shard without balancing the number of documents could seriously skew your scoring. If you are okay with that, then the next question is what happens if you download the clusterstate.json from ZooKeeper, and add another entry, along the line

add shard to index

2012-10-07 Thread Radim Kolar
I am reading the "Re-sizing a Cluster" section of http://wiki.apache.org/solr/SolrCloud. Is it possible to add a shard to an existing index? I do not need the data to be redistributed; it can stay where it is. It is enough for me if new entries are distributed across the new number of shards. Restarting