Can you tell whether it is the "index" folder itself that is that large, or does the figure also include the "tlog" (transaction log) folder? If the transaction log is huge, you need to send hard commits more often during indexing so that the tlogs get flushed.
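
One way to check this is to measure the two folders separately. The following is only a rough sketch: the function name `report_sizes` is made up for illustration, and it assumes the usual layout where a core's data directory contains `index` and `tlog` subfolders. Your actual data directory path will differ.

```shell
#!/bin/sh
# Hypothetical helper: print the on-disk size (in KB) of the "index" and
# "tlog" subfolders of one core's data directory.
# Argument 1 is the data directory (an assumption; adjust to your install).
report_sizes() {
    data_dir="$1"
    for sub in index tlog; do
        if [ -d "$data_dir/$sub" ]; then
            # du -sk prints "<kilobytes><TAB><path>"; keep only the number
            size_kb=$(du -sk "$data_dir/$sub" | cut -f1)
            echo "$sub ${size_kb}KB"
        else
            echo "$sub missing"
        fi
    done
}

# Example usage (the path is a placeholder, not taken from your setup):
#   report_sizes /path/to/solr/mycollection/data
#
# If tlog turns out to be the large one, an explicit hard commit can be sent,
# for example (host and collection name taken from the quoted command below):
#   curl 'http://slave:8983/solr/mycollection/update?commit=true'
```

If the tlog is indeed the culprit, configuring autoCommit in solrconfig.xml (a maxTime with openSearcher set to false) is probably a better long-term fix than sending commits by hand, but the right interval depends on your indexing load.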

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 4 March 2013 at 04:16, alx...@aim.com wrote:

> Hello,
>
> I had a non-cloud collection with an index size of around 80G for 15M
> documents with solr-4.1.0. So I decided to use SolrCloud with two shards
> and sent Solr the following command:
>
> curl 'http://slave:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=1'
>
> I tried to set replicationFactor=0, but that command gave an error. After
> reindexing onto two separate Linux boxes, with one instance of Solr running
> on each of them, I see that the size of the index in each shard is 90GB
> versus the expected 40GB, although each of the shards has half (7.5M) of
> the documents.
>
> Any ideas what went wrong?
>
> Thanks.
> Alex.