Greg, what do you mean by 'manually setting the shard on each document'? I explicitly push the documents to their respective shard/port numbers, with something like:

    curl http://localhost:shardport/solr/update --data-binary @file.csv -H 'Content-type:text/csv; charset=ISO-8859-1'

I guess that this is making the routing 'implicit'?

For example, I am pushing 2M rows from CSV files (about 2 GB) to one shard. The upload is not finished, but I am already seeing all the shards' data directories at about 14 GB... I was expecting only the current shard to grow.

Thierry

On Mon, Aug 12, 2013 at 4:59 PM, Greg Preston <gpres...@marinsoftware.com> wrote:

> Are you manually setting the shard on each document? If not, documents
> will be hashed across all the shards.
>
> -Greg
>
>
> On Mon, Aug 12, 2013 at 3:50 PM, Thierry Thelliez <
> thierry.thelliez.t...@gmail.com> wrote:
>
> > Hello, I am trying to set up a four-shard system for the first time. I do
> > not understand why the data on all the shards is growing at about the same
> > rate when I push the documents to only one shard.
> >
> > The four shards represent four calendar years. And for now, on a
> > development machine, these four shards run on four different ports.
> >
> > The first shard is started with Zookeeper.
> >
> > The logs of the other shards are filled with something like:
> >
> > 7882051 [qtp1154079020-1245] INFO
> > org.apache.solr.update.processor.LogUpdateProcessor – [collection1]
> > webapp=/solr path=/update
> > params={distrib.from=http://x.y.z.4:50121/solr/collection1/&update.distrib=TOLEADER&wt=javabin&version=2}
> > {add=[14939-96467-304 (1443204912169091072), 14939-96467-308
> > (1443204912179576832), 14939-96467-310 (1443204912185868288),
> > 14939-96467-311 (1443204912192159744), 14939-96467-313
> > (1443204912204742656), 14939-96467-314 (1443204912220471296),
> > 14939-96467-318 (1443204912239345664), 14939-96467-319
> > (1443204912250880000), 14939-96467-322 (1443204912257171456),
> > 14939-96467-324 (1443204912263462912)]} 0 282
> >
> > What is getting written to the other shards? Is a separate index computed
> > on all four shards? I thought that when pushing a document to one shard,
> > only that shard would update its index.
> >
> >
> > Thanks,
> > Thierry
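
A rough sketch of the behaviour Greg describes, for illustration only. It assumes hash-based routing: the node that receives an update hashes each document id and forwards the document to whichever shard owns that slice of the hash range (the update.distrib=TOLEADER entries in the log above are consistent with such forwarding). The hash function here (zlib.crc32) is a stand-in and not the one Solr uses, and shard_for is a hypothetical helper, not a Solr API. The ids are taken from the log above.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4          # four shards, as in the setup described above
HASH_SPACE = 2 ** 32    # size of a 32-bit hash range


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document id to a shard index by splitting the hash range evenly.

    Assumption: crc32 stands in for the real routing hash.
    """
    h = zlib.crc32(doc_id.encode("utf-8"))  # unsigned 32-bit value in Python 3
    return h * num_shards // HASH_SPACE


# Document ids copied from the log excerpt in this thread.
ids = [
    "14939-96467-304", "14939-96467-308", "14939-96467-310",
    "14939-96467-311", "14939-96467-313", "14939-96467-314",
    "14939-96467-318", "14939-96467-319", "14939-96467-322",
    "14939-96467-324",
]

# Count how many documents each shard would receive, regardless of
# which port the client originally posted them to.
counts = Counter(shard_for(doc_id) for doc_id in ids)

for shard in range(NUM_SHARDS):
    print(f"shard {shard}: {counts.get(shard, 0)} docs")
```

Under this assumption, the port the CSV is posted to does not decide where a document ends up; only its id does, which would explain every shard growing at a similar rate.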