Greg,  what do you mean by 'manually setting the shard on each document'?
 I explicitly push the documents to their respective shard/port numbers.
 Something like:
curl http://localhost:shardport/solr/update --data-binary @file.csv \
  -H 'Content-type:text/csv; charset=ISO-8859-1'

I guess that this is making the routing 'implicit'?

For example, I am pushing 2M rows from CSV files to one shard (about 2GB).
The upload is not finished, but I am seeing all the shards' data directories
at about 14GB. I was expecting only the current shard to grow.
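
If I understand Greg's point, the node that receives a document does not
decide where it is stored: the document id is hashed and the hash picks the
shard. Below is a minimal sketch of that idea, not Solr's actual code. The
names shard_for, route and NUM_SHARDS are made up for illustration, and MD5
is only a stand-in for whatever hash function Solr really uses.

```python
import hashlib

NUM_SHARDS = 4  # assumption: four shards, as in the setup described in this thread


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document id to a shard index by splitting a 32-bit hash space evenly.

    MD5 is a stand-in hash for illustration only, NOT Solr's hash function.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")  # first 32 bits of the digest
    range_size = 2**32 // num_shards       # each shard owns one contiguous range
    return min(h // range_size, num_shards - 1)


def route(doc_ids, num_shards: int = NUM_SHARDS):
    """Count how many of the given ids land on each shard."""
    counts = [0] * num_shards
    for doc_id in doc_ids:
        counts[shard_for(doc_id, num_shards)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical ids shaped like the ones in the log excerpt below.
    ids = [f"14939-96467-{n}" for n in range(1000)]
    print(route(ids))
```

With routing like this, every shard receives a share of the documents no
matter which port the CSV is posted to, which would match what I am seeing.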

Thierry


On Mon, Aug 12, 2013 at 4:59 PM, Greg Preston <gpres...@marinsoftware.com> wrote:

> Are you manually setting the shard on each document?  If not, documents
> will be hashed across all the shards.
>
> -Greg
>
>
> On Mon, Aug 12, 2013 at 3:50 PM, Thierry Thelliez <
> thierry.thelliez.t...@gmail.com> wrote:
>
> > Hello, I am trying to set up a four-shard system for the first time. I do
> > not understand why the data on all the shards grows at about the same rate
> > when I push the documents to only one shard.
> >
> > The four shards represent four calendar years.  And for now, on a
> > development machine, these four shards run on four different ports.
> >
> > The first shard is started with Zookeeper.
> >
> > The logs of the other shards are filled with something like:
> >
> > 7882051 [qtp1154079020-1245] INFO
> > org.apache.solr.update.processor.LogUpdateProcessor – [collection1]
> > webapp=/solr path=/update params={distrib.from=
> > http://x.y.z.4:50121/solr/collection1/&update.distrib=TOLEADER&wt=javabin&version=2
> > }
> > {add=[14939-96467-304 (1443204912169091072), 14939-96467-308
> > (1443204912179576832), 14939-96467-310 (1443204912185868288),
> > 14939-96467-311 (1443204912192159744), 14939-96467-313
> > (1443204912204742656), 14939-96467-314 (1443204912220471296),
> > 14939-96467-318 (1443204912239345664), 14939-96467-319
> > (1443204912250880000), 14939-96467-322 (1443204912257171456),
> > 14939-96467-324 (1443204912263462912)]} 0 282
> >
> > What is getting written to the other shards? Is a separate index computed
> > on all four shards?  I thought that when pushing a document to one shard,
> > only that shard would update its index.
> >
> >
> > Thanks,
> > Thierry
> >
>
