Why do you want to control what gets indexed into a core and then
have to know which core to search? That is exactly the kind of "knowing"
that SolrCloud takes off your hands. SolrCloud handles the distribution of
documents across shards and retrieves them regardless of which node
receives the search. That is the point of "cloud": you don't know the
details of exactly where documents are being managed (i.e. they are
cloudy), and that placement can change and re-balance from time to time.
SolrCloud performs the distributed search for you, so when you search a
node/core that holds no documents, the results from the whole "cloud" are
still retrieved. This is considered "A Good Thing".
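
To make the search side concrete, here is a small sketch that only builds
the two kinds of select URL. The host, port and core name come from the
quoted message below; the helper name solr_select_url is made up for
illustration. It assumes your Solr version honours the distrib=false
request parameter for turning off distributed search on a single request,
so check that against your build before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): print a select URL for one core.
#   $1 = core name
#   $2 = optional extra query parameters, e.g. "&distrib=false"
solr_select_url() {
  core=$1
  extra=$2
  # %% in the printf format emits a literal %, giving q=*%3A* (i.e. q=*:*)
  printf 'http://localhost:8983/solr/%s/select?q=*%%3A*&wt=xml%s\n' "$core" "$extra"
}

# Default in SolrCloud: the request is distributed, so any core answers
# for the whole collection.
solr_select_url coreForCustomer2 ""

# Assumption: distrib=false asks the core to answer from its own index only.
solr_select_url coreForCustomer2 "&distrib=false"

# To run a query against a live instance (needs curl and a running Solr):
# curl "$(solr_select_url coreForCustomer2 '&distrib=false')"
```

The second URL is mostly useful for debugging what a single core holds;
for normal use the first form is the one you want.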

It requires a change in thinking about indexing and searching....
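
On the indexing side, the idea is that the shard is derived from the
document id, not from the URL you happen to post to. The sketch below is
only an illustration of that idea: it uses cksum and a modulo as stand-ins,
which is NOT SolrCloud's actual hashing or shard-assignment scheme, and the
function name and document ids are made up.

```shell
#!/bin/sh
# Illustration only: route a document id to one of N shards by hashing it.
# cksum + modulo are stand-ins here, not what SolrCloud really does.
#   $1 = document id
#   $2 = number of shards
shard_for() {
  doc_id=$1
  num_shards=$2
  # cksum prints "<crc> <byte count>" for stdin; keep only the CRC.
  hash=$(printf '%s' "$doc_id" | cksum | cut -d' ' -f1)
  echo "shard$(( hash % num_shards + 1 ))"
}

# The same id always lands on the same shard, whichever node receives it.
shard_for doc-a 2
shard_for doc-b 2
```

Every core that belongs to the chosen shard then holds a copy of the
document, which is why posting to one core can fill several others.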

On Tue, 2012-05-22 at 08:43 +0800, Yandong Yao wrote:
> Hi Guys,
> 
> I use the following commands to start SolrCloud, as described in the SolrCloud wiki.
> 
> yydzero:example bjcoe$ java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
> yydzero:example2 bjcoe$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> 
> Then I created several cores using the CoreAdmin API
> (http://localhost:8983/solr/admin/cores?action=CREATE&name=<coreName>&collection=collection1),
> and clusterstate.json shows the following topology:
> 
> 
> collection1:
>     -- shard1:
>           -- collection1
>           -- CoreForCustomer1
>           -- CoreForCustomer3
>           -- CoreForCustomer5
>     -- shard2:
>           -- collection1
>           -- CoreForCustomer2
>           -- CoreForCustomer4
> 
> 
> 1) Index:
> 
> I used the following command to index the mem.xml file in the exampledocs directory.
> 
> yydzero:exampledocs bjcoe$ java -Durl=http://localhost:8983/solr/coreForCustomer3/update -jar post.jar mem.xml
> SimplePostTool: version 1.4
> SimplePostTool: POSTing files to http://localhost:8983/solr/coreForCustomer3/update..
> SimplePostTool: POSTing file mem.xml
> SimplePostTool: COMMITting Solr index changes.
> 
> The Solr Admin UI now shows that 'coreForCustomer1', 'coreForCustomer3' and
> 'coreForCustomer5' each have 3 documents (mem.xml has 3 documents) and the
> other 2 cores have 0 documents.
> 
> *Question 1:*  Is this expected behavior? How do I index documents into
> a specific core?
> 
> *Question 2*:  If SolrCloud doesn't support this yet, how could I extend it
> to support this feature (indexing a document into a particular core)? Where
> should I start, with the hashing algorithm?
> 
> *Question 3*:  Why are the documents also indexed into 'coreForCustomer1'
> and 'coreForCustomer5'?  The default number of replicas is 1, right?
> 
> Then I tried to index some documents into 'coreForCustomer2':
> 
> $ java -Durl=http://localhost:8983/solr/coreForCustomer2/update -jar post.jar ipod_video.xml
> 
> But 'coreForCustomer2' still has 0 documents, and the documents in
> ipod_video.xml are indexed into the cores for customers 1/3/5.
> 
> *Question 4*:  Why does this happen?
> 
> 2) Search: I use
> "http://localhost:8983/solr/coreForCustomer2/select?q=*%3A*&wt=xml"
> to search against 'CoreForCustomer2', but it returns all documents in
> the whole collection even though this core has no documents at all.
> 
> Then I use
> "http://localhost:8983/solr/coreForCustomer2/select?q=*%3A*&wt=xml&shards=localhost:8983/solr/coreForCustomer2",
> and it returns 0 documents.
> 
> *Question 5*: So if we want to search against a particular core, we need to
> use the 'shards' parameter with the Solr core name as the parameter value, right?
> 
> 
> Thanks very much in advance!
> 
> Regards,
> Yandong

