Hi all,

I wanted to make the simplest possible repro of this issue, which I now think might be related to the decision made in https://issues.apache.org/jira/browse/SOLR-3080. Is this the expected behaviour?

1. Download SOLR 4 production and extract it.

2. Replace apache-solr-4.0.0/example/solr/solr.xml with:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    <core shard="shard1" instanceDir="collection1/" name="collection1" collection="polecat"/>
    <core shard="shard1" instanceDir="collection2/" name="collection2" collection="polecat"/>
    <core schema="schema.xml" shard="core3" instanceDir="core3/" name="core3" config="solrconfig.xml" collection="polecat" dataDir="data"/>
  </cores>
</solr>

3. Start solr with:

java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -Dsolrcloud.skip.autorecovery=true -jar start.jar

(skip.autorecovery is used because the shards don't exist previously)

Then run these, in order:

Sanity query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true
Remove the core: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core3&deleteIndex=true
Error query: http://localhost:8983/solr/polecat/select?q=*%3A*&wt=xml&distrib=true

The sanity query returns 0 records; the error query returns "no servers hosting shard:". And in clusterstate.json, the unloaded shard is still listed:

"core3":{"replicas":{}}}}

Regards,
Gilles

-----Original Message-----
From: Gilles Comeau [mailto:gilles.com...@polecat.co]
Sent: 13 November 2012 16:39
To: solr-user@lucene.apache.org; markrmil...@gmail.com
Subject: RE: Removing Shards from Zookeeper - no servers hosting shard

Sorry, I forgot that pictures are no good on the list. Here is the same information from clusterstate.json; the shard for the core I unloaded sticks around:

"solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}

Do I need a special command to delete the shard or something? I've never seen a command that does that.

Regards,
Gilles

"experiment":{
  "solrexperiment:8080_solr_experiment_master":{"replicas":{"IS-17093:9090_solr_experiment_master":{
    "shard":"solrexperiment:8080_solr_experiment_master",
    "roles":null,
    "state":"active","core":"experiment_master","collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
  "solrexperiment:8080_solr_experiment_01_10_2012":{"replicas":{"IS-17093:9090_solr_01_10_2012_experiment":{
    "shard":"solrexperiment:8080_solr_experiment_01_10_2012","roles":null,"state":"active","core":"01_10_2012_experiment",
    "collection":"experiment","node_name":"IS-17093:9090_solr","base_url":"http://IS-17093:9090/solr","leader":"true"}}},
  "solrexperiment:8080_solr_experiment_02_10_2012":{"replicas":{}}}}

From: Gilles Comeau [mailto:gilles.com...@polecat.co]
Sent: 13 November 2012 16:29
To: solr-user@lucene.apache.org; markrmil...@gmail.com
Subject: RE: Removing Shards from Zookeeper - no servers hosting shard

When I do the unload through the UI, I see the below messages in the solr log. Nothing in the zookeeper log. Then right after I try:

http://217.147.83.124:9090/solr/experiment_master/select?q=*%3A*&wt=xml&distrib=true

and get <str name="msg">no servers hosting shard:</str>. Also, I still see the shard being referenced in the cloud tab in the UI.

[inline image: image001.png]

Does this work for anyone else using SOLR 4.0 production with external zookeeper and distributed queries and if so, can you let me know exactly what versions and steps you take to not get this error? ☺ Anyone else have any problems getting this to work?

My setup is pretty basic: Local external zookeeper 3.3.6, solr 4.0 with three cores seen above.

Regards,
Gilles

INFO: [02_10_2012_experiment] CLOSING SolrCore org.apache.solr.core.SolrCore@11e3c2c6
13-Nov-2012 16:19:13 org.apache.solr.core.SolrCore closeSearcher
INFO: [02_10_2012_experiment] Closing main searcher on request.
13-Nov-2012 16:19:13 org.apache.solr.search.SolrIndexSearcher close
FINE: Closing Searcher@7cd47880 main
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=7,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    queryResultCache{lookups=4,hits=3,hitratio=0.75,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=4,cumulative_hits=3,cumulative_hitratio=0.75,cumulative_inserts=1,cumulative_evictions=0}
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
13-Nov-2012 16:19:13 org.apache.solr.core.CachingDirectoryFactory close
FINE: Closing: CachedDir<<org.apache.lucene.store.MMapDirectory@/solr2/cores/02_10_2012/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@717757ad;refCount=1;path=/solr2/cores/02_10_2012/data/index;done=false>>
13-Nov-2012 16:19:13 org.apache.solr.update.DirectUpdateHandler2 close
INFO: closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
13-Nov-2012 16:19:13 org.apache.solr.update.DefaultSolrCoreState decref
INFO: Closing SolrCoreState - canceling any ongoing recovery
13-Nov-2012 16:19:13 org.apache.solr.core.CoreContainer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@adminPath=/admin/cores
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@shareSchema
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostPort=9090
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@zkClientTimeout=10000
13-Nov-2012 16:19:13 org.apache.solr.core.Config getVal
FINE: null solr/cores/@hostContext=solr
13-Nov-2012 16:19:13 org.apache.solr.core.Config getNode
FINE: null missing optional solr/cores/@leaderVoteWait
13-Nov-2012 16:19:13 org.apache.solr.core.SolrXMLSerializer persistFile
INFO: Persisting cores config to /solr2/solr.xml
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader updateClusterState
INFO: Updating cloud state from ZooKeeper...
13-Nov-2012 16:19:13 org.apache.solr.common.cloud.ZkStateReader$2 process
INFO: A cluster state change has occurred - updating...

-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 13 November 2012 14:13
To: solr-user@lucene.apache.org
Subject: Re: Removing Shards from Zookeeper - no servers hosting shard

Odd...the unload command should be enough...

On Tue, Nov 13, 2012 at 5:26 AM, Gilles Comeau <gilles.com...@polecat.co> wrote:
> Hi all,
>
> We've just updated to SOLR 4.0 production and Zookeeper 3.3.6 from SOLR 4.0
> development version circa November 2011. We keep 6 months of data online in
> our primary cluster, and archive off old stuff to a slower disk archive
> cluster. We used to remove SOLR cores with the following code, but
> everything has changed in Zookeeper now.
>
> Old code to remove cores from Zookeeper:
>
> curl "http://127.0.0.1:8080/solr/admin/cores?action=UNLOAD&core=${SHARD}"
>
> echo "Removing indexes from all Zookeeper hosts"
> for (( i=0; i<${#ZK_HOSTS[*]}; i++ ))
> do
>   $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar \
>     org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete \
>     /collections/polecat/shards/solrenglish:8080_solr_$SHARD/$HOSTNAME:8080_solr_$SHARD
>   $JAVA -cp .:/apps/zookeeper-3.3.5/zookeeper-3.3.5.jar:/apps/zookeeper-3.3.5/lib/jline-0.9.94.jar:/apps/zookeeper-3.3.5/lib/log4j-1.2.15.jar \
>     org.apache.zookeeper.ZooKeeperMain -server ${ZK_HOSTS[$i]} delete \
>     /collections/polecat/shards/solrenglish:8080_solr_$SHARD
> done
>
> curl "http://solrmaster01:8080/solr/admin/cores?action=RELOAD&core=master"
>
> Now that we have migrated, I have tried removing cores from Zookeeper by
> removing the entries for the unloaded core in "leaders" and "leader_elect",
> but for some reason SOLR keeps sending the requests to the shard, and I end
> up with the "no servers hosting shard" error.
>
> Does anyone know how to remove a SOLR core from a SOLR server and have
> Zookeeper updated, and have distributed queries still work? The only thing
> I know how to do now is stop tomcat, stop zookeeper, clear out the data
> directory and then restart both. This isn't really ideal for a process I'd
> like to have running each night, and surely it is something others have run
> into. I've tried google searching, and what I find is references to the bug
> where solr notifies zookeeper on core unloads, which is marked as fixed, and
> people talking about how it doesn't work but if you run reloads on each
> core, it will work. (That also doesn't work when I do it.)
>
> Regards,
>
> Gilles Comeau

--
- Mark
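
For anyone who wants to re-run the three requests from the repro at the top of this thread, they could be scripted roughly as below. This is only a sketch: the host, port, collection and core names are the ones from the repro, while the function names and the SOLR_BASE variable are made up for illustration. It reproduces the behaviour being reported and does not fix it.

```shell
#!/usr/bin/env bash
# Sketch of the repro requests. select_url, unload_url, run_repro and
# SOLR_BASE are hypothetical names, not part of Solr.

SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"

# Build the distributed match-all query URL for a collection.
select_url() {
  local collection="$1"
  printf '%s/%s/select?q=*%%3A*&wt=xml&distrib=true' "$SOLR_BASE" "$collection"
}

# Build the CoreAdmin UNLOAD URL (with deleteIndex=true) for a core.
unload_url() {
  local core="$1"
  printf '%s/admin/cores?action=UNLOAD&core=%s&deleteIndex=true' "$SOLR_BASE" "$core"
}

# Issue the sanity query, the unload, and the query again, in that order.
# Defined only; nothing is sent until this function is called.
run_repro() {
  local collection="$1" core="$2"
  echo "Sanity query:";    curl -s "$(select_url "$collection")"; echo
  echo "Unloading $core:"; curl -s "$(unload_url "$core")"; echo
  echo "Error query:";     curl -s "$(select_url "$collection")"; echo
}
```

With Solr started as in step 3, `run_repro polecat core3` sends the requests. The URLs are passed to curl in double quotes because an unquoted `&` would make the shell background the command and drop the remaining parameters.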