Hi Erick, I tried out your changes from the branch_4x branch. It looks good in terms of preserving the zkHost, but I'm running into an exception because it isn't persisting the instanceDir attribute on the <core> element.
I've got a few other things I need to take care of, but as soon as I have time I'll dig in and see if I can figure out what's going on, and see what changed to make this not work.

Here are details on what the files looked like before/after CREATE call:

original solr.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" sharedLib="lib" zkHost="10.116.249.136:2181">
  <!-- this 8080 might need to change in production -->
  <cores adminPath="/admin/cores" zkClientTimeout="20000" hostPort="8080" hostContext="/"/>
</solr>

here's what was produced with 4.3 branch + a quick mod to preserve zkHost:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="10.116.249.136:2181" sharedLib="lib">
  <cores adminPath="/admin/cores" zkClientTimeout="20000" hostPort="8080" hostContext="/">
    <core loadOnStartup="true" shard="shard1" instanceDir="directory_shard1_replica1/" transient="false" name="directory_shard1_replica1" collection="directory"/>
    <core loadOnStartup="true" shard="shard2" instanceDir="directory_shard2_replica1/" transient="false" name="directory_shard2_replica1" collection="directory"/>
  </cores>
</solr>

here's what was produced with branch_4x 4.4-SNAPSHOT:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" zkHost="10.116.249.136:2181" sharedLib="lib">
  <cores adminPath="/admin/cores" zkClientTimeout="20000" distribUpdateSoTimeout="0" distribUpdateConnTimeout="0" hostPort="8080" hostContext="/">
    <core shard="shard1" numShards="2" name="directory_shard1_replica2" collection="directory" qt="/admin/cores" wt="javabin" version="2"/>
    <core shard="shard2" numShards="2" name="directory_shard2_replica2" collection="directory" qt="/admin/cores" wt="javabin" version="2"/>
  </cores>
</solr>

and here's the error from solr.log after restarting after the CREATE:

2013-06-17 21:37:07,083 1874 [pool-2-thread-1] ERROR org.apache.solr.core.CoreContainer - null:java.lang.NullPointerException: Missing required 'instanceDir'
        at org.apache.solr.core.CoreDescriptor.doInit(CoreDescriptor.java:133)
        at org.apache.solr.core.CoreDescriptor.<init>(CoreDescriptor.java:87)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:365)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:221)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:190)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:124)
        at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:103)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4638)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5294)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:895)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:871)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1099)
        at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1621)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)

On Jun 16, 2013, at 5:38 AM, Erick Erickson wrote:

> Al:
>
> As it happens, I hope sometime today to put up a patch for SOLR-4910
> that should harden up many things in persisting solr.xml, I'll be sure
> to include this. It's kind of a pain to create an automated test for
> this, so I'll give it a whirl manually.
>
> As you say, most of this is going away in 5.0, but it needs to work for 4.x.
>
> And when I get the patch up, if you could give it a "real world" try
> it'd be great!
>
> Thanks,
> Erick
>
> On Fri, Jun 14, 2013 at 6:15 PM, Al Wold <alw...@alwold.com> wrote:
>> Hi,
>> I'm working on setting up a solr cloud test environment, and the target
>> environment I need to put it in has multiple webapps per tomcat instance.
>> With that in mind, I wanted/had to avoid putting any configs in system
>> properties. I tried putting the zkHost in solr.xml, like this:
>>
>>> <?xml version="1.0" encoding="UTF-8" ?>
>>> <solr persistent="true" sharedLib="lib" zkHost="10.116.249.136:2181">
>>> <!-- this 8080 might need to change in production -->
>>> <cores adminPath="/admin/cores" zkClientTimeout="20000" hostPort="8080"
>>> hostContext="/"/>
>>> </solr>
>>
>> Everything works fine when I first start things up, create collections,
>> upload docs, search, etc. Creating the collection, however, modifies the
>> solr.xml file, and doesn't keep the zkHost setting:
>>
>>> <?xml version="1.0" encoding="UTF-8" ?>
>>> <solr persistent="true" sharedLib="lib">
>>> <cores adminPath="/admin/cores" zkClientTimeout="20000" hostPort="8080"
>>> hostContext="/">
>>> <core loadOnStartup="true" shard="shard2"
>>> instanceDir="directory_shard2_replica1/" transient="false"
>>> name="directory_shard2_replica1" collection="directory"/>
>>> <core loadOnStartup="true" shard="shard1"
>>> instanceDir="directory_shard1_replica1/" transient="false"
>>> name="directory_shard1_replica1" collection="directory"/>
>>> </cores>
>>> </solr>
>>
>>
>> With that in mind, once I restart tomcat, it no longer knows it's supposed
>> to be talking to zookeeper, so it looks for local configs and blows up.
>>
>> I traced this back to the code in CoreContainer.java, in the method
>> persistFile(), where it seems to contain no code to write out the zkHost
>> when it updates solr.xml. I upped the logging on my solr instance to verify
>> this code is executing, so I'm pretty sure it's the right spot.
>>
>> Is anyone else using zkHost in their solr.xml successfully? I can't see how
>> it would work given this problem.
>>
>> Does this seem like a bug? If so, I can probably file a report and submit a
>> patch. It seems like this problem may become a non-issue in 5.0, based on
>> comments in the code and some of the discussion in JIRA, but I'm not sure
>> how far off that is.
>>
>> Thanks!
>>
>> -Al Wold
>>
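
Both problems reported in this thread (zkHost dropped from <solr>, instanceDir dropped from <core>) show up in the rewritten solr.xml itself, so one way to catch them before restarting Tomcat might be a small standalone check like the sketch below. It is not part of Solr: the function name check_solr_xml and the sample strings are made up for illustration, and it assumes the old-style solr.xml layout shown in the messages above.

```python
import xml.etree.ElementTree as ET


def check_solr_xml(xml_text, expected_zk_host=None):
    """Return a list of problems found in an old-style solr.xml (hypothetical helper)."""
    problems = []
    # Parse from bytes so an encoding declaration in the file is handled by the parser.
    root = ET.fromstring(xml_text.encode("utf-8"))

    # The zkHost attribute lives on the root <solr> element in the examples above.
    if expected_zk_host is not None and root.get("zkHost") != expected_zk_host:
        problems.append("zkHost missing or changed on <solr>: %r" % root.get("zkHost"))

    # Every persisted <core> needs instanceDir, or startup fails as in the log above.
    for core in root.iter("core"):
        if not core.get("instanceDir"):
            problems.append("core %r has no instanceDir" % core.get("name"))
    return problems


# Abbreviated sample modeled on the branch_4x output quoted in this thread.
BAD = (
    '<solr persistent="true" zkHost="10.116.249.136:2181" sharedLib="lib">'
    '<cores adminPath="/admin/cores" hostPort="8080" hostContext="/">'
    '<core shard="shard1" numShards="2" name="directory_shard1_replica2" collection="directory"/>'
    '</cores></solr>'
)

# Abbreviated sample modeled on the 4.3 output with zkHost preserved.
GOOD = (
    '<solr persistent="true" zkHost="10.116.249.136:2181" sharedLib="lib">'
    '<cores adminPath="/admin/cores" hostPort="8080" hostContext="/">'
    '<core shard="shard1" instanceDir="directory_shard1_replica1/" '
    'name="directory_shard1_replica1" collection="directory"/>'
    '</cores></solr>'
)

if __name__ == "__main__":
    for label, text in (("GOOD", GOOD), ("BAD", BAD)):
        print(label, check_solr_xml(text, expected_zk_host="10.116.249.136:2181"))
```

Run against the file right after a CREATE call, an empty list would suggest the file is safe to restart with; any entry points at the attribute that was not persisted.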