Yes Upayavira, that's exactly what prompted me to ask Erick as soon as I read https://cwiki.apache.org/confluence/display/solr/Config+Sets
Erick, regarding my delta-import not working: I do see dataimport.properties in ZooKeeper after I "upconfig" and "linkconfig" my conf files into ZK, see below.

[zk: localhost:YYYY (CONNECTED) 0] ls /configs/xxxxxx
[admin-extra.menu-top.html, person-synonyms.txt, entity-stopwords.txt, protwords.txt, location-synonyms.txt, solrconfig.xml, organization-synonyms.txt, stopwords.txt, spellings.txt, dataimport.properties, admin-extra.html, xslt, synonyms.txt, scripts.conf, subject-synonyms.txt, elevate.xml, admin-extra.menu-bottom.html, solr-import-config.xml, clustering, schema.xml]

However, the dataimport.properties in my 'conf' folder has not been updated, even after a successful full-import on Sep 19 2015 1:00AM and a subsequent delta-import on Sep 20 2015 11:AM, which did not import newer docs. The file contents are shown below; you can clearly see the dates are quite a bit off.

[xxxx@yyyyy conf]$ cat dataimport.properties
#Tue Sep 15 18:11:17 UTC 2015
reindex-docs.last_index_time=2015-09-15 18\:11\:16
last_index_time=2015-09-15 18\:11\:16
sep.last_index_time=2014-03-24 13\:41\:46

I saw some JIRA tickets about a different location of dataimport.properties for SolrCloud, but couldn't find the path where it is stored. Does anybody know where it is stored?

Thanks

Ravi Kiran Bhaskar

On Sun, Sep 20, 2015 at 5:28 AM, Upayavira <u...@odoko.co.uk> wrote:

> It is worth noting that the ref guide page on configsets refers to
> non-cloud mode (a useful new feature) whereas people may confuse this
> with configsets in cloud mode, which use Zookeeper.
>
> Upayavira
>
> On Sun, Sep 20, 2015, at 04:59 AM, Ravi Solr wrote:
> > Can't thank you enough for clarifying it at length. Yeah, it's pretty
> > confusing even for experienced Solr users.
> > I used the upconfig and linkconfig commands to update 4 collections
> > into zookeeper... As you described, I lucked out as I used the same
> > name for configset and the collection and hence did not have to use
> > the collections API :-)
> >
> > Thanks,
> >
> > Ravi Kiran Bhaskar
> >
> > On Sat, Sep 19, 2015 at 11:22 PM, Erick Erickson
> > <erickerick...@gmail.com> wrote:
> >
> > > Let's back up a second. Configsets are what _used_ to be in the conf
> > > directory for each core on a local drive, it's just that they're now
> > > kept up on Zookeeper. Otherwise, you'd have to put them on each
> > > instance in SolrCloud, and bringing up a new replica on a new machine
> > > would look a lot like adding a core with the old core admin API.
> > >
> > > So instead, configurations are kept on zookeeper. A config set
> > > consists of, essentially, a named old-style "conf" directory. There's
> > > no a-priori limit to the number of config sets you can have. Look in
> > > the admin UI, Cloud>>tree>>configs and you'll see each name you've
> > > pushed to ZK. If you explore that tree, you'll see a lot of old
> > > familiar faces, schema.xml, solrconfig.xml, etc.
> > >
> > > So now we come to associating configs with collections. You've
> > > probably done one of the examples where some things happen under the
> > > covers, including explicitly pushing the configset to Zookeeper.
> > > Currently, there's no option in the bin/solr script to push a config,
> > > although I know there's a JIRA to do that.
> > >
> > > So, to put a new config set up you currently need to use the zkCli.sh
> > > script, see:
> > > https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities,
> > > the "upconfig" command. That pushes the configset up to ZK and gives
> > > it a name.
> > >
> > > Now, you create a collection and it needs a configset stored in ZK.
> > > It's a little tricky in that if you do _not_ explicitly specify a
> > > configset (using the collection.configName parameter to the
> > > collections API CREATE command), then by default it'll look for a
> > > configset with the same name as the collection. If it doesn't find
> > > one, _and_ there is one and only one configset, then it'll use that
> > > one (personally I find that confusing, but that's the way it works).
> > > See: https://cwiki.apache.org/confluence/display/solr/Collections+API
> > >
> > > If you have two or more configsets in ZK, then either the configset
> > > name has to be identical to the collection name (if you don't specify
> > > collection.configName), _or_ you specify collection.configName at
> > > create time.
> > >
> > > NOTE: there are _no_ config files on the local disk! When a replica of
> > > a collection loads, it "knows" what collection it's part of and pulls
> > > the corresponding configset from ZK.
> > >
> > > So typically the process is this.
> > > > you create the config set by editing all the usual suspects, schema.xml, solrconfig.xml, DIH config etc.
> > > > you put those configuration files into some version control system (you are using one, right?)
> > > > you push the configs to Zookeeper
> > > > you create the collection
> > > > you figure out you need to change the configs so you
> > > > > check the code out of your version control
> > > > > edit them
> > > > > put the current version back into version control
> > > > > push the configs up to zookeeper, overwriting the ones already there with that name
> > > > > reload the collection or bounce all the servers. As each replica in the collection comes up, it downloads the latest configs from Zookeeper to memory (not to disk) and uses them.
> > >
> > > Seems like a long drawn-out process, but pretty soon it's automatic.
> > > And really, the only extra step is the push to Zookeeper, the rest is
> > > just like old-style cores with the exception that you don't have to
> > > manually push all the configs to all the machines hosting cores.
> > >
> > > Notice that I have mostly avoided talking about "cores" here. Although
> > > it's true that a replica in a collection is just another core, it's
> > > "special" in that it has certain very specific properties set. I
> > > _strongly_ advise you stop thinking about old-style Solr cores and
> > > instead think about collections and replicas. And above all, do _not_
> > > use the admin core API to try to create members of a collection
> > > (cores), use the collections API to ADDREPLICA/DELETEREPLICA instead.
> > > Loading/unloading cores is less "fraught", but I try to avoid that too
> > > and use
> > >
> > > Best,
> > > Erick
> > >
> > > On Sat, Sep 19, 2015 at 9:08 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > > > Thanks Erick, I will report back once the reindex is finished. Oh,
> > > > your answer reminded me of another question - regarding configsets
> > > > the documentation says
> > > >
> > > > "On a multicore Solr instance, you may find that you want to share
> > > > configuration between a number of different cores."
> > > >
> > > > Can the same be used to push disparate, mutually exclusive configs?
> > > > I ask this as I have 4 mutually exclusive apps, each with a single
> > > > core index, on a single machine which I am trying to convert to
> > > > SolrCloud with a single shard approach.
> > > > Just being lazy and trying to find a way to update and link
> > > > configs to zookeeper ;-)
> > > >
> > > > Thanks
> > > >
> > > > Ravi Kiran Bhaskar
> > > >
> > > > On Sat, Sep 19, 2015 at 6:54 PM, Erick Erickson
> > > > <erickerick...@gmail.com> wrote:
> > > >
> > > >> Just pushing up the entire configset would be easiest, but the
> > > >> Zookeeper command line tools allow you to push up a single
> > > >> file if you want.
> > > >>
> > > >> Yeah, it puzzles me too that the import worked yesterday, not really
> > > >> sure what happened, the file shouldn't just disappear....
> > > >>
> > > >> Erick
> > > >>
> > > >> On Sat, Sep 19, 2015 at 2:46 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > > >> > Thank you for the prompt response Erick. I did a full-import
> > > >> > yesterday; you are correct that I did not push
> > > >> > dataimport.properties to ZK. Should it not have worked even for
> > > >> > a full import? You may be right about the 'clean' option, I will
> > > >> > reindex again today. BTW how do we push a single file to a
> > > >> > specific config name in zookeeper?
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Ravi Kiran Bhaskar
> > > >> >
> > > >> > On Sat, Sep 19, 2015 at 1:48 PM, Erick Erickson
> > > >> > <erickerick...@gmail.com> wrote:
> > > >> >
> > > >> >> Could not read DIH properties from
> > > >> >> /configs/sitesearchcore/dataimport.properties
> > > >> >>
> > > >> >> This looks like somehow you didn't push this file up to Zookeeper.
> > > >> >> You can check what files are there in the admin UI. How you indexed
> > > >> >> yesterday is a mystery though, unless somehow this file was removed
> > > >> >> from ZK.
> > > >> >>
> > > >> >> As for why you lost all the docs, my suspicion is that you have the
> > > >> >> clean param set up for delta import....
> > > >> >>
> > > >> >> FWIW,
> > > >> >> Erick
> > > >> >>
> > > >> >> On Sat, Sep 19, 2015 at 10:36 AM, Ravi Solr <ravis...@gmail.com> wrote:
> > > >> >> > I am facing a weird problem. As part of the upgrade from 4.7.2
> > > >> >> > (Master-Slave) to 5.3.0 (SolrCloud) I re-indexed 1.5 million
> > > >> >> > records via DIH using SolrEntityProcessor yesterday, and all of
> > > >> >> > them indexed properly. This morning I ran the DIH again with
> > > >> >> > delta import and I lost all docs... What am I missing? Did
> > > >> >> > anybody face a similar issue?
> > > >> >> >
> > > >> >> > Here are the errors in the logs
> > > >> >> >
> > > >> >> > 9/19/2015, 2:41:17 AM ERROR null SolrCore Previous SolrRequestInfo was not closed!
> > > >> >> > req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> > > >> >> > 9/19/2015, 2:41:17 AM ERROR null SolrCore prev == info : false
> > > >> >> > 9/19/2015, 2:41:17 AM WARN null ZKPropertiesWriter Could not read DIH properties from
> > > >> >> > /configs/sitesearchcore/dataimport.properties :class
> > > >> >> > org.apache.zookeeper.KeeperException$NoNodeException
> > > >> >> >
> > > >> >> > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /configs/sitesearchcore/dataimport.properties
> > > >> >> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> > > >> >> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > > >> >> >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> > > >> >> >     at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
> > > >> >> >     at org.apache.solr.handler.dataimport.ZKPropertiesWriter.readIndexerProperties(ZKPropertiesWriter.java:91)
> > > >> >> >     at org.apache.solr.handler.dataimport.ZKPropertiesWriter.persist(ZKPropertiesWriter.java:65)
> > > >> >> >     at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:307)
> > > >> >> >     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:253)
> > > >> >> >     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
> > > >> >> >     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
> > > >> >> >     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> > > >> >> >
> > > >> >> > 9/19/2015, 11:16:43 AM ERROR null SolrCore Previous SolrRequestInfo was not closed!
> > > >> >> > req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> > > >> >> > 9/19/2015, 11:16:43 AM ERROR null SolrCore prev == info : false
> > > >> >> >
> > > >> >> > Thanks
> > > >> >> >
> > > >> >> > Ravi Kiran Bhaskar
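
For reference, the configset lookup Erick describes earlier in this thread (an explicit collection.configName, else a configset with the same name as the collection, else the one and only configset in ZK) can be sketched roughly as below. This is only an illustration of the rule as stated in the thread, not Solr's actual code: resolve_configset and its arguments are hypothetical names, and raising an error when an explicitly named configset is missing is my assumption.

```python
def resolve_configset(collection, configsets, config_name=None):
    """Pick the configset for a new collection, following the rule described in this thread.

    Hypothetical helper for illustration only; it is not part of Solr.
    `configsets` is the list of configset names already pushed to ZooKeeper.
    """
    available = set(configsets)

    # 1. An explicit collection.configName wins.
    #    (Assumption: a missing explicit name is treated as an error.)
    if config_name is not None:
        if config_name not in available:
            raise LookupError("configset %r not found" % config_name)
        return config_name

    # 2. Otherwise look for a configset named like the collection.
    if collection in available:
        return collection

    # 3. Otherwise fall back to the configset only if exactly one exists.
    if len(available) == 1:
        return next(iter(available))

    raise LookupError(
        "cannot pick a configset for %r; specify collection.configName" % collection
    )


if __name__ == "__main__":
    # Same name for configset and collection, as in my setup.
    print(resolve_configset("sitesearchcore", ["sitesearchcore", "othercore"]))
    # A single configset in ZK is used even if the names differ.
    print(resolve_configset("newcollection", ["sitesearchcore"]))
```

With two or more configsets and no matching name, the sketch raises an error, which mirrors Erick's point that collection.configName must then be given at create time.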