Whoops, hit the send keyboard shortcut. I just created a JIRA issue for the first bit I’ll be working on:
SOLR-5656: When using HDFS, the Overseer should have the ability to reassign the cores from failed nodes to running nodes.

- Mark

On Jan 22, 2014, 12:57:46 PM, Lajos <la...@protulae.com> wrote:

Thanks Mark ... indeed, some doc updates would help, regarding what seems to be a popular question on sharding. It seems it would be a Good Thing for the shards of a collection running on HDFS to essentially be pointers to the HDFS-replicated index. Is that what you're thinking? I've been following your work recently and would be interested in helping out on this if there's the chance. Is there a JIRA yet for this issue?

Thanks,

lajos

On 22/01/2014 16:57, Mark Miller wrote:
> Right - solr.hdfs.home is the only setting you should use with SolrCloud.
>
> The documentation should probably be improved.
>
> If you set the data dir or ulog location in solrconfig.xml explicitly, it
> will be the same for every collection. SolrCloud shares solrconfig.xml
> across SolrCores, and this will not work out.
>
> By setting solr.hdfs.home and leaving the relative defaults, all of the
> locations are correctly set for each collection under
> solr.hdfs.home without any effort on your part.
>
> - Mark
>
> On Jan 22, 2014, 7:22:22 AM, Lajos <la...@protulae.com> wrote:
>
> Ugh. I just realised I should have taken out the data dir and update log
> definitions! Now it works fine.
>
> Cheers,
>
> L
>
> On 22/01/2014 11:47, Lajos wrote:
>> Hi all,
>>
>> I've been running Solr on HDFS, and that's fine.
>>
>> But I have a Cloud installation I thought I'd try on HDFS. I uploaded
>> the configs for the core that already runs in standalone mode on HDFS
>> (on another cluster).
>> I specify the HdfsDirectoryFactory, HDFS data dir,
>> solr.hdfs.home, and HDFS update log path:
>>
>> <dataDir>hdfs://master:9000/solr/test/data</dataDir>
>>
>> <directoryFactory name="DirectoryFactory"
>>     class="solr.HdfsDirectoryFactory">
>>   <str name="solr.hdfs.home">hdfs://master:9000/solr</str>
>> </directoryFactory>
>>
>> <updateHandler class="solr.DirectUpdateHandler2">
>>   <updateLog>
>>     <str name="dir">hdfs://master:9000/solr/test/ulog</str>
>>   </updateLog>
>> </updateHandler>
>>
>> Question is: should I create my collection differently than I would a
>> normal collection?
>>
>> If I just try that, Solr will initialise the directory in HDFS as if it
>> were a single core. It will create shard directories on my nodes, but
>> not actually put anything in there. And then it will complain mightily
>> about not being able to forward updates to other nodes. (This same
>> cluster hosts regular collections, and everything is working fine.)
>>
>> Am I missing a step? Do I have to manually create HDFS directories for
>> each replica?
>>
>> Thanks,
>>
>> L
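Following Mark's advice in the thread, the fix is to drop the explicit <dataDir> and update log <str name="dir"> settings and set only solr.hdfs.home, leaving the relative defaults in place. A minimal sketch of what the trimmed solrconfig.xml fragment might look like (the hostname and path are carried over from Lajos's example; exact element defaults may vary by Solr version):

```xml
<!-- Sketch: only solr.hdfs.home is set explicitly. dataDir and the
     update log directory are left at their relative defaults, so
     SolrCloud resolves a distinct location for each collection and
     replica under solr.hdfs.home automatically. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://master:9000/solr</str>
</directoryFactory>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- No explicit dir: the ulog location also defaults to a
       per-replica path under solr.hdfs.home. -->
  <updateLog/>
</updateHandler>
```

Because solrconfig.xml is shared across all SolrCores in a SolrCloud collection, any absolute HDFS path written into it would collide across replicas, which is exactly the failure mode described above.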
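To answer the "should I create my collection differently" question per the thread: no — with solr.hdfs.home set, the collection is created the same way as any other SolrCloud collection, e.g. via the Collections API. The collection name, config set name, and shard/replica counts below are illustrative, not taken from the thread:

```
# Illustrative only: create an HDFS-backed collection exactly like a
# normal one. "mycoll" and "myconf" are hypothetical names; the config
# set must already be uploaded to ZooKeeper.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2&collection.configName=myconf"
```

No manual HDFS directory creation per replica is needed; the HdfsDirectoryFactory creates each replica's directories under solr.hdfs.home on first use.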