Hi Joe,

We fought with Solr on HDFS for quite some time, and faced similar issues
as you're seeing. (See this thread, for example:"
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201812.mbox/%3cCABd9LjTeacXpy3FFjFBkzMq6vhgu7Ptyh96+w-KC2p=-rqk...@mail.gmail.com%3e
 )

The Solr lock files on HDFS get deleted if the Solr server gets shut down
gracefully, but we couldn't always guarantee that in our environment so we
ended up writing a custom startup script to search for lock files on HDFS
and delete them before Solr startup.
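
For what it's worth, the gist of that cleanup step looked something like the
sketch below (Python, shelling out to the hdfs CLI). The solr.hdfs.home path
and the write.lock file name are assumptions here, so check them against your
HdfsDirectoryFactory configuration before running anything like this:

#!/usr/bin/env python3
# Sketch: remove stale Solr lock files on HDFS before starting Solr.
# SOLR_HDFS_HOME and LOCK_FILE_NAME are assumed values; adjust them to
# match your solr.hdfs.home and configured lock factory.
import subprocess

SOLR_HDFS_HOME = "/solr"       # assumed solr.hdfs.home
LOCK_FILE_NAME = "write.lock"  # assumed lock file name

def find_lock_files():
    # Recursively list the Solr home on HDFS and keep paths ending in the lock name.
    out = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", SOLR_HDFS_HOME],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.split()[-1] for line in out.splitlines()
            if line and line.split()[-1].endswith("/" + LOCK_FILE_NAME)]

def delete_lock_files(paths):
    for path in paths:
        # -skipTrash so the lock is really gone before Solr comes up
        subprocess.run(["hdfs", "dfs", "-rm", "-skipTrash", path], check=True)

if __name__ == "__main__":
    locks = find_lock_files()
    for p in locks:
        print("Removing stale lock file: " + p)
    delete_lock_files(locks)
    # The surrounding init script then starts Solr as usual.

We just ran that from the init script ahead of the normal Solr start command.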

However, the issue that you mention of the Solr server rebuilding its whole
index from replicas on startup was enough of a show-stopper for us that we
switched away from HDFS to local disk. It literally made the difference
between 24+ hours of recovery time after an unexpected outage and less than
a minute.

If you do end up finding a solution to this issue, please post it to this
mailing list, because there are others out there (like us!) who would most
definitely make use of it.

Thanks

Kyle

On Fri, 2 Aug 2019 at 08:58, Joe Obernberger <joseph.obernber...@gmail.com>
wrote:

> Thank you.  No, while the cluster is using Cloudera for HDFS, we do not
> use Cloudera to manage the Solr cluster.  If it is a
> configuration/architecture issue, what can I do to fix it?  I'd like a
> system where servers can come and go, but the indexes stay available and
> recover automatically.  Is that possible with HDFS?
> While adding an alias to other collections would be an option, if that
> collection is the only one, or one that is currently needed in a live
> system, we can't bring it down, re-create it, and re-index when that
> process may take weeks.
>
> Any ideas?
>
> -Joe
>
> On 8/1/2019 6:15 PM, Angie Rabelero wrote:
> > I don’t think you’re using Cloudera or Ambari, but Ambari has an option
> > to delete the locks. This seems more a configuration/architecture issue
> > than a reliability issue. You may want to spin up an alias while you bring
> > down, clear locks and directories, recreate and index the affected
> > collection, while you work your other issues.
> >
> > On Aug 1, 2019, at 16:40, Joe Obernberger <joseph.obernber...@gmail.com>
> > wrote:
> >
> > Been using Solr on HDFS for a while now, and I'm seeing an issue with
> > redundancy/reliability.  If a server goes down, when it comes back up, it
> > will never recover because of the lock files in HDFS. That Solr node needs
> > to be brought down manually, the lock files deleted, and then brought back
> > up.  At that point, it appears to copy all the data for its replicas.  If
> > the index is large, and new data is being indexed, in some cases it will
> > never recover. The replication retries over and over.
> >
> > How can we make a reliable Solr Cloud cluster when using HDFS that can
> > handle servers coming and going?
> >
> > Thank you!
> >
> > -Joe
> >