On 10/31/2013 7:26 AM, Shalom Ben-Zvii Kazaz wrote:
> Shawn, Thank you for your answer.
> For the purpose of testing it, we have a test environment where we are not
> indexing anymore. We also disabled the DIH delta import, so as I understand
> it there shouldn't be any commits on the master.
> I also tried with
> <str name="commitReserveDuration">50:50:50</str>
> and get the same failure.

If this is an environment where there are no commits, that's really
odd.  I would suspect underlying filesystem or network issues.  One
problem that's not well known but very common is buggy NIC firmware,
most often on Broadcom NICs.  With these problems, things work
correctly almost all the time, but under high network load they break
in strange ways, and the resulting errors rarely look like they are
network-related.

Most embedded NICs are either Broadcom or Realtek, both of which are
famous for their firmware problems.  Broadcom NICs are very common on
Dell and HP servers.  Upgrading the firmware (which is not usually the
same thing as upgrading the driver) is the only fix.  NICs from other
manufacturers also have upgradable firmware, but don't usually have the
same kind of high-profile problems as Broadcom.
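
To see which firmware version a NIC is currently running, something
like the following rough sketch might help.  It assumes a Linux host
with the ethtool utility installed, and the interface name "eth0" is
only a placeholder.  The function names are made up for illustration.
It only reports what "ethtool -i" prints; it does not upgrade anything.

```python
import subprocess


def parse_ethtool_info(text):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as a PCI
        # bus address (which contains colons) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def nic_firmware(interface="eth0"):
    """Return (driver, firmware-version) for one interface.

    Assumes ethtool is on the PATH; raises CalledProcessError if the
    command fails (for example, when the interface does not exist).
    """
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True,
        text=True,
        check=True,
    )
    info = parse_ethtool_info(result.stdout)
    return info.get("driver"), info.get("firmware-version")
```

Calling nic_firmware("eth0") on the master and on each slave gives the
driver name and firmware version to compare against whatever the server
vendor currently offers for download.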

The NIC firmware might not have anything to do with this problem, but
it's the only thing left that I can think of.  I personally haven't used
replication since Solr 1.4.1, but a lot of people do.  I can't say that
there are no bugs, but so far I'm not seeing the kind of problem reports
that appear when there is a bug in a critical piece of the software.

Thanks,
Shawn
