On Tue, Sep 01, 2020 at 03:43:37PM +0100, Jose M Calhariz wrote:
> Package: openafs-client
> Version: 1.8.6-1~dsi10+1
> Severity: normal
> 
> I am using a private backport of openafs from testing.  On this server I
> am getting multiple strange errors about the openafs cache.  This server
> is different in that it runs apache to serve personal web pages and every
> web page runs under a different openafs user.  So it is normal for this
> server to be simultaneously running code under 100 or 200 different
> openafs users.
> 
> Examples of the errors in the logs are:
> 
> afs: disk cache read error in CacheItems slot 350195 off 28015620/35000020 code -4/80
> afs: Error while alloc'ing cache slot for file 204:536874423.964.4794; failing with an i/o error
> 
> I am not certain these types of errors can be ignored, and there have
> been reports of problems accessing openafs files.  I am using this bug
> report to collect more information about these cache errors and the
> possibility that they indicate important errors in the openafs
> cache code.

This error message is supposed to indicate that a read from the cache
filesystem got EIO, which in turn is supposed to indicate a physical
problem with the drive.  That said, I'm not going to jump to conclusions
and try to blame your drive, as there are several other things that could
be coming into play.
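
As a quick sanity check on the numbers in that message, here is a minimal
Python sketch that pulls the fields out of the log line and checks that they
are self-consistent.  The field meanings (slot index, byte offset / file
size, return code / expected byte count) are my reading of the message
layout and should be treated as assumptions; the helper names
parse_cache_error and describe are hypothetical.

```python
import re

# ASSUMPTION: field meanings are inferred from the message layout:
#   slot <index> off <byte offset>/<CacheItems size> code <result>/<bytes wanted>
LOG_RE = re.compile(
    r"disk cache read error in CacheItems slot (?P<slot>\d+) "
    r"off (?P<off>\d+)/(?P<size>\d+)\s+code (?P<code>-?\d+)/(?P<want>\d+)"
)


def parse_cache_error(line):
    """Return the numeric fields of the log line as a dict, or None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def describe(fields):
    """Derive a few consistency checks from the parsed fields."""
    entry = fields["want"]  # assumed size of one CacheItems entry
    header = fields["off"] - fields["slot"] * entry  # bytes before slot 0
    return {
        "header_bytes": header,
        "offset_inside_file": 0 <= fields["off"] < fields["size"],
        "read_failed_or_short": fields["code"] != entry,
        "total_slots": (fields["size"] - header) // entry,
    }


sample = ("afs: disk cache read error in CacheItems slot 350195 "
          "off 28015620/35000020 code -4/80")
fields = parse_cache_error(sample)
print(fields)
print(describe(fields))
```

Under those assumptions the offset lies inside the file and lines up with
the slot number, so the message points at the read itself failing rather
than at a bad slot index.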

While the log message itself is pretty old, there's been a lot of work
recently to more accurately report EIO in error conditions (mostly instead
of ENOENT, since returning ENOENT can cause that to get cached at the VFS
layer and produce strange user-visible behavior).

Having a lot of users present makes me suspect that the credentials used by
the kernel to read/write the cache file are not being saved/restored
properly.  Indeed, we recently merged https://gerrit.openafs.org/14082 and
https://gerrit.openafs.org/14099 to 1.8.x (not in a release yet), which
improve such credentials management.

My recommendation would be to try pulling in those two patches to your
build before proceeding to try to trace the source of the EIO.

Thanks for the report!

-Ben
