On Tue, Sep 01, 2020 at 03:43:37PM +0100, Jose M Calhariz wrote:
> Package: openafs-client
> Version: 1.8.6-1~dsi10+1
> Severity: normal
>
> I am using a private backport of openafs from testing. On this server I
> am getting multiples strange errors about openafs cache. This server
> is different in that it runs apache to serve personal web pages and every
> web page runs under a different openafs user. So is normal for this
> server to be simultaneuous running code under 100 or 200 different openafs
> users.
>
> The an example of errors on the logs are:
>
> afs: disk cache read error in CacheItems slot 350195 off 28015620/35000020
> code -4/80
> afs: Error while alloc'ing cache slot for file 204:536874423.964.4794;
> failing with an i/o error
>
> I am not certain this types of errors are to be ignored and there have
> been reports of problems accessing openafs files. I am using this bug
> report to collect more information about this cache errors and the
> possibility of being an indication of important errors with the openafs
> cache code.
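
As an aside, the numbers in the quoted log line can be cross-checked with a
short script. This is only a sketch: it assumes that the trailing 80 in
"code -4/80" is the size in bytes of one CacheItems record, and it treats
whatever is left over after slot * 80 as a fixed file header. Neither
assumption is confirmed in this thread, and the function names are made up
for illustration.

```python
import re

# Log line copied from the report, with the wrapped line rejoined.
LOG = ("afs: disk cache read error in CacheItems slot 350195 "
       "off 28015620/35000020 code -4/80")

PATTERN = re.compile(
    r"slot (?P<slot>\d+) off (?P<off>\d+)/(?P<size>\d+) "
    r"code (?P<code>-?\d+)/(?P<want>\d+)"
)


def parse_cache_error(line):
    """Return the numeric fields of the log line as ints, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def check_offsets(fields):
    """Sanity-check the offset arithmetic.

    Assumption: 'want' is the per-slot record size, so anything left
    after slot * want is taken to be a header at the start of the file.
    """
    remainder = fields["off"] - fields["slot"] * fields["want"]
    return {
        "header_bytes": remainder,
        # Would a full record starting at 'off' still fit in the file?
        "in_bounds": fields["off"] + fields["want"] <= fields["size"],
        # Did the read return something other than a full record?
        "short_or_failed_read": fields["code"] != fields["want"],
    }


if __name__ == "__main__":
    print(check_offsets(parse_cache_error(LOG)))
```

For the line above the remainder is 20 bytes and the offset falls inside the
file, so the read at least targeted a plausible location. That fits a read
that failed for some other reason better than it fits a bad offset.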

This error message is supposed to indicate that a read from the cache
filesystem got EIO, which in turn is supposed to indicate a physical
problem with the drive. That said, I'm not going to jump to conclusions
and try to blame your drive, as there are several other things that could
be coming into play.

While the log message itself is pretty old, there's been a lot of work
recently to more accurately report EIO in error conditions (mostly instead
of ENOENT, since returning ENOENT can cause that to get cached at the VFS
layer and produce strange user-visible behavior). Having a lot of users
present makes me suspect that the credentials used by the kernel to
read/write the cache file are not being saved/restored properly, and indeed
we recently merged to 1.8.x (not in a release yet)
https://gerrit.openafs.org/14082 and https://gerrit.openafs.org/14099
which improve such credentials management.

My recommendation would be to try pulling in those two patches to your
build before proceeding to try to trace the source of the EIO.

Thanks for the report!

-Ben