Hi

On Wed, Sep 30, 2020 at 04:18:27PM +0100, Jose M Calhariz wrote:
> Hi,
>
> any progress on this?  Is there anything I can do to help?

Hi,

I collected internal debug messages about this using ClientTracing.py.
Is there anyone I can send these debug messages to, to try to solve the
problem?

Kind regards
Jose M Calhariz

>
> Kind regards
> Jose M Calhariz
>
> On Fri, Sep 04, 2020 at 06:12:30PM +0100, Jose M Calhariz wrote:
> > Hi,
> >
> > I have made an update to my private backport.  It is better but I still
> > see the same errors in the logs.  This machine is a VM and no other VM
> > or host is reporting IO errors of any kind, that I know of.
> >
> > It was my first time using gerrit, so can you please check if
> > I have downloaded the correct patches?
> >
> > ee578e9.diff
> > 179a418.diff
> >
> > Is there a way to decode these errors, to understand better what
> > is happening and find a fix?
> >
> > [ 9.760892] openafs: loading out-of-tree module taints kernel.
> > [ 9.760898] openafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
> > [ 9.762091] openafs: module verification failed: signature and/or required key missing - tainting kernel
> > [ 9.778441] Key type afs_pag registered
> > [ 8245.094223] afs: disk cache read error in CacheItems slot 211006 off 16880500/19660820 code -4/80
> > [ 8245.094254] afs: disk cache read error in CacheItems slot 211006 off 16880500/19660820 code -4/80
> > [ 8245.094277] afs: disk cache read error in CacheItems slot 211006 off 16880500/19660820 code -4/80
> > [ 8245.094299] afs: disk cache read error in CacheItems slot 211006 off 16880500/19660820 code -4/80
> > [10181.679636] afs: disk cache read error in CacheItems slot 156531 off 12522500/19660820 code -4/80
> > [10181.679638] afs: Error while alloc'ing cache slot for file 204:536874423.516.5309; failing with an i/o error
> > [11438.241843] afs_UFSGetVolSlot: error -4 reading volumeinfo
> > [11438.242213] afs_UFSGetVolSlot: error -4 reading volumeinfo
> >
> > Kind regards
> > Jose M Calhariz
> >
> > On Wed, Sep 02, 2020 at 07:28:50PM +0100, Jose M Calhariz wrote:
> > > Hi,
> > >
> > > I will then update my private backport and see if things improve.
> > > I will report the results of your suggestion here.  Thank you.
> > >
> > > Kind regards
> > > Jose M Calhariz
> > >
> > > On Tue, Sep 01, 2020 at 04:07:55PM -0700, Benjamin Kaduk wrote:
> > > > On Tue, Sep 01, 2020 at 03:43:37PM +0100, Jose M Calhariz wrote:
> > > > > Package: openafs-client
> > > > > Version: 1.8.6-1~dsi10+1
> > > > > Severity: normal
> > > > >
> > > > > I am using a private backport of openafs from testing.  On this
> > > > > server I am getting multiple strange errors about the openafs
> > > > > cache.  This server is different in that it runs apache to serve
> > > > > personal web pages and every web page runs under a different
> > > > > openafs user.  So it is normal for this server to be
> > > > > simultaneously running code under 100 or 200 different openafs
> > > > > users.
> > > > >
> > > > > Examples of the errors in the logs are:
> > > > >
> > > > > afs: disk cache read error in CacheItems slot 350195 off 28015620/35000020 code -4/80
> > > > > afs: Error while alloc'ing cache slot for file 204:536874423.964.4794; failing with an i/o error
> > > > >
> > > > > I am not certain these types of errors can be ignored, and there
> > > > > have been reports of problems accessing openafs files.  I am using
> > > > > this bug report to collect more information about these cache
> > > > > errors and the possibility that they indicate important errors in
> > > > > the openafs cache code.
> > > >
> > > > This error message is supposed to indicate that a read from the cache
> > > > filesystem got EIO, which in turn is supposed to indicate a physical
> > > > problem with the drive.  That said, I'm not going to jump to conclusions
> > > > and try to blame your drive, as there are several other things that
> > > > could be coming into play.
> > > >
> > > > While the log message itself is pretty old, there's been a lot of work
> > > > recently to more accurately report EIO in error conditions (mostly
> > > > instead of ENOENT, since returning ENOENT can cause that to get cached
> > > > at the VFS layer and produce strange user-visible behavior).
> > > >
> > > > Having a lot of users present makes me suspect that the credentials
> > > > used by the kernel to read/write the cache file are not being
> > > > saved/restored properly, and indeed we recently merged to 1.8.x (not
> > > > in a release yet) https://gerrit.openafs.org/14082 and
> > > > https://gerrit.openafs.org/14099 which improve such credentials
> > > > management.
> > > >
> > > > My recommendation would be to try pulling in those two patches to your
> > > > build before proceeding to try to trace the source of the EIO.
> > > >
> > > > Thanks for the report!
> > > >
> > > > -Ben

--
Have you smiled at your computer today? You have? What an idiot!
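On the question of decoding the "disk cache read error" lines, here is a minimal Python sketch that splits one such line into its numeric fields. It is not taken from the OpenAFS sources. The field meanings are assumptions inferred only from the numbers quoted in this thread: that a negative code is a negated Linux errno, that the value after the slash in "code -4/80" is the expected entry size in bytes, and that CacheItems is a fixed-size header followed by entries of that size. The function name `decode_cache_error` and the key names are made up for illustration.

```python
import errno
import re

# Pattern for the "disk cache read error" lines quoted in this thread.
CACHE_ERR = re.compile(
    r"afs: disk cache read error in CacheItems slot (?P<slot>\d+) "
    r"off (?P<off>\d+)/(?P<size>\d+) code (?P<code>-?\d+)/(?P<want>\d+)"
)


def decode_cache_error(line):
    """Return the decoded fields of one log line, or None if it does not match."""
    m = CACHE_ERR.search(line)
    if m is None:
        return None
    f = {k: int(v) for k, v in m.groupdict().items()}
    # Assumption: a negative code is a negated errno value.
    f["errno_name"] = (
        errno.errorcode.get(-f["code"], "unknown") if f["code"] < 0 else None
    )
    # Assumption: the file is a fixed header followed by `want`-byte entries,
    # so the header length is what remains after slot * want.
    f["header"] = f["off"] - f["slot"] * f["want"]
    f["total_slots"] = (f["size"] - f["header"]) // f["want"]
    return f


line = ("[ 8245.094223] afs: disk cache read error in CacheItems slot 211006 "
        "off 16880500/19660820 code -4/80")
print(decode_cache_error(line))
```

Under those assumptions, every offset quoted in this thread equals slot * 80 + 20, so the reads target a consistent entry position inside the file rather than a random offset. If the code really is a negated errno, -4 is EINTR on Linux, while EIO would be -5. That may be worth mentioning when sending the traces.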