-------- Forwarded Message --------
From: Jason Kendall <jakend...@gmail.com>
To: Ben Hutchings <b...@decadent.org.uk>
Subject: Re: Bug#599823: linux-2.6: XEN and NFS causes duplicate filenames with large directories
Date: Mon, 11 Oct 2010 14:50:39 -0400

On 10-10-11 01:19 PM, Ben Hutchings wrote:
> On Mon, Oct 11, 2010 at 12:49:33PM -0400, Jason Kendall wrote:
>> Package: linux-2.6
>> Severity: important
>> Tags: upstream
>
> Which version?

The uname output was further down in the report (I used reportbug, so it should have been there). At the time of the report it was 2.6.32-5-686-bigmem.

>> 2. Duplicate filenames are given when doing an "ls"
>> 3. Trigger happens when a rename (mv) happens on a directory with a large
>>    number of files.
>> 4. Does not matter which machine does the rename/mv (Any box connected to
>>    the NFS) the duplicate filenames still show up under DomU
>> 5. Does not appear to happen to directories with a limited number of files.
>>    I have one directory with > 9k files which this does happen on (mail
>>    directory)
>
> This is probably an effect of the NFS block size - any directory smaller
> than a single block is likely to be readable atomically.

I upped the block size and see the same issue. The mount options are now:

(rw,rsize=32768,wsize=32768,hard,fg,nolock,nfsvers=3,tcp,actimeo=0,addr=10.0.0.7)

Prior to that, it was just mounted with defaults.

>> 6. To clear the issue, you have to either rename the file back to the
>>    original, or reboot the DomU
>
> This last point is the troubling one. If this condition was transient I
> would be tempted to say it's not a bug. It sounds like the client treats
> its version of the directory as being correct as of the time the directory
> listing was completed, whereas it should either (1) treat the listing as
> correct at the time the directory listing started, therefore stale when
> the directory is next read; or (2) detect that the directory changed and so
> discard the listing from its cache immediately.
>
>> A little direction on how to continue diagnosing this issue, or a fix :)
>> would be good.
>
> Please test Linux 2.6.36-rc6 as packaged in experimental.

Just tested, same issue.

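To make the two timestamping choices in Ben's explanation concrete, here is a toy model in Python. It is only a sketch of the reasoning, not the kernel's NFS client code: `FakeServer`, `ToyDirCache`, `change_id`, and `pending_rename` are all made-up names, and the "torn" listing is simulated by applying a rename halfway through a two-chunk read.

```python
class FakeServer:
    """Hypothetical stand-in for an NFS server directory."""

    def __init__(self, names):
        self.names = list(names)
        self.change_id = 0          # stands in for the directory's mtime
        self.pending_rename = None  # (old, new), applied midway through the next list()

    def rename(self, old, new):
        self.names.remove(old)
        self.names.append(new)
        self.change_id += 1

    def list(self):
        half = len(self.names) // 2
        seen = self.names[:half]            # first chunk (a copy)
        if self.pending_rename:
            self.rename(*self.pending_rename)
            self.pending_rename = None
        seen += self.names[half:]           # second chunk, same offset
        return seen


class ToyDirCache:
    """Toy client cache; stamp_at_start selects Ben's option (1)."""

    def __init__(self, server, stamp_at_start):
        self.server = server
        self.stamp_at_start = stamp_at_start
        self.listing = None
        self.valid_as_of = None

    def readdir(self):
        # Reuse the cached listing while the directory looks unchanged.
        if self.listing is not None and self.valid_as_of == self.server.change_id:
            return self.listing
        before = self.server.change_id
        self.listing = self.server.list()
        after = self.server.change_id
        self.valid_as_of = before if self.stamp_at_start else after
        return self.listing


def demo(stamp_at_start):
    server = FakeServer(["a", "b", "c", "d"])
    server.pending_rename = ("a", "z")
    cache = ToyDirCache(server, stamp_at_start)
    cache.readdir()         # first read is torn in both cases
    return cache.readdir()  # second read shows whether the cache recovers
```

With `demo(False)` the cache is stamped when the listing completes, so the torn listing keeps being served; with `demo(True)` the stamp predates the rename, the next read sees a mismatch, and the directory is listed again. Option (2) would amount to comparing `before` and `after` and not caching at all when they differ.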
Looking at a pcap, the NFS client doesn't appear to be asking the server for the filenames; however, there is a large number of "ACCESS" and "GETATTR" requests. Most are returned as "Directory", a few are returned as "Regular File". Of the regular files, there are 3 returned, all with the same file handle, and they appear to have the same stats. There are matching GETATTR calls prior to each Regular File reply, and a number of requests in between each one.

Touching the file to update the mtime does not resolve the issue. I can't seem to find a way to force an NFS cache flush.

For the record:

r...@mx2:/home/jakendall/Maildir# ls cur/127840* -l
-rw------- 1 jakendall users 79124 Jul 6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Jul 6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Jul 6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa

Doing a umount / mount on the drive doesn't clear it out either:

r...@mx2:/home/jakendall/Maildir# cd /
r...@mx2:/# umount /home
r...@mx2:/# mount /home
r...@mx2:/# cd /home/jakendall/Maildir/
r...@mx2:/home/jakendall/Maildir# cd /home/jakendall/Maildir/
r...@mx2:/home/jakendall/Maildir# ls cur/127840* -l
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
r...@mx2:/home/jakendall/Maildir#

The pcap taken after the remount shows the READDIR for the file, and the file shows up in 3 different Call/Reply pairs, once in each.

Regards,
Jason
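
For anyone trying to reproduce this, the duplicates in the `ls` output above can also be counted directly from the raw directory listing, without shell globbing or sorting in the way. The following is a minimal sketch, assuming Python 3 is available on the DomU; the function names are made up for illustration, and the path passed in would be the affected Maildir `cur` directory.

```python
import os
from collections import Counter


def duplicates_in(names):
    """Return {name: count} for every name that occurs more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


def find_duplicate_names(path):
    """List a directory once and report names the listing returned repeatedly."""
    with os.scandir(path) as entries:
        return duplicates_in(entry.name for entry in entries)


if __name__ == "__main__":
    import sys

    # Usage: python3 finddups.py /home/jakendall/Maildir/cur
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, count in sorted(find_duplicate_names(target).items()):
        print(f"{count}x {name}")
```

On a healthy filesystem this should print nothing, since a directory cannot hold two entries with the same name; any output means the client handed back the same name more than once in a single listing.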