Oops, didn't reply all.

---------- Forwarded message ----------
From: Jason Kendall <jakend...@gmail.com>
Date: Mon, Oct 11, 2010 at 2:50 PM
Subject: Re: Bug#599823: linux-2.6: XEN and NFS causes duplicate filenames
with large directories
To: Ben Hutchings <b...@decadent.org.uk>




On 10-10-11 01:19 PM, Ben Hutchings wrote:

> On Mon, Oct 11, 2010 at 12:49:33PM -0400, Jason Kendall wrote:
>
>
>> Package: linux-2.6
>> Severity: important
>> Tags: upstream
>>
>>
> Which version?
>
>
uname was further down in the report (I used reportbug, so it should have
been there). At the time of the report it was 2.6.32-5-686-bigmem.

>> 2. Duplicate filenames are given when doing an "ls"
>> 3. Trigger happens when a rename (mv) happens on a directory with a large
>> number of files.
>> 4. Does not matter which machine does the rename/mv (Any box connected to
>> the NFS) the duplicate filenames still show up under DomU
>> 5. Does not appear to happen to directories with a limited number of
>> files. I have one directory with > 9k files which this does happen on (mail
>> directory)
>>
>>
> This is probably an effect of the NFS block size - any directory smaller
> than a single block is likely to be readable atomically.
>
>
>
I increased the block size and still see the same issue. The mount options
are now:
(rw,rsize=32768,wsize=32768,hard,fg,nolock,nfsvers=3,tcp,actimeo=0,addr=10.0.0.7)

Before that, it was just mounted with the defaults.


>> 6. To clear the issue, you have to either rename the file back to the
>> original, or reboot the DomU
>>
>>
> This last point is the troubling one.  If this condition was transient I
> would be tempted to say it's not a bug.  It sounds like the client treats
> its version of the directory as being correct as of the time the directory
> listing was completed, whereas it should either (1) treat the listing as
> correct at the time the directory listing started, therefore stale when
> the directory is next read; or (2) detect that the directory changed and so
> discard the listing from its cache immediately.
>
>
>
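
To make option (1) above concrete, here is a toy model in Python. It is only a sketch of the idea and is not how the Linux NFS client is implemented; ToyServer, ToyClient and the integer change counter are hypothetical names invented for the illustration. The cached listing is stamped with the directory's change counter as it was when the listing started, so a change that lands during the listing makes the cache stale on the next read.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ToyServer:
    """Hypothetical stand-in for a server exporting a single directory."""
    names: List[str] = field(default_factory=list)
    change: int = 0  # bumped on every modification, like an mtime/change attribute

    def rename(self, old: str, new: str) -> None:
        self.names[self.names.index(old)] = new
        self.change += 1


class ToyClient:
    """Hypothetical client that caches one directory listing."""

    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.cached: Optional[Tuple[int, List[str]]] = None  # (stamp, listing)

    def readdir(self, concurrent_change: Optional[Callable[[], None]] = None) -> List[str]:
        current = self.server.change  # analogous to checking the directory's attributes
        if self.cached is not None and self.cached[0] == current:
            return self.cached[1]  # cache is still valid
        stamp = current  # option (1): stamp taken when the listing STARTS
        listing = list(self.server.names)
        if concurrent_change is not None:
            concurrent_change()  # simulate another machine renaming mid-listing
        self.cached = (stamp, listing)
        return listing
```

If a rename is passed in as concurrent_change, the first readdir() returns the old names, but the stamp no longer matches the server, so the following readdir() re-reads instead of trusting the stale listing.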
>> A little direction on how to continue diagnosing this issue, or a fix :)
>> would be good.
>>
>>
>
> Please test Linux 2.6.36-rc6 as packaged in experimental.
>
>
>
>
Just tested, same issue.

Looking at a pcap, the NFS client doesn't appear to be asking the server for
the filenames; however, there are a large number of "ACCESS" and "GETATTR"
requests. Most are returned as "Directory", a few are returned as "Regular
File". Of the Regular File replies, there are 3, all with the same file
handle and apparently the same stats. There are matching GETATTR calls prior
to each Regular File reply, and a number of requests in between each one.

Touching the file to update the mtime does not resolve the issue.

I can't seem to find a way to force an NFS cache flush.

For the record:

r...@mx2:/home/jakendall/Maildir# ls cur/127840* -l
-rw------- 1 jakendall users 79124 Jul  6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Jul  6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Jul  6 04:10 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa


Doing a umount / mount on the drive doesn't clear it out either:

r...@mx2:/home/jakendall/Maildir# cd /
r...@mx2:/# umount /home
r...@mx2:/# mount /home
r...@mx2:/# cd /home/jakendall/Maildir/
r...@mx2:/home/jakendall/Maildir# cd /home/jakendall/Maildir/
r...@mx2:/home/jakendall/Maildir# ls cur/127840* -l
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
-rw------- 1 jakendall users 79124 Oct 11 14:33 cur/1278403851.H569630P9192.mx1.ostlabs.com:2,Sa
r...@mx2:/home/jakendall/Maildir#
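
For anyone trying to reproduce this, a small Python helper along these lines could check a whole directory for repeated names rather than eyeballing "ls" output. It is a minimal sketch: the function names duplicate_names and duplicate_entries are made up here, and it assumes os.listdir passes through whatever names the directory read returns, without de-duplicating them.

```python
import os
from collections import Counter
from typing import Dict, Iterable


def duplicate_names(names: Iterable[str]) -> Dict[str, int]:
    """Return {name: count} for every name that occurs more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


def duplicate_entries(directory: str) -> Dict[str, int]:
    """List `directory` once and report any names the listing repeated."""
    # os.listdir returns the entries as read, in arbitrary order.
    return duplicate_names(os.listdir(directory))


# Example use (the path is whatever directory shows the problem):
#   for name, n in sorted(duplicate_entries("cur").items()):
#       print(f"{n}x {name}")
```

On a healthy filesystem duplicate_entries should return an empty dict; on the affected mount the file above would presumably show up with a count of 3.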

The pcap taken after this does show the READDIR, and the file shows up in 3
different call/reply pairs, once in each.

Regards,
Jason
