We have an NFS server running on Ubuntu 12.04, and after upgrading one client from 10.04 to 12.04 the other day we are hitting a similar (possibly the same) problem. The server setup has not been touched for months.
We have a directory with a lot of .xml files (~1009700 of them). Running an ls on this directory from another client running 12.04 initially produced this message:

  [423354.265296] NFS: directory xxx/OLD contains a readdir loop.Please contact your server vendor. The file: 900015.xml\xffffffa1;s0z\xffffffda\xffffffa0\xffffffa0\xffffff91]c\x03\xffffff88\xffffffff\xffffffffml\xffffffa3\xffffffa3\x1b\xfffffff1' \xffffffb0\xffffff91]c\x03\xffffff88\xffffffff\xffffffffml#q%G\xffffff8c\xffffffa0\xffffffc0\xffffff91]c\x03\xffffff88\xffffffff\xffffffffxml\xffffffc4\xffffffe3>\xffffff9f\xffffffa8\xffffffd0\xffffff91\xffffff91]c\x03\xffff\x0f\xffffffbf\xfffffff0\xffffff91]c\x03\xffffff88\xffffffff\xffffffffxml}\xffffff9e\xffffff88\xffffffc3P has duplicate cookie 514419709fml\xffffffbb\xffffffb6\xfffffff2

Doing a cp -a on this file, removing the original and moving the copy back into place fixes the corrupted filename, but the duplicate cookie problem remains.

Running find | sort on the server and on the clients and diffing the output reveals no difference for the 10.04 clients, but on the 12.04 client (with the problematic file moved away) we get ~10 duplicate entries in the output. Our 10.04 clients seem unaffected.

I've tried a 12.04 client with kernel 3.8, which shows the same problem. I've also tried mounting with different NFS versions; the only change was that with nfsvers=2 I managed to list ~700k files before it broke (as opposed to ~300k files otherwise).
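For anyone who wants to check a client for the same symptom without diffing find output by hand, here is a minimal sketch of the duplicate-entry check in Python. It is not the procedure used above, only an approximation of it: it reads one directory listing on the client and reports names that come back more than once. The function names `list_directory` and `duplicate_entries` are made up for this example, and the sketch assumes the whole listing fits in memory.

```python
import os
import sys
from collections import Counter


def list_directory(path):
    """Read the names of one directory in the order the OS returns them.

    Returns (names, error). If the listing fails part-way (for example
    with ELOOP, as rsync reports), the names read so far are still
    returned together with the OSError.
    """
    names = []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                names.append(entry.name)
    except OSError as err:
        return names, err
    return names, None


def duplicate_entries(names):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__" and len(sys.argv) == 2:
    names, err = list_directory(sys.argv[1])
    dups = duplicate_entries(names)
    print(f"{len(names)} entries read, {len(dups)} names seen more than once")
    for name, n in sorted(dups.items()):
        print(f"{n}x {name}")
    if err is not None:
        print(f"listing stopped early: {err}")
```

Run on the affected client with the directory path as the only argument. A healthy listing should report zero repeated names; the count of entries read before an error gives the same kind of "how far did it get" figure as the ~300k and ~700k numbers above.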
The problem also breaks rsync:

  rsync: readdir("/the-path/OLD"): Too many levels of symbolic links (40)

Server information:

  Linux xxx 3.2.0-39-generic #62-Ubuntu SMP Thu Feb 28 00:28:53 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
  /dev/drbd0 on /data type ext3 (rw,noatime)
  ii  nfs-common         1:1.2.5-3ubuntu3.1  NFS support files common to client and server
  ii  nfs-kernel-server  1:1.2.5-3ubuntu3.1  support for NFS kernel server

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1240143

Title:
  NFS client reports a 'readdir loop' with a corrupt name

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  We have an NFS server running on a Red Hat system. One particular directory contains many, many RPMs (96850). The client reports that there is a 'readdir loop', and the file named in the warning has a corrupted name. I assume the name corruption is happening on the Linux kernel end, not the server end:

  "NFS: directory Development/rpms contains a readdir loop.Please contact your server vendor. The file: foo-bar-11.0flange-12345.AB5.x86_64.rpmmpmpmmT53 has duplicate cookie 1110018804"

  "NFS: directory Development/rpms contains a readdir loop.Please contact your server vendor. The file: widget-wiggle-11.0-12356.AB5.x86_64.rpmpm.AB5.x86_64.rpm\xffffffffm has duplicate cookie 353422206"

  Since the corrupted names are never displayed in an 'ls' of the directory (even whilst the problem is occurring), I assume that this is a presentation problem in the warning message.

  Unfortunately the problem had gone away by the time I tried using tcpdump to capture the on-the-wire data.
  jfletcher@gromit:~$ cat /proc/version
  Linux version 3.2.0-29-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012
  jfletcher@gromit:~$ lsb_release -rd
  Description: Ubuntu 12.04.3 LTS
  Release: 12.04

  The lspci information would not be useful - the system was running under KVM, with a single interface.
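To collect the warnings from several clients for comparison, the kernel messages quoted in this report can be picked apart with a regular expression. The sketch below is only that: the pattern is derived from the message text as it appears in this thread, not from the kernel source, so other kernel versions may word the line differently. `READDIR_LOOP_RE` and `parse_readdir_loop` are names invented for this example.

```python
import re

# Pattern built from the warning text quoted in this report (an assumption,
# not taken from the kernel source). \s* tolerates the missing space after
# "loop." that appears in the quoted messages.
READDIR_LOOP_RE = re.compile(
    r"NFS: directory (?P<directory>.+?) contains a readdir loop\."
    r"\s*Please contact your server vendor\.\s*"
    r"The file: (?P<name>.*) has duplicate cookie (?P<cookie>\d+)"
)


def parse_readdir_loop(line):
    """Extract directory, reported file name and cookie from one log line.

    Returns a dict, or None if the line is not a readdir-loop warning.
    The file name is returned exactly as logged, including any corrupted
    trailing bytes, since those are part of what is being investigated.
    """
    match = READDIR_LOOP_RE.search(line)
    if match is None:
        return None
    return {
        "directory": match.group("directory"),
        "name": match.group("name"),
        "cookie": int(match.group("cookie")),
    }
```

Feeding it each line of dmesg output and grouping the results by cookie would show whether the same cookie is reported repeatedly, and which (possibly corrupted) names accompany it.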