Verification for disco. I installed 5.0.0-40-generic from -proposed on a disco box:
$ uname -rv
5.0.0-40-generic #44-Ubuntu SMP Wed Jan 15 02:03:45 UTC 2020

I do not have access to a multi tier cifs mount, so I will try and connect to a known one with a bad username / password to ensure that the server forwarding and caching works as expected, since that occurs before authentication.

I enabled tracing with:

# modprobe cifs
# echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
# echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
# echo 7 > /proc/fs/cifs/cifsFYI

To mount a cifs share, you need cifs-utils:

$ sudo apt install cifs-utils

From there I accessed the multi tier cifs server:

$ sudo mount -v -t cifs //<domain>/<toplevel>/<country>/<sharename> -o defaults,user=aaa /mnt/share

Checking dmesg we get:

Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: search path: \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: cache miss
fs/cifs/dfs_cache.c: do_dfs_cache_find: DFS referral request for \<Actual DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Actual DFS Server>\Root\Country\<Share>>
fs/cifs/misc.c: num_referrals: 1 dfs flags: 0x2 ...
fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: <Actual File Server> to <IPV4 Address>
fs/cifs/connect.c: Username: aaa

This is in line with what is expected, since it resolves the next tier file server instead of failing and going back up the tree.

I had also previously supplied the customer with a test build based on the previous bionic hwe kernel, 5.0.0-37-generic #40~18.04.1, with the commit applied, and all was tested and working.
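
For anyone comparing several captures, the check above can be roughly automated. This is only a sketch under stated assumptions: classify_dfs_log is a hypothetical helper (it is not part of cifs-utils), it expects the dmesg output to have been saved to a file, and its heuristic rests solely on the two signatures seen in this bug: a "dns_resolve_server_name_to_ip: resolved:" line when the next tier is reached, and STATUS_NOT_FOUND when the referral request backtracks.

```shell
#!/bin/sh
# Hypothetical helper: classify a saved cifs debug log,
# e.g. one captured with `dmesg > cifs.log` after a mount attempt.
# The patterns are taken from the traces in this bug report only;
# other failure modes are reported as "inconclusive".
classify_dfs_log() {
    log="$1"
    if grep -q 'STATUS_NOT_FOUND' "$log"; then
        # Seen in the failing 5.0 trace: the referral request went to
        # a server that does not hold the share.
        echo "backtracked"
    elif grep -q 'dns_resolve_server_name_to_ip: resolved:' "$log"; then
        # Seen in the working traces: the next tier server was resolved.
        echo "descended"
    else
        echo "inconclusive"
    fi
}
```

Example use, assuming the log was saved as cifs.log: classify_dfs_log cifs.log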

With my verification of the disco kernel from -proposed and the customer's prior test kernel, I am happy to mark this as verified.

Next, I just need this patch to land in eoan, and then into a bionic HWE kernel to get this fixed for the customer.

** Tags removed: verification-needed-disco
** Tags added: verification-done-disco

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854887

Title:
  cifs: DFS Caching feature causing problems traversing multi-tier DFS setups

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Disco: Fix Committed
Status in linux source package in Eoan: Fix Committed
Status in linux source package in Focal: Fix Released

Bug description:

BugLink: https://bugs.launchpad.net/bugs/1854887

[Impact]

There is a problem where kernels 5.0-rc1 and onwards cannot mount a multi tier cifs DFS setup, while kernels 4.20 and below can mount the share fine.

The DFS tiering structure looks like this:

Domain virtual DFS (i.e. \\company.com\folders\share)
|-- Domain controller DFS (i.e. \\regional-dc.company.com\folders\share)
    |-- Regional DFS Server (i.e. \\regional-dfs.company.com\folders\share)
        |-- Actual file server (i.e. \\regional-svr.company.com\share)

On the 5.x series kernels, after following the DFS referrals list through to the Regional DFS Server, which responds with the correct server/share, the kernel does not go on to the Actual file server. Instead it backtracks from the Regional DFS Server to the Domain controller and requests the share there. This share does not exist on the Domain controller, as it only exists on the Actual file server, so the connection dies.

We have collected a packet capture, and the flow looks like this:

Legend:
--------------------------------------------------
DC  = Domain Controller / Domain DFS Root
RDC = Regional Domain Controller / Domain DFS Root
RDS = Regional DFS Server
AFS = Actual File Server

4.18.0-21-generic Ubuntu kernel - Good

Host: request/response
--------------------------------------------------------------------
DC: company.com\folders
DC: Referral List
RDC: start convo
RDC: <Regional Domain Controller>\Folders\Country\<Share> referral
RDC: <Regional Domain Controller>\Folders\Country\<Share> referral
RDS: start convo
RDS: <Regional DFS Server>\Root\Country\<Share>
RDS: STATUS_PATH_NOT_COVERED
RDS: request referrals
RDS: Referral List
AFS: convo started
AFS: <Actual File Server>\<Share>
AFS: Good response

5.0.0-26-generic Ubuntu kernel - Bad

Host: request/response
------------------------------------------------------------
DC: company.com\folders
RDC: start convo
RDC: <Regional Domain Controller>\Folders\Country\<Share>
RDC: STATUS_PATH_NOT_COVERED
RDS: start convo
RDS: <Regional DFS Server>\Root\Country\<Share>
RDS: STATUS_PATH_NOT_COVERED
RDC: <Regional DFS Server>\Root\Country\<Share>
RDC: STATUS_PATH_NOT_COVERED

From there the debugging output was more or less the same between the two kernel versions, until the problematic area:

Linux 4.18:
Full log: https://paste.ubuntu.com/p/D9XwBbvTXc/

Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/misc.c: num_referrals: 1 dfs flags: 0x2 ...
fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: <Actual File Server> to <IPV4 Address>
fs/cifs/connect.c: Username: XXX
// mounts the share successfully

Linux 5.0:
Full log: https://paste.ubuntu.com/p/9sXPj7WMQv/

Status code returned 0xc0000257 STATUS_PATH_NOT_COVERED
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000257 to POSIX err -66
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/connect.c: build_unc_path_to_root: full_path=\\<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: search path: \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/dfs_cache.c: do_dfs_cache_find: cache miss
fs/cifs/dfs_cache.c: do_dfs_cache_find: DFS referral request for \<Regional DFS Server>\Root\Country\<Share>
fs/cifs/smb2ops.c: smb2_get_dfs_refer path <\<Regional DFS Server>\Root\Country\<Share>>
fs/cifs/smb2pdu.c: SMB2 IOCTL Status code returned 0xc0000225 STATUS_NOT_FOUND
fs/cifs/smb2maperror.c: Mapping SMB2 status code 0xc0000225 to POSIX err -2
// mounting the share fails shortly after

This has quite a big impact on customers who need to mount their multi-tier DFS shares, as they have to remain on the 4.15 bionic kernel and cannot use the HWE kernel for their machines.

[Fix]

After some debugging, I narrowed the cause down to a new DFS caching feature introduced in 5.0-rc1.
I started a discussion with the upstream maintainer of cifs, which you can read here:

https://lore.kernel.org/linux-cifs/05aa2995-e85e-0ff4-d003-5bb08bd17...@canonical.com/T/#u

This discussion resulted in the below upstream commit, which was merged in the 5.5 development window:

commit 5bb30a4dd60e2a10a4de9932daff23e503f1dd2b
Author: Paulo Alcantara (SUSE) <p...@cjr.nz>
Date: Fri Nov 22 12:30:56 2019 -0300
Subject: cifs: Fix retrieval of DFS referrals in cifs_mount()

You can read it here:
https://github.com/torvalds/linux/commit/5bb30a4dd60e2a10a4de9932daff23e503f1dd2b

This commit makes referral requests go to the most recently resolved root server, instead of to older ones further up the chain. This ensures that we keep descending down the tree instead of backtracking, which is what was happening.

This commit has been submitted for upstream -stable, and is still being processed. The commit is needed on kernels 5.0 and up. I will update this section if it is accepted for -stable.

[Testcase]

To test this commit you need a multi-tier cifs DFS with a similar structure to the tree shown in the Impact section.

From there, you simply try to mount a cifs share. On patched kernels, the mount will succeed. On broken kernels, the mount will fail.

I have prepared a test kernel for Bionic HWE, based on 5.0.0-37.40~18.04, which you can find here:

https://launchpad.net/~mruffell/+archive/ubuntu/sf245466-test

This test kernel has been tested by the customer and mounts the cifs DFS correctly.

[Regression Potential]

I believe the risk of regression for this commit is low. All changes are limited to DFS within cifs, and only change which root server referral requests are sent to.

The commit is a clean cherry pick for disco, eoan and focal. The maintainer has submitted the commit for upstream -stable, and we have tested the commit with the customer, and things are now working as intended.
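
The test case described above could be scripted roughly as follows. This is a sketch, not a supported tool: try_dfs_mount and the MOUNT_CMD override are hypothetical names introduced here for illustration, the share path and mount point must be supplied for your own environment, and the only options passed are the ones used in the mount command earlier in this bug. It needs cifs-utils installed and must run as root.

```shell
#!/bin/sh
# Hypothetical wrapper around the test case: attempt a cifs mount of a
# multi-tier DFS share and report the result.
# MOUNT_CMD is an illustrative override for dry runs; by default the
# real mount(8) is called.
try_dfs_mount() {
    share="$1"        # e.g. //<domain>/<toplevel>/<country>/<sharename>
    mountpoint="$2"   # e.g. /mnt/share
    user="$3"
    if ${MOUNT_CMD:-mount} -v -t cifs "$share" "$mountpoint" -o "defaults,user=$user"; then
        echo "PASS: $share mounted on $mountpoint"
    else
        echo "FAIL: could not mount $share, check dmesg for the referral trace"
        return 1
    fi
}
```

A patched kernel is expected to print PASS for a share that sits behind several DFS tiers; on an affected 5.x kernel the same call is expected to print FAIL.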