On Thu, Jun 5, 2025 at 9:46 AM Stephen Smalley <stephen.smalley.w...@gmail.com> wrote:
>
> On Thu, Jun 5, 2025 at 12:03 AM Collin Funk <collin.fu...@gmail.com> wrote:
> >
> > Hi,
> >
> > Using the following testdir:
> >
> > $ git clone https://git.savannah.gnu.org/git/gnulib.git && cd gnulib
> > $ ./gnulib-tool --create-testdir --dir testdir1 --single-configure `./gnulib-tool --list | grep acl`
> >
> > I see the following result:
> >
> > $ cd testdir1 && ./configure && make check
> > [...]
> > FAIL: test-copy-acl.sh
> > [...]
> > FAIL: test-file-has-acl.sh
> >
> > This occurs with these two kernels:
> >
> > $ uname -r
> > 6.14.9-300.fc42.x86_64
> > $ uname -r
> > 6.14.8-300.fc42.x86_64
> >
> > But with this kernel:
> >
> > $ uname -r
> > 6.14.6-300.fc42.x86_64
> >
> > The result is:
> >
> > $ cd testdir1 && ./configure && make check
> > [...]
> > PASS: test-copy-acl.sh
> > [...]
> > PASS: test-file-has-acl.sh
> >
> > Here is the test-suite.log from 6.14.9-300.fc42.x86_64:
> >
> > FAIL: test-copy-acl.sh
> > ======================
> >
> > /home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl:
> > preserving permissions for 'tmpfile2': Numerical result out of range
> > FAIL test-copy-acl.sh (exit status: 1)
> >
> > FAIL: test-file-has-acl.sh
> > ==========================
> >
> > file_has_acl("tmpfile0") returned no, expected yes
> > FAIL test-file-has-acl.sh (exit status: 1)
> >
> > To investigate further, I created the testdir again after applying the
> > following diff:
> >
> > diff --git a/tests/test-copy-acl.sh b/tests/test-copy-acl.sh
> > index 061755f124..f9457e884f 100755
> > --- a/tests/test-copy-acl.sh
> > +++ b/tests/test-copy-acl.sh
> > @@ -209,7 +209,7 @@ cd "$builddir" ||
> >  {
> >    echo "Simple contents" > "$2"
> >    chmod 600 "$2"
> > -  ${CHECKER} "$builddir"/test-copy-acl${EXEEXT} "$1" "$2" || exit 1
> > +  ${CHECKER} strace "$builddir"/test-copy-acl${EXEEXT} "$1" "$2" || exit 1
> >    ${CHECKER} "$builddir"/test-sameacls${EXEEXT} "$1" "$2" || exit 1
> >    func_test_same_acls "$1" "$2" || exit 1
> >  }
> >
> > Then running the test from inside testdir1/gltests:
> >
> > $ ./test-copy-acl.sh
> > [...]
> > access("/etc/selinux/config", F_OK) = 0
> > openat(AT_FDCWD, "tmpfile0", O_RDONLY) = 3
> > fstat(3, {st_mode=S_IFREG|0610, st_size=16, ...}) = 0
> > openat(AT_FDCWD, "tmpfile2", O_WRONLY) = 4
> > fchmod(4, 0610) = 0
> > flistxattr(3, NULL, 0) = 17
> > flistxattr(3, 0x7ffda3f6c900, 17) = -1 ERANGE (Numerical result out of range)
> > write(2, "/home/collin/.local/src/gnulib/t"..., 63/home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl: ) = 63
> > write(2, "preserving permissions for 'tmpf"..., 37preserving permissions for 'tmpfile2') = 37
> > write(2, ": Numerical result out of range", 31: Numerical result out of range) = 31
> > write(2, "\n", 1
> > ) = 1
> > exit_group(1) = ?
> > +++ exited with 1 +++
> >
> > So, we get the buffer size from 'flistxattr(3, NULL, 0)' and then call
> > it again after allocating it 'flistxattr(3, 0x7ffda3f6c900, 17)'. This
> > shouldn't fail with ERANGE then.
> >
> > To confirm, I replaced 'strace' with 'gdb --args'. Here is the result:
> >
> > (gdb) b qcopy_acl
> > Breakpoint 1 at 0x400a10: file qcopy-acl.c, line 84.
> > (gdb) run
> > Starting program:
> > /home/collin/.local/src/gnulib/testdir1/gltests/test-copy-acl tmpfile0
> > tmpfile2
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> >
> > Breakpoint 1, qcopy_acl (src_name=src_name@entry=0x7fffffffd7c3
> > "tmpfile0", source_desc=source_desc@entry=3,
> > dst_name=dst_name@entry=0x7fffffffd7cc "tmpfile2",
> > dest_desc=dest_desc@entry=4, mode=mode@entry=392) at qcopy-acl.c:84
> > 84 ret = chmod_or_fchmod (dst_name, dest_desc, mode);
> > (gdb) n
> > 90 if (ret == 0)
> > (gdb) n
> > 92 ret = source_desc <= 0 || dest_desc <= 0
> > (gdb) s
> > attr_copy_fd (src_path=src_path@entry=0x7fffffffd7c3 "tmpfile0",
> > src_fd=src_fd@entry=3, dst_path=dst_path@entry=0x7fffffffd7cc "tmpfile2",
> > dst_fd=dst_fd@entry=4, check=check@entry=0x4009b0
> > <is_attr_permissions>, ctx=ctx@entry=0x0) at libattr/attr_copy_fd.c:73
> > 73 if (check == NULL)
> > (gdb) n
> > 76 size = flistxattr (src_fd, NULL, 0);
> > (gdb) n
> > 77 if (size < 0) {
> > (gdb) print size
> > $1 = 17
> > (gdb) n
> > 86 names = (char *) my_alloc (size+1);
> > (gdb) n
> > 92 size = flistxattr (src_fd, names, size);
> > (gdb) print errno
> > $2 = 0
> > (gdb) n
> > 93 if (size < 0) {
> > (gdb) print size
> > $3 = -1
> > (gdb) print errno
> > $4 = 34
> >
> > After confirming with the Fedora Kernel tags [1], I am fairly confident
> > that it was caused by this commit [2].
> >
> > But I am not familiar enough with ACLs, SELinux, or the Kernel to know
> > the fix.
> >
> > Adding the lists where this was discussed and some of the signers to CC,
> > since they will know better than me.
>
> Thank you for the bug report. Looks like the security xattr handling
> is somehow replacing the overall length with just the length of the
> security.selinux xattr rather than adding it to the length of the acl
> xattr. Will check to see if this is already fixed on vfs.fixes; if
> not, will look into a fix although it wasn't immediately obvious to me
> why this is happening. There is also another patch related to this
> pending that is supposed to go through the LSM tree which might fix
> it.

Sorry, mea culpa; should be fixed by
https://lore.kernel.org/selinux/20250605164852.2016-1-stephen.smalley.w...@gmail.com/

>
> >
> > Collin
> >
> > [1] https://gitlab.com/cki-project/kernel-ark
> > [2] https://github.com/torvalds/linux/commit/8b0ba61df5a1c44e2b3cf683831a4fc5e24ea99d
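
For anyone following along, here is a minimal sketch of the probe-then-read pattern seen in the strace and gdb output above (size query with a NULL buffer, then a second flistxattr() call with an allocated buffer), extended to grow the buffer when the second call returns ERANGE. The helper names list_xattrs_grow and print_xattr_names, and the LIST_CAP limit, are hypothetical and chosen only for illustration; this is not the libattr or gnulib code. Growing on ERANGE is a generic defensive pattern for the case where the attribute list changes between the two calls; whether it would mask the kernel regression discussed here has not been verified.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Assumed upper bound for the name list; chosen for this sketch only. */
#define LIST_CAP ((size_t) 65536)

/* Hypothetical helper: fetch the NUL-separated xattr name list of fd.
   On success, stores a malloc'd buffer in *out and returns its length
   in bytes (0 if there are no attributes).  On failure, returns -1
   with errno set and *out == NULL.  */
ssize_t list_xattrs_grow(int fd, char **out)
{
    *out = NULL;

    /* Step 1: probe for the required size with a NULL buffer. */
    ssize_t want = flistxattr(fd, NULL, 0);
    if (want < 0)
        return -1;

    /* Use at least 1 byte so the second call is a real read, not a probe. */
    size_t cap = want > 0 ? (size_t) want : 1;

    for (;;) {
        char *buf = malloc(cap + 1);
        if (buf == NULL) {
            errno = ENOMEM;
            return -1;
        }

        /* Step 2: read the names into the allocated buffer. */
        ssize_t got = flistxattr(fd, buf, cap);
        if (got >= 0) {
            buf[got] = '\0';
            *out = buf;
            return got;
        }

        int saved = errno;
        free(buf);
        errno = saved;

        /* Only ERANGE means "buffer too small"; give up at the cap. */
        if (saved != ERANGE || cap >= LIST_CAP)
            return -1;

        cap = cap * 2 + 64;
        if (cap > LIST_CAP)
            cap = LIST_CAP;
    }
}

/* Hypothetical helper: print each name in a list of length len. */
void print_xattr_names(const char *names, ssize_t len)
{
    for (ssize_t i = 0; i < len; i += (ssize_t) strlen(names + i) + 1)
        printf("%s\n", names + i);
}
```

A caller would pass an open descriptor, for example list_xattrs_grow(fd, &names), then print_xattr_names(names, n) and free(names). As a side note, "security.selinux" is 16 characters, so with its terminating NUL it occupies 17 bytes, which matches the size returned by the probe in the trace.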