Eryk Sun <[email protected]> added the comment:
I'm trying to give os.link() and follow_symlinks the benefit of the doubt, but
the implementation just seems buggy to me.
POSIX says that "[i]f path1 names a symbolic link, it is implementation-defined
whether link() follows the symbolic link, or creates a new link to the symbolic
link itself" [1]. In Linux, link() does not follow symlinks. To follow a
symlink, one has to call linkat() with AT_SYMLINK_FOLLOW:

    AT_SYMLINK_FOLLOW (since Linux 2.6.18)
        By default, linkat(), does not dereference oldpath if it is a
        symbolic link (like link()). The flag AT_SYMLINK_FOLLOW can be
        specified in flags to cause oldpath to be dereferenced if it is
        a symbolic link.

The behavior is apparently the same in FreeBSD [2].
Thus the following implementation in os.link() seems buggy.

    #ifdef HAVE_LINKAT
        if ((src_dir_fd != DEFAULT_DIR_FD) ||
            (dst_dir_fd != DEFAULT_DIR_FD) ||
            (!follow_symlinks))
            result = linkat(src_dir_fd, src->narrow,
                dst_dir_fd, dst->narrow,
                follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
        else
    #endif /* HAVE_LINKAT */
            result = link(src->narrow, dst->narrow);

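To make the consequence of that condition easier to see, here is a small Python model of the dispatch. It is only a sketch, not the CPython code: effective_follow is a hypothetical name, AT_FDCWD stands in for DEFAULT_DIR_FD, and it assumes Linux semantics, where plain link() never dereferences the source.

```python
AT_FDCWD = -100  # stand-in for DEFAULT_DIR_FD; this is the Linux value


def effective_follow(src_dir_fd=AT_FDCWD, dst_dir_fd=AT_FDCWD,
                     follow_symlinks=True):
    """Model the quoted C condition (hypothetical helper).

    Returns (syscall_name, source_is_dereferenced).
    """
    if (src_dir_fd != AT_FDCWD or dst_dir_fd != AT_FDCWD
            or not follow_symlinks):
        # linkat() is called; AT_SYMLINK_FOLLOW is set only if requested.
        return 'linkat', follow_symlinks
    # Fallback branch: link(), which on Linux never follows symlinks.
    return 'link', False


if __name__ == '__main__':
    # Print what each combination of arguments ends up doing.
    for src_fd in (AT_FDCWD, 3):
        for follow in (True, False):
            print(src_fd, follow,
                  effective_follow(src_dir_fd=src_fd, follow_symlinks=follow))
```

In this model the only combination that dereferences the source is follow_symlinks=True together with a non-default directory descriptor.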
In Linux, the value of follow_symlinks only matters if src_dir_fd or
dst_dir_fd is used with a real file descriptor (i.e. not DEFAULT_DIR_FD, which
is AT_FDCWD). Otherwise the source is never dereferenced, so the default True
value of follow_symlinks is an outright lie. For example:

    >>> os.link in os.supports_follow_symlinks
    True
    >>> open('spam', 'w').close()
    >>> os.symlink('spam', 'spamlink1')
    >>> os.link('spamlink1', 'spamlink2')

spamlink2 was created as a hardlink to spamlink1, not its target, i.e. it's a
symlink:

    >>> os.lstat('spamlink1').st_ino == os.lstat('spamlink2').st_ino
    True
    >>> os.readlink('spamlink2')
    'spam'

In contrast, if src_dir_fd is passed, then follow_symlinks=True is implemented
as advertised (via AT_SYMLINK_FOLLOW):

    >>> fd = os.open('.', 0)
    >>> os.link('spamlink1', 'spamlink3', src_dir_fd=fd)

spamlink3 was created as a hardlink to spam, the target of spamlink1:

    >>> os.lstat('spam').st_ino == os.lstat('spamlink3').st_ino
    True

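The two sessions above can be folded into one script that runs in a temporary directory and reports which inode the new hard link shares. This is a sketch that assumes a Linux-like system where hard links to symlinks are allowed and os.link() supports src_dir_fd; classify_hardlink and classify_with_dir_fd are hypothetical names. What the default call reports depends on the Python version and platform, so no expected output is shown for it.

```python
import os
import tempfile


def classify_hardlink(**link_kwargs):
    """Hard-link a symlink and report what the new name refers to.

    Returns 'symlink' if the new link shares an inode with the symlink
    itself, 'target' if it shares an inode with the symlink's target.
    """
    with tempfile.TemporaryDirectory() as tmp:
        spam = os.path.join(tmp, 'spam')
        spamlink = os.path.join(tmp, 'spamlink')
        newlink = os.path.join(tmp, 'newlink')
        open(spam, 'w').close()
        os.symlink('spam', spamlink)
        os.link(spamlink, newlink, **link_kwargs)
        new_ino = os.lstat(newlink).st_ino
        if new_ino == os.lstat(spamlink).st_ino:
            return 'symlink'
        if new_ino == os.lstat(spam).st_ino:
            return 'target'
        return 'unknown'


def classify_with_dir_fd():
    """Same check, but passing a real descriptor as src_dir_fd."""
    fd = os.open('.', os.O_RDONLY)
    try:
        # The paths are absolute, so fd is not used for lookup; it only
        # changes which code path os.link() takes.
        return classify_hardlink(src_dir_fd=fd)
    finally:
        os.close(fd)


if __name__ == '__main__':
    print('default:              ', classify_hardlink())
    print('follow_symlinks=False:', classify_hardlink(follow_symlinks=False))
    print('src_dir_fd given:     ', classify_with_dir_fd())
```

If the first and third lines of output differ, the interpreter shows the inconsistency described here.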
That the value of an unrelated parameter -- src_dir_fd -- changes the behavior
of the follow_symlinks parameter is obviously a bug that should be addressed.
POSIX mandates that "[i]f both fd1 and fd2 have value AT_FDCWD, the behavior
shall be identical to a call to link(), except that symbolic links shall be
handled as specified by the value of flag". It's already using AT_FDCWD as a
default value, so the implementation of os.link() should just unconditionally
call linkat() if it's available. Then the value of follow_symlinks, true or
false, will be honored, with or without passing src_dir_fd or dst_dir_fd.
That said, since os.link() hasn't been working as advertised, this change needs
to be accompanied by changing the default value of follow_symlinks to False.
That will retain the status quo behavior for most systems, except in the rare
case that src_dir_fd or dst_dir_fd is used. If it isn't changed to False, then
suddenly os.link() calls will start following symlinks, whereas prior to the
change they did not because link() was being called instead of linkat().
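Until something like that lands, callers who need predictable behavior can make the choice explicit. The following is a minimal sketch, not a proposed patch: link_honoring_follow is a hypothetical helper, it assumes a platform where os.link() accepts follow_symlinks=False (i.e. linkat() is available), and it resolves the symlink in user space with os.path.realpath() rather than via AT_SYMLINK_FOLLOW, so it is not atomic with respect to concurrent changes to the link.

```python
import os


def link_honoring_follow(src, dst, *, follow_symlinks=False):
    """Hard-link src to dst, honoring follow_symlinks without a dir_fd.

    Hypothetical helper; not part of the os module.
    """
    if follow_symlinks:
        # Resolve the symlink chain ourselves, then link the final target.
        src = os.path.realpath(src)
    # With follow_symlinks=False, os.link() calls linkat() without
    # AT_SYMLINK_FOLLOW, so src is linked exactly as named.
    os.link(src, dst, follow_symlinks=False)
```

The default here is False to match the status quo behavior discussed above; passing follow_symlinks=True links the symlink's target instead.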
---
In Windows, CreateHardLinkW [3] is incorrectly documented as following symlinks
(i.e. "[i]f the path points to a symbolic link, the function creates a hard
link to the target"). Actually, it opens the file to be hard-linked with the
NTAPI option FILE_OPEN_REPARSE_POINT (same as WinAPI
FILE_FLAG_OPEN_REPARSE_POINT). Thus no type of reparse point is followed,
including symlinks.
---
[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html
[2]: https://www.unix.com/man-page/FreeBSD/2/link
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createhardlinkw
----------
nosy: +eryksun
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41355>
_______________________________________