On Sun, Nov 24, 2024 at 8:32 AM Cedric Blancher
<[email protected]> wrote:
>
> On Sat, 23 Nov 2024 at 17:47, Jeremy Drake <[email protected]> wrote:
> >
> > On Sat, 23 Nov 2024, Cedric Blancher via Cygwin wrote:
> >
> > > Good afternoon!
> > >
> > > Does Cygwin do a silly rename if a Cygwin file is open but gets
> > > /bin/rm at the same time?
> >
> > Yes! See function try_to_bin in winsup/cygwin/syscalls.cc:
> > /* Create unique filename. Start with a dot, followed by "cyg"
> > transposed into the Unicode low surrogate area (U+dc00) on file
> > systems supporting Unicode (except Samba), followed by the inode
> > number in hex, followed by a path hash in hex. The combination
> > allows to remove multiple hardlinks to the same file. */
>
> That code is wrong.
>
> bash -c 'printf ".\udc63\udc79\udc67#\n"' | iconv -f UTF-8
> .iconv: illegal input sequence at position 1
>
> 334 RtlAppendUnicodeToString (&recycler,
> 335 (pc.fs_flags () & FILE_UNICODE_ON_DISK
> 336 && !pc.fs_is_samba ())
> 337 ? L".\xdc63\xdc79\xdc67" : L".cyg");
>
> SAMBA is right to reject L".\xdc63\xdc79\xdc67", because it is not a
> valid UTF-16 sequence. ReFS with validation, OpenZFS and so on will
> all REJECT such file names, and neither can NFSv4 because file names
> must be valid Unicode (even if nfsd would not validate then filesystem
> being shared via nfsd will reject that).
> So this can only work on ntfs, and only if it is not validating the
> input UTF.16 sequence.
>
> AFAIK FILE_UNICODE_ON_DISK means that the wchar_t sequences must be
> valid UTF-16, and not just be a random sequence of 16bit values.
>
> @Corinna Vinschen Could this sequence please be changed to a VALID
> UTF-8 sequence, such as \u[fffc]\u[fffc]\u[fffc]? That might work with
> SAMBA, ReFS, OpenZFS NFSv4, ...
That does not help with existing Cygwin installations and Cygwin
32bit, which is stuck at Cygwin 3.3.x ... ;-(
I agree that the L".\xdc63\xdc79\xdc67" prefix will backfire on
something like ReFS, OpenZFS etc (SAMBA uses the prefix for
filesystems which do NOT have |FILE_UNICODE_ON_DISK| set), but for
ms-nfs41-client I just stomp over the issues with this patch (wording
still needs to be improved):
---- snip ----
diff --git a/daemon/setattr.c b/daemon/setattr.c
index 9eaafb5..6e9729e 100644
--- a/daemon/setattr.c
+++ b/daemon/setattr.c
@@ -284,6 +284,46 @@ static int handle_nfs41_rename(void
*daemon_context, setattr_upcall_args *args)
EASSERT((rename->FileNameLength%sizeof(WCHAR)) == 0);
+#define CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE 1
+
+#ifdef CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE
+ /*
+ * Stomp Cygwin "silly rename" invalid Unicode sequence
+ *
+ * Cygwin has it's own variation of "silly rename" (i.e. if
+ * someone deletes a file while someone else still has
+ * a valid fd to that file it first renames that file with a
+ * special prefix, see
+ * newlib-cygwin/winsup/cygwin/syscalls.cc, function
+ * |try_to_bin()|).
+ *
+ * Unfortunately on filesystems supporting Unicode
+ * (i.e. |FILE_UNICODE_ON_DISK|) Cygwin adds the prefix
+ * L".\xdc63\xdc79\xdc67", which is NOT a valid UTF-16 sequence,
+ * and will be rejected by a filesystem validating the
+ * UTF-16 sequence (e.g. SAMBA, ReFS, OpenZFS, ...).
+ * In our case the NFSv4.1 protocol requires valid UTF-8
+ * sequences, and the NFS server will reject filenames if either
+ * the server or the exported filesystem will validate the UTF-8
+ * sequence.
+ *
+ * Since Cygwin only does a |rename()| and never a lookup by
+ * that filename we just stomp the prefix with the prefix used
+ * for non-|FILE_UNICODE_ON_DISK| filesystems.
+ * We ignore the side-effects here, e.g. that Win32 will still
+ * "remember" the original filename in the file name cache.
+ */
+ if ((rename->FileNameLength > (4*sizeof(wchar_t))) &&
+ (!memcmp(rename->FileName,
+ L".\xdc63\xdc79\xdc67", (4*sizeof(wchar_t))))) {
+ DPRINTF(1, ("handle_nfs41_rename(args->path='%s'): "
+ "Cygwin sillyrename prefix \".\\xdc63\\xdc79\\xdc67\" "
+ "detected, squishing prefix to \".cyg\"\n",
+ args->path));
+ (void)memcpy(rename->FileName, L".cyg", 4*sizeof(wchar_t));
+ }
+#endif /* CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE */
+
dst_path.len = (unsigned short)WideCharToMultiByte(CP_UTF8,
WC_ERR_INVALID_CHARS|WC_NO_BEST_FIT_CHARS,
rename->FileName, rename->FileNameLength/sizeof(WCHAR),
---- snip ----
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) [email protected]
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
--
Problem reports: https://cygwin.com/problems.html
FAQ: https://cygwin.com/faq/
Documentation: https://cygwin.com/docs.html
Unsubscribe info: https://cygwin.com/ml/#unsubscribe-simple