Thanks for the feedback, Luis... On Thu, 18 Mar 2021 14:57:58 +0000, Luis Henriques wrote:
> On Fri, Mar 12, 2021 at 02:30:15AM +0100, David Disseldorp wrote: > > From: Luis Henriques <[email protected]> > > I guess you can drop the above as you've added quite a bit of work on top > of my original patch (which was just an RFC) ;-) It's still mostly your code, so I'd like to keep it as-is... > > This helper function uses the copy_file_range(2) Linux syscall for > > in-kernel transfer between two file descriptors. Copy-on-write > > enabled filesystems may optimize I/O by using reflinks. > > > > Signed-off-by: David Disseldorp <[email protected]> > > --- > > configure.ac | 6 ++++++ > > src/extern.h | 4 ++++ > > src/util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 60 insertions(+) > > > > diff --git a/configure.ac b/configure.ac > > index 0a35e4c..ed6e3d2 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -51,6 +51,12 @@ AC_COMPILE_CHECK_RETTYPE([minor], [0]) > > AC_CHECK_FUNCS([fchmod fchown]) > > # This is needed for mingw build > > AC_CHECK_FUNCS([setmode getpwuid getpwnam getgrgid getgrnam pipe fork > > getuid geteuid]) > > +AC_CHECK_FUNC([copy_file_range], > > + [AC_CHECK_SIZEOF(off_t) > > + AC_CHECK_SIZEOF(loff_t) > > + if (test "$ac_cv_sizeof_off_t" -eq "$ac_cv_sizeof_loff_t"); > > then > > + AC_DEFINE(HAVE_CPIO_REFLINK, [1], [cpio --reflink > > support]) > > + fi]) > > > > # gnulib modules > > gl_INIT > > diff --git a/src/extern.h b/src/extern.h > > index 3b59053..6be95cf 100644 > > --- a/src/extern.h > > +++ b/src/extern.h > > @@ -171,6 +171,10 @@ void tape_toss_input (int in_des, off_t num_bytes); > > void copy_files_tape_to_disk (int in_des, int out_des, off_t num_bytes); > > void copy_files_disk_to_tape (int in_des, int out_des, off_t num_bytes, > > char *filename); > > void copy_files_disk_to_disk (int in_des, int out_des, off_t num_bytes, > > char *filename); > > +#ifdef HAVE_CPIO_REFLINK > > +ssize_t copy_files_range (int in_des, int out_des, off_t num_bytes); > > +#endif > > + > > void warn_if_file_changed (char *file_name, off_t old_file_size, > > time_t old_file_mtime); > > void create_all_directories (char const *name); > > diff --git a/src/util.c b/src/util.c > > index d81ef30..4c0d75f 100644 > > --- a/src/util.c > > +++ b/src/util.c > > @@ -516,6 +516,56 @@ copy_files_disk_to_tape (int in_des, int out_des, > > off_t num_bytes, > > in_buff += size; > > } > > } > > + > > +#ifdef HAVE_CPIO_REFLINK > > +/* > > + * Use the copy_file_range(2) syscall for in-kernel transfer between two > > file > > + * descriptors. Copy-on-write enabled filesystems may optimize I/O by using > > + * reflinks. > > + */ > > +ssize_t > > +copy_files_range (int in_des, int out_des, off_t num_bytes) > > +{ > > + loff_t in_off = 0, out_off = 0; > > + ssize_t total = 0; > > + ssize_t ret; > > + > > + in_off = lseek (in_des, 0, SEEK_CUR); > > + if (in_off < 0) > > + return -errno; > > + > > + out_off = lseek (out_des, 0, SEEK_CUR); > > + if (out_off < 0) > > + return -errno; > > + > > As I've already told you somewhere else, copy_file_range (crf for short!) > implementations may expand any existing holes in the original file. > That's OK, I guess, but it may be worth adding a comment here and possibly > on the cpio(1) manpage, when documenting the --reflink option. > > Another option would be to add a bit more complexity here by calling crf > in a loop and lseek'ing SEEK_DATA/SEEK_HOLE. That's what my original RFC > patch was doing, but I agree that it's probably not worth it. > > My suggestion is to, at least, add here a small comment mentioning this > decision and referring to the cfr manpage for the details. It's still a little unclear to me under what condition hole expansion can occur (splice only, or also in some reflink cases, etc.). I'll add back the SEEK_DATA / SEEK_HOLE logic that you had, as it should at least speed up processing of Dracut padding entries. Cheers, David
