Thanks for the feedback, Luis...

On Thu, 18 Mar 2021 14:57:58 +0000, Luis Henriques wrote:

> On Fri, Mar 12, 2021 at 02:30:15AM +0100, David Disseldorp wrote:
> > From: Luis Henriques <[email protected]>  
> 
> I guess you can drop the above as you've added quite a bit of work on top
> of my original patch (which was just an RFC) ;-)

It's still mostly your code, so I'd like to keep it as-is...

> > This helper function uses the copy_file_range(2) Linux syscall for
> > in-kernel transfer between two file descriptors. Copy-on-write
> > enabled filesystems may optimize I/O by using reflinks.
> > 
> > Signed-off-by: David Disseldorp <[email protected]>
> > ---
> >  configure.ac |  6 ++++++
> >  src/extern.h |  4 ++++
> >  src/util.c   | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 60 insertions(+)
> > 
> > diff --git a/configure.ac b/configure.ac
> > index 0a35e4c..ed6e3d2 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -51,6 +51,12 @@ AC_COMPILE_CHECK_RETTYPE([minor], [0])
> >  AC_CHECK_FUNCS([fchmod fchown])
> >  # This is needed for mingw build
> >  AC_CHECK_FUNCS([setmode getpwuid getpwnam getgrgid getgrnam pipe fork 
> > getuid geteuid])
> > +AC_CHECK_FUNC([copy_file_range],
> > +              [AC_CHECK_SIZEOF(off_t)
> > +               AC_CHECK_SIZEOF(loff_t)
> > +               if (test "$ac_cv_sizeof_off_t" -eq "$ac_cv_sizeof_loff_t"); 
> > then
> > +                   AC_DEFINE(HAVE_CPIO_REFLINK, [1], [cpio --reflink 
> > support])
> > +               fi])
> >  
> >  # gnulib modules
> >  gl_INIT
> > diff --git a/src/extern.h b/src/extern.h
> > index 3b59053..6be95cf 100644
> > --- a/src/extern.h
> > +++ b/src/extern.h
> > @@ -171,6 +171,10 @@ void tape_toss_input (int in_des, off_t num_bytes);
> >  void copy_files_tape_to_disk (int in_des, int out_des, off_t num_bytes);
> >  void copy_files_disk_to_tape (int in_des, int out_des, off_t num_bytes, 
> > char *filename);
> >  void copy_files_disk_to_disk (int in_des, int out_des, off_t num_bytes, 
> > char *filename);
> > +#ifdef HAVE_CPIO_REFLINK
> > +ssize_t copy_files_range (int in_des, int out_des, off_t num_bytes);
> > +#endif
> > +
> >  void warn_if_file_changed (char *file_name, off_t old_file_size,
> >                             time_t old_file_mtime);
> >  void create_all_directories (char const *name);
> > diff --git a/src/util.c b/src/util.c
> > index d81ef30..4c0d75f 100644
> > --- a/src/util.c
> > +++ b/src/util.c
> > @@ -516,6 +516,56 @@ copy_files_disk_to_tape (int in_des, int out_des, 
> > off_t num_bytes,
> >        in_buff += size;
> >      }
> >  }
> > +
> > +#ifdef HAVE_CPIO_REFLINK
> > +/*
> > + * Use the copy_file_range(2) syscall for in-kernel transfer between two 
> > file
> > + * descriptors. Copy-on-write enabled filesystems may optimize I/O by using
> > + * reflinks.
> > + */
> > +ssize_t
> > +copy_files_range (int in_des, int out_des, off_t num_bytes)
> > +{
> > +  loff_t in_off = 0, out_off = 0;
> > +  ssize_t total = 0;
> > +  ssize_t ret;
> > +
> > +  in_off = lseek (in_des, 0, SEEK_CUR);
> > +  if (in_off < 0)
> > +    return -errno;
> > +
> > +  out_off = lseek (out_des, 0, SEEK_CUR);
> > +  if (out_off < 0)
> > +    return -errno;
> > +  
> 
> As I've already told you somewhere else, copy_file_range (crf for short!)
> implementations may expand any existing holes in the original file.
> That's OK, I guess, but it may be worth adding a comment here and possibly
> on the cpio(1) manpage, when documenting the --reflink option.
> 
> Another option would be to add a bit more complexity here by calling crf
> in a loop and lseek'ing SEEK_DATA/SEEK_HOLE.  That's what my original RFC
> patch was doing, but I agree that it's probably not worth it.
> 
> My suggestion is to, at least, add here a small comment mentioning this
> decision and referring to the cfr manpage for the details.

It's still a little unclear to me under what condition hole expansion
can occur (splice only, or also in some reflink cases, etc.). I'll add
back the SEEK_DATA / SEEK_HOLE logic that you had, as it should at least
speed up processing of Dracut padding entries.

Cheers, David

Reply via email to