execvp() [01] is powerful:

- it performs PATH search [02] if necessary,

- it falls back to executing the file with the shell as a shell script in
  case the underlying execv() or execve() fails with ENOEXEC.

However, execvp() is not async-signal-safe [03], and so we shouldn't call
it in a child process forked from a multi-threaded parent process [04],
which the libnbd client application may well be.

Introduce three APIs for safely emulating execvp() with execve():

- nbd_internal_execvpe_init(): to be called in the parent process, before
  fork(). Allocate and format candidate pathnames for execution with
  execve(), based on PATH. Allocate a buffer for the longer command line
  that the shell script fallback needs.

- nbd_internal_fork_safe_execvpe(): to be called in the child process. Try
  the candidates formatted previously in sequence with execve(), as long
  as execve() fails with errors related to pathname resolution /
  inode-level file type / execution permission. Stop iterating on fatal
  errors; if the fatal error is ENOEXEC, activate the shell script
  fallback. As a small added feature, take the environment from an
  explicit parameter (effectively imitating the GNU-specific execvpe()
  function -- cf. commit dc64ac5cdd0b, "states: Use execvp instead of
  execvpe", 2019-11-15).

- nbd_internal_execvpe_uninit(): to be called in the parent process, after
  fork(). Release the resources allocated with
  nbd_internal_execvpe_init().

The main idea with the above distribution of responsibilities is that any
pre-scanning of PATH -- which was suggested by Eric Blake -- should not
include activities that are performed by execve() anyway. I've considered
and abandoned two approaches:

- During scanning (i.e., in nbd_internal_execvpe_init()), check each
  candidate with access(X_OK) [05].

  I dropped this because the informative APPLICATION USAGE section of the
  same specification [06] advises against it, saying "[a]n application
  should instead attempt the action itself and handle the [EACCES] error
  that occurs if the file is not accessible".

- During scanning, open each candidate with O_EXEC [07] until one open()
  succeeds. Save the sole file descriptor for
  nbd_internal_fork_safe_execvpe(). In the latter, call fexecve() [08],
  which is free of all error conditions that necessitate iteration over
  candidates. Still implement the ENOEXEC (shell script) fallback.

  I dropped this because (a) fexecve() is a quite recent interface
  (highlighted by Eric); (b) for an application to open a file for
  execution with fexecve(), it's more robust to pass O_EXEC to open() than
  not to pass it, per APPLICATION USAGE [09], but on Linux/glibc, O_EXEC
  does not seem supported, only O_PATH does [10].

Thus the chosen approach -- pre-generate filenames -- contains a  small
TOCTTOU race (highlighted by Eric) after all, but it should be harmless.

Implementation-defined details:

- If PATH searching is necessary, but PATH is absent [01], then fall back
  to confstr(_CS_PATH) [11], similarly to Linux/glibc execvpe() [12].

  If PATH is set but empty ("set to null") [02], or PATH is unset and
  confstr(_CS_PATH) fails or returns no information or returns the empty
  string, we fail the initial scanning (!) with ENOENT. This is consistent
  with bash's behavior on Linux/glibc:

  $ PATH= ls
  bash: ls: No such file or directory

Details chosen for unspecified behavior:

- Hardwire "/bin/sh" as the shell's pathname [01].

[01] https://pubs.opengroup.org/onlinepubs/9699919799/functions/execvp.html
[02] 
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
[03] 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03
[04] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html
[05] https://pubs.opengroup.org/onlinepubs/9699919799/functions/access.html
[06] 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/access.html#tag_16_09_07
[07] https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html
[08] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fexecve.html
[09] 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/fexecve.html#tag_16_111_07
[10] https://man7.org/linux/man-pages/man2/open.2.html
[11] https://pubs.opengroup.org/onlinepubs/9699919799/functions/confstr.html
[12] https://man7.org/linux/man-pages/man3/execvpe.3.html

Suggested-by: Eric Blake <ebl...@redhat.com>
Signed-off-by: Laszlo Ersek <ler...@redhat.com>
---
 lib/internal.h |  22 ++
 lib/utils.c    | 351 ++++++++++++++++++++
 2 files changed, 373 insertions(+)

diff --git a/lib/internal.h b/lib/internal.h
index 9c9021ed366e..8c4e32013e40 100644
--- a/lib/internal.h
+++ b/lib/internal.h
@@ -368,6 +368,18 @@ struct command {
   uint32_t error; /* Local errno value */
 };
 
+struct execvpe {
+  string_vector pathnames;
+
+  /* Note: "const_string_vector" is not a good type for "sh_argv" below. Even 
if
+   * we reserved enough space in a "const_string_vector",
+   * const_string_vector_append() would still not be async-signal-safe, due to
+   * the underlying const_string_vector_insert() calling assert().
+   */
+  char **sh_argv;
+  size_t num_sh_args;
+};
+
 /* Test if a callback is "null" or not, and set it to null. */
 #define CALLBACK_IS_NULL(cb)     ((cb).callback == NULL && (cb).free == NULL)
 #define CALLBACK_IS_NOT_NULL(cb) (! CALLBACK_IS_NULL ((cb)))
@@ -540,4 +552,14 @@ extern void nbd_internal_fork_safe_assert (int result, 
const char *file,
                                   __func__, #expression))
 #endif
 
+extern int nbd_internal_execvpe_init (struct execvpe *ctx, const char *file,
+                                      size_t num_args)
+  LIBNBD_ATTRIBUTE_NONNULL (1, 2);
+extern void nbd_internal_execvpe_uninit (struct execvpe *ctx)
+  LIBNBD_ATTRIBUTE_NONNULL (1);
+extern int nbd_internal_fork_safe_execvpe (struct execvpe *ctx,
+                                           const string_vector *argv,
+                                           char * const *envp)
+  LIBNBD_ATTRIBUTE_NONNULL (1, 2, 3);
+
 #endif /* LIBNBD_INTERNAL_H */
diff --git a/lib/utils.c b/lib/utils.c
index 5dd00f82e073..9a96ed6ed674 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -28,6 +28,7 @@
 #include <limits.h>
 
 #include "minmax.h"
+#include "checked-overflow.h"
 
 #include "internal.h"
 
@@ -463,3 +464,353 @@ nbd_internal_fork_safe_assert (int result, const char 
*file, long line,
   xwrite (STDERR_FILENO, "' failed.\n", 10);
   abort ();
 }
+
+/* Returns the value of the PATH environment variable -- falling back to
+ * confstr(_CS_PATH) if PATH is absent -- as a dynamically allocated string. On
+ * failure, sets "errno" and returns NULL.
+ */
+static char *
+get_path (void)
+  LIBNBD_ATTRIBUTE_ALLOC_DEALLOC (free)
+{
+  char *path;
+  bool env_path_found;
+  size_t path_size, path_size2;
+
+  /* FIXME: lock the environment. */
+  path = getenv ("PATH");
+  if ((env_path_found = (path != NULL)))
+    path = strdup (path);
+  /* FIXME: unlock the environment. */
+
+  if (env_path_found) {
+    /* This handles out-of-memory as well. */
+    return path;
+  }
+
+  errno = 0;
+  path_size = confstr (_CS_PATH, NULL, 0);
+  if (path_size == 0) {
+    /* If _CS_PATH does not have a configuration-defined value, just store
+     * ENOENT to "errno".
+     */
+    if (errno == 0)
+      errno = ENOENT;
+
+    return NULL;
+  }
+
+  path = malloc (path_size);
+  if (path == NULL)
+    return NULL;
+
+  path_size2 = confstr (_CS_PATH, path, path_size);
+  assert (path_size2 == path_size);
+  return path;
+}
+
+/* nbd_internal_execvpe_init() and nbd_internal_fork_safe_execvpe() together
+ * present an execvp() alternative that is async-signal-safe.
+ *
+ * nbd_internal_execvpe_init() may only be called before fork(), for filling in
+ * the caller-allocated, uninitialized "ctx" structure. If
+ * nbd_internal_execvpe_init() succeeds, then fork() may be called.
+ * Subsequently, in the child process, nbd_internal_fork_safe_execvpe() may be
+ * called with the inherited "ctx" structure, while in the parent process,
+ * nbd_internal_execvpe_uninit() must be called to uninitialize (evacuate) the
+ * "ctx" structure.
+ *
+ * On failure, "ctx" will not have been modified, "errno" is set, and -1 is
+ * returned. Failures include:
+ *
+ * - Errors forwarded from underlying functions such as strdup(), confstr(),
+ *   malloc(), string_vector_append().
+ *
+ * - ENOENT: "file" is an empty string.
+ *
+ * - ENOENT: "file" does not contain a <slash> "/" character, the PATH
+ *           environment variable is not set, and confstr() doesn't associate a
+ *           configuration-defined value with _CS_PATH.
+ *
+ * - ENOENT: "file" does not contain a <slash> "/" character, and: (a) the PATH
+ *           environment variable is set to the empty string, or (b) PATH is 
not
+ *           set, and confstr() outputs the empty string for _CS_PATH.
+ *
+ * - EOVERFLOW: the sizes or counts of necessary objects could not be 
expressed.
+ *
+ * - EINVAL: "num_args" is less than 2.
+ *
+ * On success, the "ctx" structure will have been filled in, and 0 is returned.
+ *
+ * - "pathnames" member:
+ *
+ *   - All strings pointed-to by elements of the "pathnames" string_vector
+ *     member are owned by "pathnames".
+ *
+ *   - If "file" contains a <slash> "/" character, then the sole entry in
+ *     "pathnames" is a copy of "file".
+ *
+ *   - If "file" does not contain a <slash> "/" character:
+ *
+ *     Let "system path" be defined as the value of the PATH environment
+ *     variable, if the latter exists, and as the value output by confstr() for
+ *     _CS_PATH otherwise. Per the ENOENT specifications above, "system path" 
is
+ *     a non-empty string. Let "system path" further be of the form
+ *
+ *       <prefix_0> [n = 1]
+ *
+ *     or
+ *
+ *       <prefix_0>:<prefix_1>:...:<prefix_(n-1)> [n >= 2]
+ *
+ *     where for each 0 <= i < n, <prefix_i> does not contain the <colon> ":"
+ *     character. In the (n = 1) case, <prefix_0> is never empty (see ENOENT
+ *     above), while in the (n >= 2) case, any individual <prefix_i> may or may
+ *     not be empty.
+ *
+ *     The "pathnames" string_vector member has n elements; for each 0 <= i < 
n,
+ *     element i is of the form
+ *
+ *       suffix(curdir(<prefix_i>))file
+ *
+ *     where
+ *
+ *       curdir(x) := "."  if x = ""
+ *       curdir(x) := x    otherwise
+ *
+ *     and
+ *
+ *       suffix(x) := x      if "x" ends with a <slash> "/"
+ *       suffix(x) := x "/"  otherwise
+ *
+ *     This transformation implements the POSIX XBD / PATH environment variable
+ *     semantics, creating candidate pathnames for execution by
+ *     nbd_internal_fork_safe_execvpe(). nbd_internal_fork_safe_execvpe() will
+ *     iterate over the candidate pathnames with execve() until execve()
+ *     succeeds, or fails with an error that is due to neither pathname
+ *     resolution, nor the candidate not being a regular file, nor the 
candidate
+ *     lacking execution permission.
+ *
+ * - The "sh_argv" array member will have at least (num_args + 1) elements
+ *   allocated, and none populated.
+ *
+ *   (The minimum value of "num_args" is 2 -- see EINVAL above. According to
+ *   POSIX, "[t]he argument /arg0/ should point to a filename string that is
+ *   associated with the process being started by one of the /exec/ functions",
+ *   plus "num_args" includes the null pointer that terminates the argument
+ *   list.)
+ *
+ *   This allocation is made in anticipation of execve() failing for a
+ *   particular candidate inside nbd_internal_fork_safe_execvpe() with ENOEXEC
+ *   ("[t]he new process image file has the appropriate access permission but
+ *   has an unrecognized format"). While that failure terminates the iteration,
+ *   the failed call
+ *
+ *     execve (pathnames[i],
+ *             { argv[0], argv[1], ..., NULL }, // (num_args >= 2) elements
+ *             { envp[0], envp[1], ..., NULL })
+ *
+ *   must be repeated as
+ *
+ *     execve (<shell-path>,
+ *             { argv[0], pathnames[i],         // ((num_args + 1) >= 3)
+                 argv[1], ..., NULL },          // elements
+ *             { envp[0], envp[1], ..., NULL })
+ *
+ *   for emulating execvp(). The allocation in the "sh_argv" member makes it
+ *   possible just to *link* the original "argv" elements and the 
"pathnames[i]"
+ *   candidate into the right positions.
+ *
+ *   (POSIX leaves the shell pathname unspecified; "/bin/sh" should be good
+ *   enough.)
+ *
+ *   The shell *binary* will see itself being executed under the name 
"argv[0]",
+ *   will receive "pathnames[i]" as the pathname of the shell *script* to read
+ *   and interpret ("command_file" in POSIX terminology), will expose
+ *   "pathnames[i]" as the positional parameter $0 to the script, and will
+ *   forward "argv[1]" and the rest to the script as positional parameters $1
+ *   and onward.
+ */
+int
+nbd_internal_execvpe_init (struct execvpe *ctx, const char *file,
+                           size_t num_args)
+{
+  int rc;
+  char *sys_path;
+  string_vector pathnames;
+  char *pathname;
+  size_t num_sh_args;
+  char **sh_argv;
+  size_t sh_argv_bytes;
+
+  rc = -1;
+
+  if (file[0] == '\0') {
+    errno = ENOENT;
+    return rc;
+  }
+
+  /* First phase. */
+  sys_path = NULL;
+  pathnames = (string_vector)empty_vector;
+
+  if (strchr (file, '/') == NULL) {
+    size_t file_len;
+    const char *sys_path_element, *scan;
+    bool finish;
+
+    sys_path = get_path ();
+    if (sys_path == NULL)
+      return rc;
+
+    if (sys_path[0] == '\0') {
+      errno = ENOENT;
+      goto free_sys_path;
+    }
+
+    pathname = NULL;
+    file_len = strlen (file);
+    sys_path_element = sys_path;
+    scan = sys_path;
+    do {
+      assert (sys_path_element <= scan);
+      finish = (*scan == '\0');
+      if (finish || *scan == ':') {
+        const char *sys_path_copy_start;
+        size_t sys_path_copy_size;
+        size_t sep_copy_size;
+        size_t pathname_size;
+        char *p;
+
+        if (scan == sys_path_element) {
+          sys_path_copy_start = ".";
+          sys_path_copy_size = 1;
+        } else {
+          sys_path_copy_start = sys_path_element;
+          sys_path_copy_size = scan - sys_path_element;
+        }
+
+        assert (sys_path_copy_size >= 1);
+        sep_copy_size = (sys_path_copy_start[sys_path_copy_size - 1] != '/');
+
+        if (ADD_OVERFLOW (sys_path_copy_size, sep_copy_size, &pathname_size) ||
+            ADD_OVERFLOW (pathname_size, file_len, &pathname_size) ||
+            ADD_OVERFLOW (pathname_size, 1u, &pathname_size)) {
+          errno = EOVERFLOW;
+          goto empty_pathnames;
+        }
+
+        pathname = malloc (pathname_size);
+        if (pathname == NULL)
+          goto empty_pathnames;
+        p = pathname;
+
+        memcpy (p, sys_path_copy_start, sys_path_copy_size);
+        p += sys_path_copy_size;
+
+        memcpy (p, "/", sep_copy_size);
+        p += sep_copy_size;
+
+        memcpy (p, file, file_len);
+        p += file_len;
+
+        *p++ = '\0';
+
+        if (string_vector_append (&pathnames, pathname) == -1)
+          goto empty_pathnames;
+        /* Ownership transferred. */
+        pathname = NULL;
+
+        sys_path_element = scan + 1;
+      }
+
+      ++scan;
+    } while (!finish);
+  } else {
+    pathname = strdup (file);
+    if (pathname == NULL)
+      return rc;
+
+    if (string_vector_append (&pathnames, pathname) == -1)
+      goto empty_pathnames;
+    /* Ownership transferred. */
+    pathname = NULL;
+  }
+
+  /* Second phase. */
+  if (num_args < 2) {
+    errno = EINVAL;
+    goto empty_pathnames;
+  }
+  if (ADD_OVERFLOW (num_args, 1u, &num_sh_args) ||
+      MUL_OVERFLOW (num_sh_args, sizeof *sh_argv, &sh_argv_bytes)) {
+    errno = EOVERFLOW;
+    goto empty_pathnames;
+  }
+  sh_argv = malloc (sh_argv_bytes);
+  if (sh_argv == NULL)
+    goto empty_pathnames;
+
+  /* Commit. */
+  ctx->pathnames = pathnames;
+  ctx->sh_argv = sh_argv;
+  ctx->num_sh_args = num_sh_args;
+  rc = 0;
+  /* Fall through, for freeing temporaries. */
+
+empty_pathnames:
+  if (rc == -1) {
+    free (pathname);
+    string_vector_empty (&pathnames);
+  }
+
+free_sys_path:
+  free (sys_path);
+
+  return rc;
+}
+
+void
+nbd_internal_execvpe_uninit (struct execvpe *ctx)
+{
+  free (ctx->sh_argv);
+  ctx->num_sh_args = 0;
+  string_vector_empty (&ctx->pathnames);
+}
+
+int
+nbd_internal_fork_safe_execvpe (struct execvpe *ctx, const string_vector *argv,
+                                char * const *envp)
+{
+  size_t pathname_idx;
+
+  NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->pathnames.len > 0);
+
+  pathname_idx = 0;
+  do {
+    (void)execve (ctx->pathnames.ptr[pathname_idx], argv->ptr, envp);
+    if (errno != EACCES && errno != ELOOP && errno != ENAMETOOLONG &&
+        errno != ENOENT && errno != ENOTDIR)
+      break;
+
+    ++pathname_idx;
+  } while (pathname_idx < ctx->pathnames.len);
+
+  if (errno == ENOEXEC) {
+    char **sh_argp;
+    size_t argv_idx;
+
+    NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->num_sh_args >= argv->len);
+    NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->num_sh_args - argv->len == 1);
+
+    sh_argp = ctx->sh_argv;
+    *sh_argp++ = argv->ptr[0];
+    *sh_argp++ = ctx->pathnames.ptr[pathname_idx];
+    for (argv_idx = 1; argv_idx < argv->len; ++argv_idx)
+      *sh_argp++ = argv->ptr[argv_idx];
+
+    (void)execve ("/bin/sh", ctx->sh_argv, envp);
+  }
+
+  return -1;
+}

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Reply via email to