execvp() [01] is powerful: - it performs PATH search [02] if necessary,
- it falls back to executing the file with the shell as a shell script in case the underlying execv() or execve() fails with ENOEXEC. However, execvp() is not async-signal-safe [03], and so we shouldn't call it in a child process forked from a multi-threaded parent process [04], which the libnbd client application may well be. Introduce three APIs for safely emulating execvp() with execve(): - nbd_internal_execvpe_init(): to be called in the parent process, before fork(). Allocate and format candidate pathnames for execution with execve(), based on PATH. Allocate a buffer for the longer command line that the shell script fallback needs. - nbd_internal_fork_safe_execvpe(): to be called in the child process. Try the candidates formatted previously in sequence with execve(), as long as execve() fails with errors related to pathname resolution / inode-level file type / execution permission. Stop iterating on fatal errors; if the fatal error is ENOEXEC, activate the shell script fallback. As a small added feature, take the environment from an explicit parameter (effectively imitating the GNU-specific execvpe() function -- cf. commit dc64ac5cdd0b, "states: Use execvp instead of execvpe", 2019-11-15). - nbd_internal_execvpe_uninit(): to be called in the parent process, after fork(). Release the resources allocated with nbd_internal_execvpe_init(). The main idea with the above distribution of responsibilities is that any pre-scanning of PATH -- which was suggested by Eric Blake -- should not include activities that are performed by execve() anyway. I've considered and abandoned two approaches: - During scanning (i.e., in nbd_internal_execvpe_init()), check each candidate with access(X_OK) [05]. I dropped this because the informative APPLICATION USAGE section of the same specification [06] advises against it, saying "[a]n application should instead attempt the action itself and handle the [EACCES] error that occurs if the file is not accessible". - During scanning, open each candidate with O_EXEC [07] until one open() succeeds. Save the sole file descriptor for nbd_internal_fork_safe_execvpe(). In the latter, call fexecve() [08], which is free of all error conditions that necessitate iteration over candidates. Still implement the ENOEXEC (shell script) fallback. I dropped this because (a) fexecve() is a quite recent interface (highlighted by Eric); (b) for an application to open a file for execution with fexecve(), it's more robust to pass O_EXEC to open() than not to pass it, per APPLICATION USAGE [09], but on Linux/glibc, O_EXEC does not seem supported, only O_PATH does [10]. Thus the chosen approach -- pre-generate filenames -- contains a small TOCTTOU race (highlighted by Eric) after all, but it should be harmless. Implementation-defined details: - If PATH searching is necessary, but PATH is absent [01], then fall back to confstr(_CS_PATH) [11], similarly to Linux/glibc execvpe() [12]. If PATH is set but empty ("set to null") [02], or PATH is unset and confstr(_CS_PATH) fails or returns no information or returns the empty string, we fail the initial scanning (!) with ENOENT. This is consistent with bash's behavior on Linux/glibc: $ PATH= ls bash: ls: No such file or directory Details chosen for unspecified behavior: - Hardwire "/bin/sh" as the shell's pathname [01]. [01] https://pubs.opengroup.org/onlinepubs/9699919799/functions/execvp.html [02] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 [03] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03 [04] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html [05] https://pubs.opengroup.org/onlinepubs/9699919799/functions/access.html [06] https://pubs.opengroup.org/onlinepubs/9699919799/functions/access.html#tag_16_09_07 [07] https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html [08] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fexecve.html [09] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fexecve.html#tag_16_111_07 [10] https://man7.org/linux/man-pages/man2/open.2.html [11] https://pubs.opengroup.org/onlinepubs/9699919799/functions/confstr.html [12] https://man7.org/linux/man-pages/man3/execvpe.3.html Suggested-by: Eric Blake <ebl...@redhat.com> Signed-off-by: Laszlo Ersek <ler...@redhat.com> --- lib/internal.h | 22 ++ lib/utils.c | 351 ++++++++++++++++++++ 2 files changed, 373 insertions(+) diff --git a/lib/internal.h b/lib/internal.h index 9c9021ed366e..8c4e32013e40 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -368,6 +368,18 @@ struct command { uint32_t error; /* Local errno value */ }; +struct execvpe { + string_vector pathnames; + + /* Note: "const_string_vector" is not a good type for "sh_argv" below. Even if + * we reserved enough space in a "const_string_vector", + * const_string_vector_append() would still not be async-signal-safe, due to + * the underlying const_string_vector_insert() calling assert(). + */ + char **sh_argv; + size_t num_sh_args; +}; + /* Test if a callback is "null" or not, and set it to null. */ #define CALLBACK_IS_NULL(cb) ((cb).callback == NULL && (cb).free == NULL) #define CALLBACK_IS_NOT_NULL(cb) (! CALLBACK_IS_NULL ((cb))) @@ -540,4 +552,14 @@ extern void nbd_internal_fork_safe_assert (int result, const char *file, __func__, #expression)) #endif +extern int nbd_internal_execvpe_init (struct execvpe *ctx, const char *file, + size_t num_args) + LIBNBD_ATTRIBUTE_NONNULL (1, 2); +extern void nbd_internal_execvpe_uninit (struct execvpe *ctx) + LIBNBD_ATTRIBUTE_NONNULL (1); +extern int nbd_internal_fork_safe_execvpe (struct execvpe *ctx, + const string_vector *argv, + char * const *envp) + LIBNBD_ATTRIBUTE_NONNULL (1, 2, 3); + #endif /* LIBNBD_INTERNAL_H */ diff --git a/lib/utils.c b/lib/utils.c index 5dd00f82e073..9a96ed6ed674 100644 --- a/lib/utils.c +++ b/lib/utils.c @@ -28,6 +28,7 @@ #include <limits.h> #include "minmax.h" +#include "checked-overflow.h" #include "internal.h" @@ -463,3 +464,353 @@ nbd_internal_fork_safe_assert (int result, const char *file, long line, xwrite (STDERR_FILENO, "' failed.\n", 10); abort (); } + +/* Returns the value of the PATH environment variable -- falling back to + * confstr(_CS_PATH) if PATH is absent -- as a dynamically allocated string. On + * failure, sets "errno" and returns NULL. + */ +static char * +get_path (void) + LIBNBD_ATTRIBUTE_ALLOC_DEALLOC (free) +{ + char *path; + bool env_path_found; + size_t path_size, path_size2; + + /* FIXME: lock the environment. */ + path = getenv ("PATH"); + if ((env_path_found = (path != NULL))) + path = strdup (path); + /* FIXME: unlock the environment. */ + + if (env_path_found) { + /* This handles out-of-memory as well. */ + return path; + } + + errno = 0; + path_size = confstr (_CS_PATH, NULL, 0); + if (path_size == 0) { + /* If _CS_PATH does not have a configuration-defined value, just store + * ENOENT to "errno". + */ + if (errno == 0) + errno = ENOENT; + + return NULL; + } + + path = malloc (path_size); + if (path == NULL) + return NULL; + + path_size2 = confstr (_CS_PATH, path, path_size); + assert (path_size2 == path_size); + return path; +} + +/* nbd_internal_execvpe_init() and nbd_internal_fork_safe_execvpe() together + * present an execvp() alternative that is async-signal-safe. + * + * nbd_internal_execvpe_init() may only be called before fork(), for filling in + * the caller-allocated, uninitialized "ctx" structure. If + * nbd_internal_execvpe_init() succeeds, then fork() may be called. + * Subsequently, in the child process, nbd_internal_fork_safe_execvpe() may be + * called with the inherited "ctx" structure, while in the parent process, + * nbd_internal_execvpe_uninit() must be called to uninitialize (evacuate) the + * "ctx" structure. + * + * On failure, "ctx" will not have been modified, "errno" is set, and -1 is + * returned. Failures include: + * + * - Errors forwarded from underlying functions such as strdup(), confstr(), + * malloc(), string_vector_append(). + * + * - ENOENT: "file" is an empty string. + * + * - ENOENT: "file" does not contain a <slash> "/" character, the PATH + * environment variable is not set, and confstr() doesn't associate a + * configuration-defined value with _CS_PATH. + * + * - ENOENT: "file" does not contain a <slash> "/" character, and: (a) the PATH + * environment variable is set to the empty string, or (b) PATH is not + * set, and confstr() outputs the empty string for _CS_PATH. + * + * - EOVERFLOW: the sizes or counts of necessary objects could not be expressed. + * + * - EINVAL: "num_args" is less than 2. + * + * On success, the "ctx" structure will have been filled in, and 0 is returned. + * + * - "pathnames" member: + * + * - All strings pointed-to by elements of the "pathnames" string_vector + * member are owned by "pathnames". + * + * - If "file" contains a <slash> "/" character, then the sole entry in + * "pathnames" is a copy of "file". + * + * - If "file" does not contain a <slash> "/" character: + * + * Let "system path" be defined as the value of the PATH environment + * variable, if the latter exists, and as the value output by confstr() for + * _CS_PATH otherwise. Per the ENOENT specifications above, "system path" is + * a non-empty string. Let "system path" further be of the form + * + * <prefix_0> [n = 1] + * + * or + * + * <prefix_0>:<prefix_1>:...:<prefix_(n-1)> [n >= 2] + * + * where for each 0 <= i < n, <prefix_i> does not contain the <colon> ":" + * character. In the (n = 1) case, <prefix_0> is never empty (see ENOENT + * above), while in the (n >= 2) case, any individual <prefix_i> may or may + * not be empty. + * + * The "pathnames" string_vector member has n elements; for each 0 <= i < n, + * element i is of the form + * + * suffix(curdir(<prefix_i>))file + * + * where + * + * curdir(x) := "." if x = "" + * curdir(x) := x otherwise + * + * and + * + * suffix(x) := x if "x" ends with a <slash> "/" + * suffix(x) := x "/" otherwise + * + * This transformation implements the POSIX XBD / PATH environment variable + * semantics, creating candidate pathnames for execution by + * nbd_internal_fork_safe_execvpe(). nbd_internal_fork_safe_execvpe() will + * iterate over the candidate pathnames with execve() until execve() + * succeeds, or fails with an error that is due to neither pathname + * resolution, nor the candidate not being a regular file, nor the candidate + * lacking execution permission. + * + * - The "sh_argv" array member will have at least (num_args + 1) elements + * allocated, and none populated. + * + * (The minimum value of "num_args" is 2 -- see EINVAL above. According to + * POSIX, "[t]he argument /arg0/ should point to a filename string that is + * associated with the process being started by one of the /exec/ functions", + * plus "num_args" includes the null pointer that terminates the argument + * list.) + * + * This allocation is made in anticipation of execve() failing for a + * particular candidate inside nbd_internal_fork_safe_execvpe() with ENOEXEC + * ("[t]he new process image file has the appropriate access permission but + * has an unrecognized format"). While that failure terminates the iteration, + * the failed call + * + * execve (pathnames[i], + * { argv[0], argv[1], ..., NULL }, // (num_args >= 2) elements + * { envp[0], envp[1], ..., NULL }) + * + * must be repeated as + * + * execve (<shell-path>, + * { argv[0], pathnames[i], // ((num_args + 1) >= 3) + argv[1], ..., NULL }, // elements + * { envp[0], envp[1], ..., NULL }) + * + * for emulating execvp(). The allocation in the "sh_argv" member makes it + * possible just to *link* the original "argv" elements and the "pathnames[i]" + * candidate into the right positions. + * + * (POSIX leaves the shell pathname unspecified; "/bin/sh" should be good + * enough.) + * + * The shell *binary* will see itself being executed under the name "argv[0]", + * will receive "pathnames[i]" as the pathname of the shell *script* to read + * and interpret ("command_file" in POSIX terminology), will expose + * "pathnames[i]" as the positional parameter $0 to the script, and will + * forward "argv[1]" and the rest to the script as positional parameters $1 + * and onward. + */ +int +nbd_internal_execvpe_init (struct execvpe *ctx, const char *file, + size_t num_args) +{ + int rc; + char *sys_path; + string_vector pathnames; + char *pathname; + size_t num_sh_args; + char **sh_argv; + size_t sh_argv_bytes; + + rc = -1; + + if (file[0] == '\0') { + errno = ENOENT; + return rc; + } + + /* First phase. */ + sys_path = NULL; + pathnames = (string_vector)empty_vector; + + if (strchr (file, '/') == NULL) { + size_t file_len; + const char *sys_path_element, *scan; + bool finish; + + sys_path = get_path (); + if (sys_path == NULL) + return rc; + + if (sys_path[0] == '\0') { + errno = ENOENT; + goto free_sys_path; + } + + pathname = NULL; + file_len = strlen (file); + sys_path_element = sys_path; + scan = sys_path; + do { + assert (sys_path_element <= scan); + finish = (*scan == '\0'); + if (finish || *scan == ':') { + const char *sys_path_copy_start; + size_t sys_path_copy_size; + size_t sep_copy_size; + size_t pathname_size; + char *p; + + if (scan == sys_path_element) { + sys_path_copy_start = "."; + sys_path_copy_size = 1; + } else { + sys_path_copy_start = sys_path_element; + sys_path_copy_size = scan - sys_path_element; + } + + assert (sys_path_copy_size >= 1); + sep_copy_size = (sys_path_copy_start[sys_path_copy_size - 1] != '/'); + + if (ADD_OVERFLOW (sys_path_copy_size, sep_copy_size, &pathname_size) || + ADD_OVERFLOW (pathname_size, file_len, &pathname_size) || + ADD_OVERFLOW (pathname_size, 1u, &pathname_size)) { + errno = EOVERFLOW; + goto empty_pathnames; + } + + pathname = malloc (pathname_size); + if (pathname == NULL) + goto empty_pathnames; + p = pathname; + + memcpy (p, sys_path_copy_start, sys_path_copy_size); + p += sys_path_copy_size; + + memcpy (p, "/", sep_copy_size); + p += sep_copy_size; + + memcpy (p, file, file_len); + p += file_len; + + *p++ = '\0'; + + if (string_vector_append (&pathnames, pathname) == -1) + goto empty_pathnames; + /* Ownership transferred. */ + pathname = NULL; + + sys_path_element = scan + 1; + } + + ++scan; + } while (!finish); + } else { + pathname = strdup (file); + if (pathname == NULL) + return rc; + + if (string_vector_append (&pathnames, pathname) == -1) + goto empty_pathnames; + /* Ownership transferred. */ + pathname = NULL; + } + + /* Second phase. */ + if (num_args < 2) { + errno = EINVAL; + goto empty_pathnames; + } + if (ADD_OVERFLOW (num_args, 1u, &num_sh_args) || + MUL_OVERFLOW (num_sh_args, sizeof *sh_argv, &sh_argv_bytes)) { + errno = EOVERFLOW; + goto empty_pathnames; + } + sh_argv = malloc (sh_argv_bytes); + if (sh_argv == NULL) + goto empty_pathnames; + + /* Commit. */ + ctx->pathnames = pathnames; + ctx->sh_argv = sh_argv; + ctx->num_sh_args = num_sh_args; + rc = 0; + /* Fall through, for freeing temporaries. */ + +empty_pathnames: + if (rc == -1) { + free (pathname); + string_vector_empty (&pathnames); + } + +free_sys_path: + free (sys_path); + + return rc; +} + +void +nbd_internal_execvpe_uninit (struct execvpe *ctx) +{ + free (ctx->sh_argv); + ctx->num_sh_args = 0; + string_vector_empty (&ctx->pathnames); +} + +int +nbd_internal_fork_safe_execvpe (struct execvpe *ctx, const string_vector *argv, + char * const *envp) +{ + size_t pathname_idx; + + NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->pathnames.len > 0); + + pathname_idx = 0; + do { + (void)execve (ctx->pathnames.ptr[pathname_idx], argv->ptr, envp); + if (errno != EACCES && errno != ELOOP && errno != ENAMETOOLONG && + errno != ENOENT && errno != ENOTDIR) + break; + + ++pathname_idx; + } while (pathname_idx < ctx->pathnames.len); + + if (errno == ENOEXEC) { + char **sh_argp; + size_t argv_idx; + + NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->num_sh_args >= argv->len); + NBD_INTERNAL_FORK_SAFE_ASSERT (ctx->num_sh_args - argv->len == 1); + + sh_argp = ctx->sh_argv; + *sh_argp++ = argv->ptr[0]; + *sh_argp++ = ctx->pathnames.ptr[pathname_idx]; + for (argv_idx = 1; argv_idx < argv->len; ++argv_idx) + *sh_argp++ = argv->ptr[argv_idx]; + + (void)execve ("/bin/sh", ctx->sh_argv, envp); + } + + return -1; +} _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs