Pádraig Brady wrote:
> num_processors() already uses _NPROCESSORS_ONLN (online processors)
> so I then wondered how this be different to that returned by
> pthread_getaffinity_np() ?
>
> A quick google for cpuset shows:
> http://www.kernel.org/doc/man-pages/online/pages/man7/cpuset.7.html
>
> Also this is what sysconf seems to query for the variables above:
> $ strace -e open getconf _NPROCESSORS_ONLN
> open("/proc/stat"
> $ strace -e open getconf _NPROCESSORS_CONF
> open("/sys/devices/system/cpu"
>
> So looking at the /proc/stat code:
> http://lxr.linux.no/#linux+v2.6.31/fs/proc/stat.c
> Shows it calls for_each_online_cpu()
> Which according to the following is each CPU available to scheduler:
> http://lxr.linux.no/#linux+v2.6.31/include/linux/cpumask.h#L451
> However that's system wide and a particular process
> could be in a smaller cpuset.
>
> pthread_getaffinity_np instead calls sched_getaffinity which
> can return a smaller set as seen here:
> http://lxr.linux.no/#linux+v2.6.31/kernel/sched.c#L6484
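
To make the difference described in the quoted text concrete, here is a minimal sketch that obtains both numbers. It assumes Linux with glibc >= 2.6 (for sched_getaffinity and the CPU_COUNT macro); the helper names count_online_cpus and count_affinity_cpus are made up for illustration and are not part of gnulib or of any patch in this thread.

```c
/* Sketch only: assumes Linux with glibc >= 2.6 (sched_getaffinity, CPU_COUNT).
   The helper names below are hypothetical and not part of gnulib.  */
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/* Number of CPUs currently online, system wide; -1 on failure.  */
long
count_online_cpus (void)
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  return n > 0 ? n : -1;
}

/* Number of CPUs the calling process is allowed to run on; -1 on failure
   (for example when the machine has more CPUs than CPU_SETSIZE).  */
long
count_affinity_cpus (void)
{
  cpu_set_t set;

  CPU_ZERO (&set);
  if (sched_getaffinity (0, sizeof (set), &set) != 0)
    return -1;
  return CPU_COUNT (&set);
}
```

A program that prints both values and is started under "taskset -c 0" should report an affinity count of 1 while the online count stays unchanged, which is the case a system-wide query cannot see.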

Thanks for presenting these investigations.

> I do wonder though whether it would be better
> to have num_processors() try to return this by default?

Certainly, yes. The implementation of omp_get_num_threads() in GCC's libgomp
does the same thing.

> Also I'm wondering why you used the pthread interface to this?
> I didn't notice pthread_getaffinity_np() in POSIX for example
> (is that what the _np represents?), so why not call sched_getaffinity
> directly without needing to link with the pthread library.
> From experience the sched_getaffinity() call has been a moving target:
> http://www.pixelbeat.org/programming/gcc/c_c++_notes.html#affinity
> but it has been stable for a long time and one could just check
> for the current stable interface.

Good point. Additionally, NetBSD 5 has a pthread_getaffinity_np function too,
but with a different API! (cpu_set_t vs. cpuset_t.) On that platform, it's
based on sched_getaffinity_np(), which also has a different API from
sched_getaffinity() in glibc. But at least it's a different function name.

> Right. So in that case I would push the sched_getaffinity()
> down into num_processors in gnulib.

Yes, and by the same reasoning the check of the environment variable
OMP_NUM_THREADS (which I don't see in Giuseppe's patch) belongs here as well.

Here is a proposed change to the gnulib 'nproc' module. It will require
changes (simplification) on Giuseppe's side, of course.


2009-11-01  Bruno Haible  <br...@clisp.org>

	Make num_processors more flexible and consistent.
	* lib/nproc.h (enum nproc_query): New type.
	(num_processors): Add a 'query' argument.
	* lib/nproc.c: Include <stdlib.h>, <sched.h>, c-ctype.h.
	(num_processors): Add a 'query' argument. Test the value of the
	OMP_NUM_THREADS environment variable if requested. On Linux, NetBSD,
	mingw, count the number of CPUs available for the current process.
	* m4/nproc.m4 (gl_PREREQ_NPROC): Require AC_USE_SYSTEM_EXTENSIONS.
	Check for sched_getaffinity and sched_getaffinity_np.
	* modules/nproc (Depends-on): Add c-ctype, extensions.

*** NEWS.orig	2009-11-01 14:55:37.000000000 +0100
--- NEWS	2009-11-01 14:20:47.000000000 +0100
***************
*** 6,11 ****
--- 6,13 ----
  
  Date        Modules         Changes
  
+ 2009-11-01  nproc           The num_processors function now takes an argument.
+ 
  2009-10-10  utimens         The use of this module now requires linking with
                              $(LIB_CLOCK_GETTIME).
  
*** lib/nproc.h.orig	2009-11-01 14:55:37.000000000 +0100
--- lib/nproc.h	2009-11-01 14:20:57.000000000 +0100
***************
*** 16,29 ****
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
  
! /* Written by Glen Lenker. */
  
  /* Allow the use in C++ code. */
  #ifdef __cplusplus
  extern "C" {
  #endif
  
! unsigned long int num_processors (void);
  
  #ifdef __cplusplus
  }
--- 16,46 ----
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
  
! /* Written by Glen Lenker and Bruno Haible. */
  
  /* Allow the use in C++ code. */
  #ifdef __cplusplus
  extern "C" {
  #endif
  
! /* A "processor" in this context means a thread execution unit, that is either
!    - an execution core in a (possibly multi-core) chip, in a (possibly multi-
!      chip) module, in a single computer, or
!    - a thread execution unit inside a core
!      (hyper-threading, see <http://en.wikipedia.org/wiki/Hyper-threading>).
!    Which of the two definitions is used, is unspecified. */
! 
! enum nproc_query
! {
!   NPROC_ALL,                 /* total number of processors */
!   NPROC_CURRENT,             /* processors available to the current process */
!   NPROC_CURRENT_OVERRIDABLE  /* likewise, but overridable through the
!                                 OMP_NUM_THREADS environment variable */
! };
! 
! /* Return the total number of processors. The result is guaranteed to
!    be at least 1. */
! extern unsigned long int num_processors (enum nproc_query query);
  
  #ifdef __cplusplus
  }
*** lib/nproc.c.orig	2009-11-01 14:55:37.000000000 +0100
--- lib/nproc.c	2009-11-01 14:54:52.000000000 +0100
***************
*** 16,28 ****
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
  
! /* Written by Glen Lenker. */
  
  #include <config.h>
  #include "nproc.h"
  
  #include <unistd.h>
  
  #include <sys/types.h>
  
  #if HAVE_SYS_PSTAT_H
--- 16,37 ----
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
  
! /* Written by Glen Lenker and Bruno Haible. */
  
  #include <config.h>
  #include "nproc.h"
  
+ #include <stdlib.h>
  #include <unistd.h>
+ #if HAVE_PTHREAD_AFFINITY_NP && 0
+ # include <pthread.h>
+ # include <sched.h>
+ #endif
+ #if HAVE_SCHED_GETAFFINITY || HAVE_SCHED_GETAFFINITY_NP
+ # include <sched.h>
+ #endif
  
+ 
  #include <sys/types.h>
  
  #if HAVE_SYS_PSTAT_H
***************
*** 46,73 ****
  # include <windows.h>
  #endif
  
  #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
  
- /* Return the total number of processors. The result is guaranteed to
-    be at least 1. */
  unsigned long int
! num_processors (void)
  {
  #if defined _SC_NPROCESSORS_ONLN
!   { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin,
!        Haiku. */
!     long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
!     if (0 < nprocs)
!       return nprocs;
!   }
  #endif
  
  #if HAVE_PSTAT_GETDYNAMIC
    { /* This works on HP-UX. */
      struct pst_dynamic psd;
!     if (0 <= pstat_getdynamic (&psd, sizeof psd, 1, 0)
!         && 0 < psd.psd_proc_cnt)
!       return psd.psd_proc_cnt;
    }
  #endif
  
--- 55,269 ----
  # include <windows.h>
  #endif
  
+ #include "c-ctype.h"
+ 
  #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
  
  unsigned long int
! num_processors (enum nproc_query query)
  {
+   if (query == NPROC_CURRENT_OVERRIDABLE)
+     {
+       /* Test the environment variable OMP_NUM_THREADS, recognized also by all
+          programs that are based on OpenMP. The OpenMP spec says that the
+          value assigned to the environment variable "may have leading and
+          trailing white space". */
+       const char *envvalue = getenv ("OMP_NUM_THREADS");
+ 
+       if (envvalue != NULL)
+         {
+           while (*envvalue != '\0' && c_isspace (*envvalue))
+             envvalue++;
+           /* Convert it from decimal to 'unsigned long'. */
+           if (c_isdigit (*envvalue))
+             {
+               char *endptr = NULL;
+               unsigned long int value = strtoul (envvalue, &endptr, 10);
+ 
+               if (endptr != NULL)
+                 {
+                   while (*endptr != '\0' && c_isspace (*endptr))
+                     endptr++;
+                   if (*endptr == '\0')
+                     return (value > 0 ? value : 1);
+                 }
+             }
+         }
+ 
+       query = NPROC_CURRENT;
+     }
+   /* Here query is one of NPROC_ALL, NPROC_CURRENT. */
+ 
+   if (query == NPROC_CURRENT)
+     {
+       /* glibc >= 2.3.3 with NPTL and NetBSD 5 have pthread_getaffinity_np,
+          but with different APIs. Also it requires linking with -lpthread.
+          Therefore this code is not enabled.
+          glibc >= 2.3.4 has sched_getaffinity whereas NetBSD 5 has
+          sched_getaffinity_np. */
+ #if HAVE_PTHREAD_AFFINITY_NP && defined __GLIBC__ && 0
+       {
+         cpu_set_t set;
+ 
+         if (pthread_getaffinity_np (pthread_self (), sizeof (set), &set) == 0)
+           {
+             unsigned long count;
+ 
+ # ifdef CPU_COUNT
+             /* glibc >= 2.6 has the CPU_COUNT macro. */
+             count = CPU_COUNT (&set);
+ # else
+             size_t i;
+ 
+             count = 0;
+             for (i = 0; i < CPU_SETSIZE; i++)
+               if (CPU_ISSET (i, &set))
+                 count++;
+ # endif
+             if (count > 0)
+               return count;
+           }
+       }
+ #elif HAVE_PTHREAD_AFFINITY_NP && defined __NetBSD__ && 0
+       {
+         cpuset_t *set;
+ 
+         set = cpuset_create ();
+         if (set != NULL)
+           {
+             unsigned long count = 0;
+ 
+             if (pthread_getaffinity_np (pthread_self (), cpuset_size (set), set)
+                 == 0)
+               {
+                 cpuid_t i;
+ 
+                 for (i = 0;; i++)
+                   {
+                     int ret = cpuset_isset (i, set);
+                     if (ret < 0)
+                       break;
+                     if (ret > 0)
+                       count++;
+                   }
+               }
+             cpuset_destroy (set);
+             if (count > 0)
+               return count;
+           }
+       }
+ #elif HAVE_SCHED_GETAFFINITY /* glibc >= 2.3.4 */
+       {
+         cpu_set_t set;
+ 
+         if (sched_getaffinity (0, sizeof (set), &set) == 0)
+           {
+             unsigned long count;
+ 
+ # ifdef CPU_COUNT
+             /* glibc >= 2.6 has the CPU_COUNT macro. */
+             count = CPU_COUNT (&set);
+ # else
+             size_t i;
+ 
+             count = 0;
+             for (i = 0; i < CPU_SETSIZE; i++)
+               if (CPU_ISSET (i, &set))
+                 count++;
+ # endif
+             if (count > 0)
+               return count;
+           }
+       }
+ #elif HAVE_SCHED_GETAFFINITY_NP /* NetBSD >= 5 */
+       {
+         cpuset_t *set;
+ 
+         set = cpuset_create ();
+         if (set != NULL)
+           {
+             unsigned long count = 0;
+ 
+             if (sched_getaffinity_np (getpid (), cpuset_size (set), set) == 0)
+               {
+                 cpuid_t i;
+ 
+                 for (i = 0;; i++)
+                   {
+                     int ret = cpuset_isset (i, set);
+                     if (ret < 0)
+                       break;
+                     if (ret > 0)
+                       count++;
+                   }
+               }
+             cpuset_destroy (set);
+             if (count > 0)
+               return count;
+           }
+       }
+ #endif
+ 
+ #if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+       { /* This works on native Windows platforms. */
+         DWORD_PTR process_mask;
+         DWORD_PTR system_mask;
+ 
+         if (GetProcessAffinityMask (GetCurrentProcess (),
+                                     &process_mask, &system_mask))
+           {
+             DWORD_PTR mask = process_mask;
+             unsigned long count = 0;
+ 
+             for (; mask != 0; mask = mask >> 1)
+               if (mask & 1)
+                 count++;
+             if (count > 0)
+               return count;
+           }
+       }
+ #endif
+ 
  #if defined _SC_NPROCESSORS_ONLN
!       { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris,
!            Cygwin, Haiku. */
!         long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
!         if (nprocs > 0)
!           return nprocs;
!       }
! #endif
!     }
!   else /* query == NPROC_ALL */
!     {
! #if defined _SC_NPROCESSORS_CONF
!       { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris,
!            Cygwin, Haiku. */
!         long int nprocs = sysconf (_SC_NPROCESSORS_CONF);
!         if (nprocs > 0)
!           return nprocs;
!       }
  #endif
+     }
  
  #if HAVE_PSTAT_GETDYNAMIC
    { /* This works on HP-UX. */
      struct pst_dynamic psd;
!     if (pstat_getdynamic (&psd, sizeof psd, 1, 0) >= 0)
!       {
!         /* The field psd_proc_cnt contains the number of active processors.
!            In newer releases of HP-UX 11, the field psd_max_proc_cnt includes
!            deactivated processors. */
!         if (query == NPROC_CURRENT)
!           {
!             if (psd.psd_proc_cnt > 0)
!               return psd.psd_proc_cnt;
!           }
!         else
!           {
!             if (psd.psd_max_proc_cnt > 0)
!               return psd.psd_max_proc_cnt;
!           }
!       }
    }
  #endif
  
***************
*** 75,87 ****
    { /* This works on IRIX. */
      /* MP_NPROCS yields the number of installed processors.
         MP_NAPROCS yields the number of processors available to unprivileged
!        processes. We need the latter. */
!     int nprocs = sysmp (MP_NAPROCS);
!     if (0 < nprocs)
        return nprocs;
    }
  #endif
  
  #if HAVE_SYSCTL && defined HW_NCPU
    { /* This works on MacOS X, FreeBSD, NetBSD, OpenBSD. */
      int nprocs;
--- 271,289 ----
    { /* This works on IRIX. */
      /* MP_NPROCS yields the number of installed processors.
         MP_NAPROCS yields the number of processors available to unprivileged
!        processes. */
!     int nprocs =
!       sysmp (query == NPROC_CURRENT && getpid () != 0
!              ? MP_NAPROCS
!              : MP_NPROCS);
!     if (nprocs > 0)
        return nprocs;
    }
  #endif
  
+   /* Finally, as fallback, use the APIs that don't distinguish between
+      NPROC_CURRENT and NPROC_ALL. */
+ 
  #if HAVE_SYSCTL && defined HW_NCPU
    { /* This works on MacOS X, FreeBSD, NetBSD, OpenBSD. */
      int nprocs;
*** m4/nproc.m4.orig	2009-11-01 14:55:37.000000000 +0100
--- m4/nproc.m4	2009-11-01 14:31:13.000000000 +0100
***************
*** 1,4 ****
! # nproc.m4 serial 3
  dnl Copyright (C) 2009 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
--- 1,4 ----
! # nproc.m4 serial 4
  dnl Copyright (C) 2009 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
***************
*** 12,17 ****
--- 12,19 ----
  # Prerequisites of lib/nproc.c.
  AC_DEFUN([gl_PREREQ_NPROC],
  [
+   dnl Persuade glibc <sched.h> to declare CPU_SETSIZE, CPU_ISSET etc.
+   AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
    AC_CHECK_HEADERS([sys/pstat.h sys/sysmp.h sys/param.h],,,
      [AC_INCLUDES_DEFAULT])
    dnl <sys/sysctl.h> requires <sys/param.h> on OpenBSD 4.0.
***************
*** 21,25 ****
       # include <sys/param.h>
       #endif
      ])
!   AC_CHECK_FUNCS([pstat_getdynamic sysmp sysctl])
  ])
--- 23,28 ----
       # include <sys/param.h>
       #endif
      ])
!   AC_CHECK_FUNCS([sched_getaffinity sched_getaffinity_np \
!                   pstat_getdynamic sysmp sysctl])
  ])
*** modules/nproc.orig	2009-11-01 14:55:37.000000000 +0100
--- modules/nproc	2009-11-01 14:31:44.000000000 +0100
***************
*** 7,12 ****
--- 7,14 ----
  m4/nproc.m4
  
  Depends-on:
+ c-ctype
+ extensions
  unistd
  
  configure.ac: