Wed, 20 Jul 2011 11:31:41 +0000, Pauli Virtanen wrote:
[clip]
> There is a sharp order-of-magnitude change of speed in malloc+memset of
> an array, which is not present in memset itself. (This is then also
> reflected in the Numpy performance -- floating point operations probably
> don't cost much compared to memory access speed.) It seems that either
> the kernel or the C library changes the way it handles allocation at
> that point.
The explanation seems to be the following:

(a) When the process adjusts the size of its heap, the kernel must zero
    the new pages it gives to the process (because they might contain
    sensitive information from other processes) [1]

(b) GNU libc hangs onto some memory even after free() is called, so that
    the heap size doesn't need to be adjusted continuously. This is
    controlled by parameters that can be tuned with the mallopt()
    function. [2]

Because of (a), there is a performance hit -- probably equivalent to
`memset(buf, 0, size)` or more (kernel overheads?) -- for using newly
allocated memory the first time. But because of (b), this hit mainly
applies to buffers larger than some threshold.

Preallocating can get rid of this overhead, but it probably only matters
in places where you reuse the same memory many times, and the operations
done are not much more expensive than whatever the kernel needs to do.

Alternatively, you can call

    mallopt(M_TRIM_THRESHOLD, N);
    mallopt(M_TOP_PAD, N);
    mallopt(M_MMAP_MAX, 0);

with a large enough `N`, and let libc manage the memory reuse for you.

.. [1] http://stackoverflow.com/questions/1327261
.. [2] http://www.gnu.org/s/hello/manual/libc/Malloc-Tunable-Parameters.html

-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
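For concreteness, the mallopt() tuning described above might be wrapped up as follows (glibc-specific; the wrapper name and the suggestion of calling it early in the process are just illustration):

```c
#include <malloc.h>   /* mallopt(), M_TRIM_THRESHOLD, M_TOP_PAD, M_MMAP_MAX (glibc) */

/* Tell glibc to hang onto up to `threshold` bytes of freed memory
 * instead of returning it to the kernel, and to stop using mmap for
 * large allocations (mmap'd blocks are always unmapped on free()).
 * mallopt() returns 1 on success and 0 on failure. */
static int keep_freed_memory(int threshold)
{
    if (mallopt(M_TRIM_THRESHOLD, threshold) == 0) return 0;
    if (mallopt(M_TOP_PAD, threshold) == 0) return 0;
    if (mallopt(M_MMAP_MAX, 0) == 0) return 0;
    return 1;
}
```

After calling, say, keep_freed_memory(128 * 1024 * 1024) at startup, a loop that repeatedly allocates and frees a large buffer should pay the kernel's page-zeroing cost only on the first iteration.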