On Sat, 31 Dec 2011, Slaven Rezic wrote:
> Maybe it's the same issue as reported in
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638409 for ImageMagick.
> I also saw a similar problem with a Perl script using either
> GraphicsMagick or ImageMagick on a virtual machine instance. After
> rebuilding graphicsmagick using the --without-openmp switch in configure
> everything worked OK.
Please try using the OMP_WAIT_POLICY environment variable as described at
http://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html#OMP_005fWAIT_005fPOLICY
and see whether setting OMP_WAIT_POLICY=PASSIVE in the server environment
makes much of the CPU issue go away.
Note that setting OMP_NUM_THREADS to a small number will help if the
server uses fork(), since the server processes then compete with each
other for CPUs. As a point of reference, Flickr (which uses the forking
model) sets OMP_NUM_THREADS=2 and also uses GOMP_CPU_AFFINITY to bind
worker processes to certain pre-determined CPUs.
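As a rough sketch of that kind of setup, the following builds a per-worker environment for a forking server. The helper name worker_env and the layout (each worker gets its own contiguous block of CPUs) are assumptions made for illustration, not Flickr's actual configuration. libgomp reads these variables at startup, so they need to be in place before the worker process starts.

```python
import os


def worker_env(worker_index, threads_per_worker=2, base_env=None):
    """Return an environment dict for one worker process.

    Hypothetical helper: assumes worker N should own the contiguous
    CPU block starting at N * threads_per_worker.
    """
    env = dict(os.environ if base_env is None else base_env)
    first_cpu = worker_index * threads_per_worker
    last_cpu = first_cpu + threads_per_worker - 1

    # Keep each worker's OpenMP team small so workers do not fight for CPUs.
    env["OMP_NUM_THREADS"] = str(threads_per_worker)
    # Idle OpenMP threads should sleep rather than spin.
    env["OMP_WAIT_POLICY"] = "PASSIVE"
    # GOMP_CPU_AFFINITY accepts single CPUs or ranges such as "6-7".
    if first_cpu == last_cpu:
        env["GOMP_CPU_AFFINITY"] = str(first_cpu)
    else:
        env["GOMP_CPU_AFFINITY"] = f"{first_cpu}-{last_cpu}"
    return env
```

The result can be passed as the env argument of subprocess.Popen (or applied before exec) when the server launches each worker.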
GOMP does seem to be more wasteful of CPU than some other OpenMP
implementations. It is pretty typical for an OpenMP implementation to
try a busy spin-lock for a short time (e.g. 10ms) before falling back to
an OS mutex. This may even apply while an idle OpenMP thread is
waiting for a task assignment. If the thread is always in the busy
spin-lock then it will consume 100% CPU. Even if it is in the busy
spin-lock for a shorter time, it will be reported as having consumed
100% CPU, since Linux evaluates CPU consumption based on whether the
thread was scheduled during that scheduler tick. It seems that when
OMP_WAIT_POLICY is set to ACTIVE, GOMP uses the busy spin-lock, while if
it is set to PASSIVE it goes straight to the OS mutex. If it is unset,
it likely uses a mix.
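To make the spin-then-block idea concrete, here is a minimal Python sketch of the general technique. It is not libgomp's implementation; the function name hybrid_wait and the 10ms default are assumptions chosen only to mirror the description above.

```python
import threading
import time


def hybrid_wait(event, spin_seconds=0.010):
    """Wait for `event`: busy-poll first, then block in the OS.

    Returns "spin" if the event arrived during the busy-poll phase,
    or "block" if the caller had to fall back to a blocking wait.
    """
    deadline = time.monotonic() + spin_seconds
    # Busy phase: burns CPU the whole time, but reacts with low latency.
    while time.monotonic() < deadline:
        if event.is_set():
            return "spin"
    # Passive phase: the thread sleeps and uses no CPU until woken.
    event.wait()
    return "block"
```

In these terms, a spin_seconds of zero behaves roughly like the PASSIVE policy, and a very large value roughly like ACTIVE.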
Check out the difference in the reported user (actually user + system)
time consumed and the computed "iter/s cpu" for these various
settings:
Minimal operations (memory-bandwidth bound):
% env gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 251 iter 65.43s user 5.01s total 50.100 iter/s (3.836 iter/s cpu)
% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 180 iter 116.18s user 5.01s total 35.928 iter/s (1.549 iter/s cpu)
% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 232 iter 39.06s user 5.01s total 46.307 iter/s (5.940 iter/s cpu)
Rotate image by 90 degrees (lots of thread contention):
% gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 71 iter 54.86s user 5.05s total 14.059 iter/s (1.294 iter/s cpu)
% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 55 iter 116.00s user 5.07s total 10.848 iter/s (0.474 iter/s cpu)
% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 71 iter 45.29s user 5.07s total 14.004 iter/s (1.568 iter/s cpu)
Image resize:
% gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 38 iter 100.31s user 5.09s total 7.466 iter/s (0.379 iter/s cpu)
% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 39 iter 118.29s user 5.01s total 7.784 iter/s (0.330 iter/s cpu)
% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 37 iter 88.86s user 5.08s total 7.283 iter/s (0.416 iter/s cpu)
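For anyone comparing their own runs: the two rates on each Results line appear to be iterations divided by total (wall-clock) time and iterations divided by user time, which matches every figure above. A small sketch for recomputing them from a Results line follows; the regular expression and the function name rates are assumptions based only on the output format shown here, and the line must be unwrapped first.

```python
import re

# Pattern inferred from the "Results:" lines in this message.
RESULT_RE = re.compile(
    r"Results:\s+(?P<threads>\d+) threads\s+(?P<iters>\d+) iter\s+"
    r"(?P<user>[\d.]+)s user\s+(?P<total>[\d.]+)s total"
)


def rates(line):
    """Return (iterations per wall-clock second, iterations per CPU second)."""
    match = RESULT_RE.search(line)
    if match is None:
        raise ValueError("not a gm benchmark Results line")
    iters = int(match["iters"])
    user = float(match["user"])    # user (actually user + system) CPU time
    total = float(match["total"])  # elapsed wall-clock time
    return iters / total, iters / user
```

The second value is the one to watch when judging wasted CPU: in the -noop runs it goes from about 1.5 with ACTIVE to about 5.9 with PASSIVE while wall-clock throughput stays in the same range.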
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/