On Sat, 31 Dec 2011, Slaven Rezic wrote:

Maybe it's the same issue as reported in
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638409 for ImageMagick.

I also saw a similar problem with a Perl script using either
GraphicsMagick or ImageMagick on a virtual machine instance. After
rebuilding graphicsmagick using the --without-openmp switch in configure
everything worked OK.

Please try using the OMP_WAIT_POLICY environment variable, as described at http://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html#OMP_005fWAIT_005fPOLICY

See if setting OMP_WAIT_POLICY=PASSIVE in the server environment makes much of the CPU issue go away.

Note that setting OMP_NUM_THREADS to a small number will help if the server uses fork(), since the server processes are then competing with each other. As a point of reference, Flickr (which uses the forking model) sets OMP_NUM_THREADS=2 and also uses GOMP_CPU_AFFINITY to bind worker processes to certain pre-determined CPUs.
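
As a rough sketch of what such a server environment might look like (the CPU list "0-1" and the server path are placeholders I made up for illustration, not values taken from Flickr or from the bug report), the variables can be exported before the server starts:

```shell
# Sketch only: OpenMP settings for a forking server that calls GraphicsMagick.
# OMP_WAIT_POLICY, OMP_NUM_THREADS and GOMP_CPU_AFFINITY are real libgomp
# variables; the values chosen here are assumptions to be tuned per machine.

export OMP_WAIT_POLICY=PASSIVE   # ask idle OpenMP threads to block rather than spin
export OMP_NUM_THREADS=2         # small per-process thread count for forked workers
export GOMP_CPU_AFFINITY="0-1"   # hypothetical CPU list; adjust to the real topology

# Show what the child process will inherit.
env | grep -E '^(OMP_|GOMP_)' | sort

# exec /path/to/your/server      # hypothetical server command, left commented out
```

Whether the affinity setting helps probably depends on how many worker processes share the machine, so it may be best to start with the first two variables only.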

GOMP does seem to waste more CPU than some other OpenMP implementations. It is pretty typical for an OpenMP implementation to try a busy spin-lock for a short time (e.g. 10ms) before falling back to an OS mutex. This may happen even while an idle OpenMP thread is waiting for a task assignment. If the thread is always in the busy spin-lock then it will consume 100% CPU. Even if it is in the busy spin-lock for a shorter time, it will be reported as having consumed 100% CPU, since Linux evaluates CPU consumption based on whether the thread was scheduled during that scheduler tick. It seems that when OMP_WAIT_POLICY is set to ACTIVE, GOMP uses the busy spin-lock, while if it is set to PASSIVE it goes straight to the OS mutex. If the variable is not set, it likely uses a mix.

Check out the difference in the reported user (actually user + system) time consumed and the computed "iter/s cpu" for these various settings:

Minimal operations (memory-bandwidth bound):

% env gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 251 iter 65.43s user 5.01s total 50.100 iter/s (3.836 iter/s cpu)

% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 180 iter 116.18s user 5.01s total 35.928 iter/s (1.549 iter/s cpu)

% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -noop null:
Results: 24 threads 232 iter 39.06s user 5.01s total 46.307 iter/s (5.940 iter/s cpu)

Rotate image by 90 degrees (lots of thread contention):

% gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 71 iter 54.86s user 5.05s total 14.059 iter/s (1.294 iter/s cpu)

% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 55 iter 116.00s user 5.07s total 10.848 iter/s (0.474 iter/s cpu)

% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -rotate 90 null:
Results: 24 threads 71 iter 45.29s user 5.07s total 14.004 iter/s (1.568 iter/s cpu)

Image resize:

% gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 38 iter 100.31s user 5.09s total 7.466 iter/s (0.379 iter/s cpu)

% env OMP_WAIT_POLICY=ACTIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 39 iter 118.29s user 5.01s total 7.784 iter/s (0.330 iter/s cpu)

% env OMP_WAIT_POLICY=PASSIVE gm benchmark -duration 5 convert -size 4000x3000 tile:model.pnm -resize 50% null:
Results: 24 threads 37 iter 88.86s user 5.08s total 7.283 iter/s (0.416 iter/s cpu)
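
For anyone wanting to compare their own runs: judging from the figures above, "iter/s cpu" appears to be the iteration count divided by the reported user time. The small helper below (iter_per_cpu is a name I made up, it is not part of gm) recomputes that figure for the three -noop runs under that assumption:

```shell
# Sketch: recompute "iter/s cpu" as iterations / reported user seconds.
# iter_per_cpu is a hypothetical helper, not a GraphicsMagick command.
iter_per_cpu() {
    # $1 = iteration count, $2 = user CPU seconds from the Results line
    awk -v iter="$1" -v user="$2" 'BEGIN { printf "%.3f\n", iter / user }'
}

iter_per_cpu 251 65.43    # default wait policy
iter_per_cpu 180 116.18   # OMP_WAIT_POLICY=ACTIVE
iter_per_cpu 232 39.06    # OMP_WAIT_POLICY=PASSIVE
```

The higher the result, the less CPU time was spent per completed iteration, which is the number to watch on a shared server.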

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
