I'm new to the list, so please let me know if I'm asking in the wrong place.
We're running a Scyld Beowulf cluster on CentOS 5.9, and I'm trying to run some
Django admin commands on a compute node. The problem is that it can take three
to five minutes to launch ten MPI processes across the four compute nodes and
the head node. (We're using OpenMPI.)
I narrowed the delay down to smaller and smaller pieces of code until I was left
with two example scripts that just import a Python library and print timing
information. Here's the first one, which imports the collections module:
from datetime import datetime
t0 = datetime.now()
print 'started at {}'.format(t0)
import collections
print 'imported at {}'.format(datetime.now() - t0)
When I run that with mpirun -host n0 python cached_imports_collections.py, the
import initially takes about 10 seconds. Repeated runs get faster, though, and
eventually the import takes less than 0.01 seconds.
Running an equivalent script that imports the decimal module takes about 30
seconds and never speeds up like that; perhaps decimal is too large to be cached
completely.
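That script is identical apart from the import:
from datetime import datetime
t0 = datetime.now()
print 'started at {}'.format(t0)
import decimal
print 'imported at {}'.format(datetime.now() - t0)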
completely. I ran beostatus while the script was running, and I didn't see the
network traffic go over 100 kBps. For comparison, running wc on a 100MB file on
a compute node causes the network traffic to go over 3000 kBps.
I looked in the Scyld admin guide (PDF) and the reference guide (PDF) and found
the bplib command, which manages the list of libraries that are cached and
therefore not transmitted with the job processes. However, its list of library
directories already includes the Python installation.
Is there some way to increase the size of the bplib cache, or am I doing
something inefficient in the way I launch my Python processes?
Thanks,
Don Kirkby
British Columbia Centre for Excellence in HIV/AIDS
cfenet.ubc.ca