On Apr 4, 2012, at 6:01 AM, Michael Hanke wrote:

> Package: condor
> Version: 7.7.5~dfsg.1-2
> Severity: normal
> 
> Hi,
> 
> We are running a backport of the Debian package of Condor 7.7.5 from
> experimental on a cluster of Debian stable machines. Since upgrading
> from 7.7.4 we noticed an increased memory demand for pretty much all
> jobs.
> 
> I recently ran a week-long job that starts off at 10GB size and should
> not gain significant memory size throughout the process (as confirmed
> with Condor 7.7.4). After the upgrade to 7.7.5 the job continuously
> increases its memory demands and I have to kill it after two days when it
> exceeds 150GB consumption. However, the continuous growth is not limited
> to this particular job -- most types of long-running jobs on this machine
> are Python-based, though.
> 
> Looking into the 7.7.5 changelog I see a number of memory-related
> aspects, but nothing that is a perfect match. I checked that this is not
> just about Condor reporting increasing memory consumption, but the
> respective cluster nodes actually run out of memory, because the job
> grows and grows.
> 
> I'd be glad to get some feedback on what the problem could be and if
> there is a workaround.


I've scanned the changes in Condor 7.7.5 and I also don't see anything
that would explain a change in the memory behavior of jobs. I assume 
you're submitting your jobs under the vanilla universe.

Have you tried logging into the execute nodes and running the programs  
interactively? That's a good way to test if something other than Condor 
changed and is responsible for the difference.
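
For the comparison to be meaningful, it helps to record memory use the
same way in both cases. The following is only a rough sketch, assuming a
Linux execute node where /proc/<pid>/status is readable; the function
names, the sampling interval, and the sample count are my own placeholders
and are not part of Condor:

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text.

    Returns None if no VmRSS line is present (e.g. for kernel threads).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vmrss_kb(handle.read())


def watch(pid, interval=60, samples=10):
    """Print a timestamped RSS sample every `interval` seconds.

    `interval` and `samples` are arbitrary defaults chosen for illustration.
    """
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_rss_kb(pid), "kB")
        time.sleep(interval)
```

Calling watch() with the PID of the job process, once for a run started
by Condor and once for a run started by hand, gives two series that can
be compared directly.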

You can also try running condor_ssh_to_job while a job is running to get
an interactive session with the same environment as your job. There you
can examine the environment variables, etc. for any odd settings. You can
even submit a sleep job, then use condor_ssh_to_job to start your program
interactively in the environment Condor sets up, possibly tweaking
environment variables first.
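
One way to spot odd settings is to save the output of `env` from a normal
login shell and from the condor_ssh_to_job session, then compare the two.
The following is a minimal sketch, assuming each dump has one NAME=value
pair per line (multi-line values would be misparsed); the function names
are hypothetical helpers, not anything Condor provides:

```python
def parse_env(text):
    """Parse `env` output into a dict, assuming one NAME=value per line."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def diff_env(interactive, condor):
    """Map each differing variable to (interactive value, condor value).

    A value of None means the variable is absent from that environment.
    """
    report = {}
    for name in sorted(set(interactive) | set(condor)):
        a = interactive.get(name)
        b = condor.get(name)
        if a != b:
            report[name] = (a, b)
    return report
```

Reading the two saved dumps, passing each through parse_env(), and
printing the result of diff_env() lists only the variables that differ,
which is usually a much shorter list than either dump.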

Thanks and regards,
Jaime Frey
UW-Madison Condor Team



