Package: dmtcp
Version: 1.2.4-1
Severity: normal

Setting --interval for dmtcp_coordinator has no effect (the same is true of the DMTCP_CHECKPOINT_INTERVAL environment variable). This seems to be a new bug; it worked prior to 1.2.4.

The coordinator log below illustrates what happens. Initially the coordinator has the correct checkpoint interval, but as soon as a client connects the interval is reset to zero. Afterwards it can be re-enabled via dmtcp_command, and checkpoints are then created at the desired interval.

Consequently, repeated automated checkpointing in Condor also doesn't work with 1.2.4, as the interval is only given to the coordinator once, at startup.

Here is the log:

michael@meiner ~/debian/dmtcptest % DMTCP_CHECKPOINT_INTERVAL=30 dmtcp_coordinator
dmtcp_coordinator starting...
    Port: 7779
    Checkpoint Interval: 30
    Exit on last client: 0
Type '?' for help.

i
Checkpoint Interval: 30
[30806] NOTE at dmtcp_coordinator.cpp:1020 in onConnect; REASON='worker connected'
     hello_remote.from = 1313a2c6-30822-4f57196d(-1)
[30806] NOTE at dmtcp_coordinator.cpp:1026 in onConnect; REASON='CheckpointInterval Updated'
     oldInterval = 30
     theCheckpointInterval = 0
i
Checkpoint Interval: Disabled (checkpoint manually instead)

<HERE I MANUALLY REENABLE THE INTERVAL VIA DMTCP_COMMAND>

Checkpoint Interval: 30
[30806] NOTE at dmtcp_coordinator.cpp:1294 in startCheckpoint; REASON='starting checkpoint, suspending all nodes'
     s.numPeers = 1
[30806] NOTE at dmtcp_coordinator.cpp:1296 in startCheckpoint; REASON='Incremented Generation'
     UniquePid::ComputationId().generation() = 1
[30806] NOTE at dmtcp_coordinator.cpp:630 in onData; REASON='locking all nodes'
[30806] NOTE at dmtcp_coordinator.cpp:665 in onData; REASON='draining all nodes'
[30806] NOTE at dmtcp_coordinator.cpp:671 in onData; REASON='checkpointing all nodes'
[30806] NOTE at dmtcp_coordinator.cpp:681 in onData; REASON='building name service database'
[30806] NOTE at dmtcp_coordinator.cpp:700 in onData; REASON='entertaining queries now'
[30806] NOTE at dmtcp_coordinator.cpp:705 in onData; REASON='refilling all nodes'
[30806] NOTE at dmtcp_coordinator.cpp:734 in onData; REASON='restarting all nodes'
[30806] NOTE at dmtcp_coordinator.cpp:905 in onDisconnect; REASON='client disconnected'
     client.identity() = 1313a2c6-30822-4f57196d
^C[30806] NOTE at dmtcp_coordinator.cpp:522 in handleUserCommand; REASON='killing all connected peers and quitting ...'
DMTCP coordinator exiting... (per request)

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 3.1.0-1-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages dmtcp depends on:
ii  libc6       2.13-26
ii  libgcc1     1:4.6.1-4
ii  libmtcp1    1.2.4-1
ii  libstdc++6  4.6.1-4

dmtcp recommends no packages.

dmtcp suggests no packages.

-- no debconf information