On Sun, Nov 05, 2006 at 10:41:50AM +0100, Ed Schofield <[EMAIL PROTECTED]> was heard to say:
> Update: I can run aptitude fine under valgrind. It prints out lots of
> messages describing (real or imagined) memory errors, but it runs
> without segfaulting. Running the same command without valgrind produces
> a segfault every time. Valgrind's memcheck tool traps every single
> memory access and performs it synthetically. It also uses much _more_
> memory than the program would alone. So I think we can conclude that
> this segfault is not just due to my machine running out of RAM, but due
> to old-fashioned heap corruption.
>
> I'll attach the valgrind log file below. Perhaps the lines
>
> ==1444== Mismatched free() / delete / delete []
> ==1444== at 0x401CCBC: operator delete(void*) (vg_replace_malloc.c:244)
> ==1444== by 0x82708E2: reset_surrounding_or_memoization() (apt.cc:89)
>
> indicate the problem??

Might be.

> ==1444== Conditional jump or move depends on uninitialised value(s)
> ==1444== at 0x826D942:
> aptitudeDepCache::build_selection_list(OpProgress&, bool, bool, char
> const*) (aptcache.cc:349)
> ==1444== by 0x826E71E: aptitudeDepCache::Init(OpProgress*, bool,
> bool, char const*) (aptcache.cc:193)
> ==1444== by 0x826E94F: aptitudeCacheFile::Open(OpProgress&, bool,
> bool, char const*) (aptcache.cc:1638)
> ==1444== by 0x8272E1C: apt_load_cache(OpProgress*, bool, char const*)
> (apt.cc:295)
> ==1444== by 0x81D8022: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:37)
> ==1444== by 0x80E6C08: main (main.cc:480)

This looks like it might be a real bug; I'm not sure how it's lasted so
long. Apparently one of aptitude's internal state parameters,
indicating the current state of a package, is being left uninitialized
when the program starts up. I think that "Unknown" is probably right
here.

> ==1444== Conditional jump or move depends on uninitialised value(s)
> ==1444== at 0x4069187: pkgTagSection::Scan(char const*, unsigned
> long) (in /usr/lib/libapt-pkg-libc6.3-6.so.3.11.0)
> ==1444== by 0x82C1075: insert_tags(pkgCache::VerIterator const&,
> pkgCache::VerFileIterator const&) (tags.cc:164)
> ==1444== by 0x82C1A6E: load_tags(OpProgress&) (tags.cc:221)
> ==1444== by 0x8272FB3: apt_load_cache(OpProgress*, bool, char const*)
> (apt.cc:331)
> ==1444== by 0x81D8022: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:37)
> ==1444== by 0x80E6C08: main (main.cc:480)

I'm not sure where this comes from. It looks to me like the values
that should influence Scan's behavior are all either initialized by
aptitude or generated by apt routines. I know that I've noticed
valgrind apparently being confused by references into the apt cache in
the past; maybe that's what this is.

> ==1444== Mismatched free() / delete / delete []
> ==1444== at 0x401CCBC: operator delete(void*) (vg_replace_malloc.c:244)
> ==1444== by 0x82708E2: reset_surrounding_or_memoization() (apt.cc:89)
> ==1444== by 0x8270EB1: apt_close_cache() (signal.h:544)
> ==1444== by 0x828054E:
> download_install_manager::finish(pkgAcquire::RunResult, OpProgress&)
> (download_install_manager.cc:179)
> ==1444== by 0x81D98CA: cmdline_do_download(download_manager*)
> (cmdline_util.cc:185)
> ==1444== by 0x81D8474: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:110)
> ==1444== by 0x80E6C08: main (main.cc:480)
> ==1444== Address 0x6F67028 is 0 bytes inside a block of size 616,560
> alloc'd
> ==1444== at 0x401D7C1: operator new[](unsigned) (vg_replace_malloc.c:195)
> ==1444== by 0x8271CD7: surrounding_or(pkgCache::DepIterator,
> pkgCache::DepIterator&, pkgCache::DepIterator&, pkgCache*) (apt.cc:479)
> ==1444== by 0x8272489: package_recommended(pkgCache::PkgIterator
> const&) (apt.cc:570)
> ==1444== by 0x81C80D5: cmdline_show_preview(bool,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&, bool, bool, bool, int)
> (cmdline_prompt.cc:493)
> ==1444== by 0x81C873D: cmdline_do_prompt(bool,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&, bool, bool, bool, bool, int,
> bool, bool) (cmdline_prompt.cc:736)
> ==1444== by 0x81D8434: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:99)
> ==1444== by 0x80E6C08: main (main.cc:480)

That looks like a definite aptitude bug. I don't know if it's causing
your crash, though.

Could you see what happens if you apply the attached patch?

Thanks,
  Daniel

diff -rN -u old-head/src/generic/apt/aptcache.cc new-head/src/generic/apt/aptcache.cc
--- old-head/src/generic/apt/aptcache.cc 2006-11-07 17:46:59.000000000 -0800
+++ new-head/src/generic/apt/aptcache.cc 2006-11-07 17:46:59.000000000 -0800
@@ -226,6 +226,7 @@
       package_states[i].reinstall=false;
       package_states[i].install_reason=manual;
       package_states[i].remove_reason=manual;
+      package_states[i].selection_state = pkgCache::State::Unknown;
     }
 
   if(WithLock && lock==-1)
diff -rN -u old-head/src/generic/apt/apt.cc new-head/src/generic/apt/apt.cc
--- old-head/src/generic/apt/apt.cc 2006-11-07 17:46:59.000000000 -0800
+++ new-head/src/generic/apt/apt.cc 2006-11-07 17:46:59.000000000 -0800
@@ -80,13 +80,13 @@
 
 static void reset_interesting_dep_memoization()
 {
-  delete cached_deps_interesting;
+  delete[] cached_deps_interesting;
   cached_deps_interesting = NULL;
 }
 
 static void reset_surrounding_or_memoization()
 {
-  delete cached_surrounding_or;
+  delete[] cached_surrounding_or;
   cached_surrounding_or = NULL;
 }
 
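
For anyone reading the patch without the aptitude source at hand, here is a minimal standalone sketch of the two things it changes. This is not aptitude code: PackageState, SelectionState, build_cache and reset_cache are made-up names for illustration, and the real structures are more involved. It only shows the general pattern, namely that every field read later gets a value when the table is built, and that memory obtained with new[] is released with delete[].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-package state record. Only the idea
// of a "selection state" field comes from the patch; the types here
// are assumptions made for this sketch.
enum SelectionState { Unknown = 0, Install = 1 };

struct PackageState {
  bool reinstall;
  SelectionState selection_state;
};

// Hypothetical memoization table, allocated with the array form of new,
// like the block valgrind reported as "alloc'd" by operator new[].
static PackageState *cached_states = nullptr;
static std::size_t cached_count = 0;

void build_cache(std::size_t n) {
  cached_states = new PackageState[n];  // array new
  cached_count = n;
  for (std::size_t i = 0; i < n; ++i) {
    cached_states[i].reinstall = false;
    // Give every field a defined value before anything branches on it;
    // leaving this out is what memcheck flags as an uninitialised value.
    cached_states[i].selection_state = Unknown;
  }
}

void reset_cache() {
  // Array new must be paired with array delete. Plain "delete" here is
  // undefined behaviour and is what memcheck reports as a mismatch.
  delete[] cached_states;
  cached_states = nullptr;
  cached_count = 0;
}
```

Calling build_cache(4) followed by reset_cache() under valgrind should produce neither of the two warnings discussed above; swapping delete[] back to delete in reset_cache() should bring the "Mismatched free() / delete / delete []" report back.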