Hi,

On Fri, Nov 14, 2008 at 08:33:53PM +0100, Jim Meyering wrote:
> Please describe precisely the set-up required to demonstrate
> the problem.
>
> For example, I've just done the following:
> (10k empty files and 5000 empty directories, all at the same level.
> True, this is not "deep" as you said, but what does your "deep" mean?
> a single linear tree, a/a/a/a/.../a to a depth of 1000? or many trees,
> each to a depth of 50 each)
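Just so we're talking about the same kind of layout: I read that as a
completely flat tree, i.e. something like the following (the exact commands
aren't in your mail, so this is only my guess at what you ran):

  mkdir a && cd a
  touch file{1..10000}   # 10k empty files
  mkdir dir{1..5000}     # 5000 empty directories, all on one level
  cd ..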
Well, the real-world data we have is a backup, so by nature the directory
structure can get very deep, e.g.

  backup-foo/home/bla/workspace/devel/debian/nmu/foobar/debian

This is just an example; I guess there are deeper cases in the real data.

So what I did to reproduce the situation without using the real data (the
real data is several GB and copying it takes about half an hour) was to
create a bunch of directories and files, then find every directory and do
the same inside each of them, and let that loop for a while (see the rough
sketch in the P.S. below). I then used the resulting tree to start the
test. That's not perfect science, but it seems to be the closest thing to
the real-world scenario.

> Then ran this, which shows it allocated 25MB total:
>
> $ valgrind cp -al a b
> ...
> ==6374== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1)
> ==6374== malloc/free: in use at exit: 0 bytes in 0 blocks.
> ==6374== malloc/free: 63,242 allocs, 63,242 frees, 24,784,149 bytes allocated.
> ==6374== For counts of detected errors, rerun with: -v
> ==6374== All heap blocks were freed -- no leaks are possible.
>
> FYI, cp has to keep track of a lot of dev/inode/name info.

Yeah, that's okay, but I find it hard to believe that it really needs to
use almost the whole system memory, leading to OOM situations. Wouldn't it
be possible to be more clever about the memory usage?

We have now worked around the problem by not using cp to copy the previous
backups (we just found an option for that in the rsnapshot configuration),
and that solves it for us. So apparently rsync is able to copy the whole
real-world data set without needing about 900 MB (!) of RAM, while cp does.

> However, if you don't need to preserve hard-link relationships,
> use these options in place of "-al":

Well, the real-world data makes heavy use of hard links; it's an rsnapshot
backup. In the test scenario I copied with cp -a instead of cp -al. Same
result.

Best Regards,
Patrick
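P.S.: Here is a rough sketch of the tree-building loop described above. The
real script isn't included in this mail, so the fan-out and the number of
rounds below are only placeholder values, not the ones from the actual test:

  # grow a tree that is both wide and a few levels deep
  mkdir -p tree && cd tree
  fanout=10
  for round in 1 2 3 4; do
      # snapshot the directory list first so directories created in this
      # round don't feed back into the same round
      dirs=$(find . -type d)
      for d in $dirs; do
          for i in $(seq 1 "$fanout"); do
              touch "$d/file$i"
              mkdir -p "$d/dir$i"
          done
      done
  done
  cd ..

  # then watch cp's peak memory; with GNU time installed, the -v output
  # includes a "Maximum resident set size" line
  /usr/bin/time -v cp -al tree tree-copy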