On Fri, Jun 06, 2014 at 11:25:52AM +0200, Johannes Schauer wrote: > Hi > > Quoting Michael Vogt (2014-06-06 11:15:29) > > There is a small typo in the above script. gc.collect should be > > gc.collect(). > > right. I noticed this too but it was too late and I already sent in my > bugreport. See my other submission to it for my updated results. > > > I verified that the following works and does not leak fds: > > """ > > class LeakTestCase(unittest.TestCase): > > def test_leak(self): > > # clenaup gc first > > import gc > > gc.collect() > > # see what fds we have > > fds = os.listdir("/proc/self/fd") > > testfile = __file__ > > tagf = apt_pkg.TagFile(testfile) > > tagf.step() > > del tagf > > import gc > > gc.collect() > > # ensure fd is closed > > self.assertEqual(fds, os.listdir("/proc/self/fd")) > > """ > > > > Unfortunately just doing a "del tagf" is not enough, the gc call is > > needed afterwards. The reason that the del is not enough is that there > > is there is a cyclic reference from the tagf to tagf.section. The > > garbage collector breaks it, but a simple del sees a refcount > > > 0. This particular case could maybe fixed by copying the data from the > > pkgTagFile to a pkgTagSection instead of letting it operator on the > > Buffer of pkgTagFile. But that requires somework (plus additional > > memory for the copied data). > > The problem is, as you also identified above, that as long as the Python > object > for apt_pkg.TagFile is around, the file stays open. > > I switched from apt_pkg to debian.deb822 because in my use case I want to read > from file A, modify the data and want to write back to A again. For that I > obviously should not have the original fd from reading A open when I write > back > to A. One possible workaround would be to copy all of A into a StringIO and > then pass that to apt_pkg.TagFile. This would probably work but I think the > expectation is that after doing: > > mypkgs = list(apt_pkg.TagFile("Packages")) > > or: > > mypkgs = [] > for pkg in apt_pkg.TagFile("Packages"): > mypkgs.append(pkgs) > > that there are no files left open. Currently in both cases, the fd is still > around. So after the last pkgTagSection is retrieved, the file should be > closed. >
The easiest way to make this work would be to make the reference from TagFile to TagSection weak, but this creates a small performance issue, because we'd need to re-create the TagSection for each section, unless it is referenced somewhere outside. So I'm not sure that's a good idea (same for mvo's idea). What we really should do is add (1) a close() method, and (2) context manager support to TagFile. This way you can just explicitly close the file when you're done with it. As a workaround, open the file yourself and pass the open file to the tag file. The best way would be: with open("Packages") as fobj: for pkg in apt_pkg.TagFile(fobj): mypkgs.append(pkgs) You need to keep the file object open as long as the tag file exists, so just TagFile(open("Packages")) would fail. -- Julian Andres Klode - Debian Developer, Ubuntu Member See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/. Please do not top-post if possible. -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org