On 11/18/2015 09:48 AM, Peter Stuge wrote:
> Peter Stuge wrote:
>> Robin H. Johnson wrote:
>>> However, the largest sticking point, even with parallel threads, is that
>>> it seems the base ChangeLog generation is incredibly slow. It averages
>>> above 350ms per package right now (at 19k packages in a full cycle, it's
>>> a long time), but some packages can take up to 5 seconds so far.
>>
>> Which code is doing this generation? Sorry - ENOOVERVIEW. :\
>
> Bump. Does anyone know where I can take a look at this code?
>
I don't know, but since no one else is answering, I'll try to find out.
There are a few bugs on b.g.o. (search "changelog") that suggest
`egencache --update-changelog` is being used. The egencache command is
part of portage, so....

  $ git clone http://anongit.gentoo.org/git/proj/portage.git

Looking at bin/egencache, you'll find a bunch of indirection, but
ultimately, the generate_changelog() method of the GenChangeLogs class
is doing the work. The implementation is straightforward. I suspect the
slow part is,

  # now grab all the commits
  revlist_cmd = ['git', self._work_tree, 'rev-list']
  if self._changelog_reversed:
      revlist_cmd.append('--reverse')
  revlist_cmd.extend(['HEAD', '--', '.'])
  commits = self.grab(revlist_cmd).split()

where

  @staticmethod
  def grab(cmd):
      p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
      return _unicode_decode(p.communicate()[0],
          encoding=_encodings['stdio'], errors='strict')

That's taking about half a second if I run it from the command-line.
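
If anyone wants to reproduce that measurement outside of portage, here is
a rough sketch. It is my own code, not anything from egencache: the names
timed_grab() and time_rev_list() are made up for illustration, it decodes
as plain UTF-8 instead of using portage's _unicode_decode(), and it leaves
out the self._work_tree argument, assuming git is simply run from inside
the package directory (which the '.' pathspec suggests).

```python
import subprocess
import time


def timed_grab(cmd, cwd=None):
    """Run cmd and return (elapsed_seconds, decoded_stdout).

    Hypothetical helper: mirrors the shape of grab() quoted above,
    but adds wall-clock timing and uses a plain UTF-8 decode.
    """
    start = time.perf_counter()
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, cwd=cwd)
    out = p.communicate()[0]
    elapsed = time.perf_counter() - start
    return elapsed, out.decode('utf-8', errors='strict')


def time_rev_list(pkg_dir, reverse=False):
    """Time 'git rev-list HEAD -- .' run from pkg_dir.

    Assumption: pkg_dir is a package directory inside a git checkout
    of the tree. Returns (elapsed_seconds, list_of_commit_ids).
    """
    cmd = ['git', 'rev-list']
    if reverse:
        cmd.append('--reverse')
    cmd.extend(['HEAD', '--', '.'])
    elapsed, out = timed_grab(cmd, cwd=pkg_dir)
    return elapsed, out.split()
```

Calling time_rev_list() on a handful of package directories and comparing
the elapsed times against the 350ms average should show whether rev-list
really accounts for most of the per-package cost, or whether the time is
going somewhere else in generate_changelog().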