On 11/18/2015 09:48 AM, Peter Stuge wrote:
> Peter Stuge wrote:
>> Robin H. Johnson wrote:
>>> However, the largest sticking point, even with parallel threads, is that
>>> it seems the base ChangeLog generation is incredibly slow. It averages
>>> above 350ms per package right now (at 19k packages in a full cycle, it's
>>> a long time), but some packages can take up to 5 seconds so far.
>>
>> Which code is doing this generation? Sorry - ENOOVERVIEW. :\
> 
> Bump. Does anyone know where I can take a look at this code?
> 

I don't know, but since no one else is answering, I'll try to find out.
There are a few bugs on b.g.o. (search "changelog") that suggest
`egencache --update-changelogs` is being used. The egencache command is
part of portage, so....

  $ git clone http://anongit.gentoo.org/git/proj/portage.git

Looking at bin/egencache, you'll find a bunch of indirection, but
ultimately, the generate_changelog() method of the GenChangeLogs class
is doing the work. The implementation is straightforward. I suspect the
slow part is:

  # now grab all the commits
  revlist_cmd = ['git', self._work_tree, 'rev-list']
  if self._changelog_reversed:
      revlist_cmd.append('--reverse')
  revlist_cmd.extend(['HEAD', '--', '.'])
  commits = self.grab(revlist_cmd).split()

where

  @staticmethod
  def grab(cmd):
      p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
      return _unicode_decode(p.communicate()[0],
                             encoding=_encodings['stdio'],
                             errors='strict')

That rev-list call takes about half a second if I run it from the
command line.
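
If anyone wants to reproduce the timing outside of egencache, here is a
rough sketch. It is not the portage code: build_revlist_cmd() and
timed_grab() are names I made up, I decode as UTF-8 instead of using
portage's _unicode_decode(), and I pass the directory via cwd rather than
the self._work_tree option, so the numbers are only indicative.

```python
import subprocess
import time


def build_revlist_cmd(reverse=False):
    """Build the same rev-list argument list as the snippet above,
    minus the work-tree option (assumption: we rely on cwd instead)."""
    cmd = ['git', 'rev-list']
    if reverse:
        cmd.append('--reverse')
    cmd.extend(['HEAD', '--', '.'])
    return cmd


def timed_grab(cmd, cwd=None):
    """Run cmd and return (elapsed_seconds, whitespace-split stdout tokens)."""
    start = time.perf_counter()
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, cwd=cwd)
    out = p.communicate()[0]
    elapsed = time.perf_counter() - start
    return elapsed, out.decode('utf-8', errors='strict').split()


# Example use, from inside a git checkout (the path is hypothetical):
#   elapsed, commits = timed_grab(build_revlist_cmd(), cwd='some-cat/some-pkg')
#   print('%d commits in %.3fs' % (len(commits), elapsed))
```

Running that over a handful of package directories should show whether
rev-list alone accounts for most of the per-package time.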

