David Malcolm <dmalc...@redhat.com>:
> > Still, if anyone else is brave enough to write a script that will munch
> > through gcc-patches producing committer/date/subject-line triples, I'll
> > give it a try.
>
> I don't think committer/date/subject-line triples are adequate: the
> dates are unlikely to match up, for one thing.

Agreed.  They're unlikely to match up exactly.

> I think such a solution would need to somehow locate and match patches
> themselves.
>
> I was feeling brave, so I had a go at writing a scraper; see:
>   https://github.com/davidmalcolm/patch-finder
> for what I have so far (tested with Python 2.7).
>
> This can scrape the gcc-patches archives and locate mails containing
> patches, extracting the patches (some of them anyway...).  The idea
> would be to stuff the patches into some kind of big data store, and
> somehow then try to locate them (perhaps within a rough date "window").
>
> Does this seem like a viable approach?

I think it's as good as we're likely to get given the data available.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
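
The matching step discussed above (compare the patches themselves, restricted to a rough date window) might be sketched as follows. This is only an illustration and is not taken from patch-finder: the function names, the dict layout of a mailed patch ("msgid", "date", "diff"), and the 30-day default window are all assumptions made up for the example.

```python
import hashlib
from datetime import date, timedelta


def patch_fingerprint(diff_text):
    """Hash only the added/removed lines of a unified diff.

    File headers (---/+++) and hunk headers (@@) are skipped and
    surrounding whitespace is stripped, so the same change mailed
    against a slightly different tree still hashes identically.
    Caveat: a changed line whose own text starts with "--" or "++"
    looks like a file header and is skipped too.
    """
    changed = []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            changed.append(line[0] + line[1:].strip())
    return hashlib.sha1("\n".join(changed).encode("utf-8")).hexdigest()


def find_candidates(mail_patches, commit_date, commit_diff, window_days=30):
    """Return mailed patches that match a commit.

    mail_patches is a hypothetical list of dicts with "msgid",
    "date" (a datetime.date) and "diff" keys.  A patch matches when
    its mail date is within window_days of the commit date and its
    fingerprint equals that of the commit's diff.
    """
    target = patch_fingerprint(commit_diff)
    window = timedelta(days=window_days)
    return [
        p for p in mail_patches
        if abs(p["date"] - commit_date) <= window
        and patch_fingerprint(p["diff"]) == target
    ]
```

Exact fingerprint equality is the simplest possible comparison; patches that were edited between posting and commit would need a fuzzier similarity measure.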