I'm working on a script to track contributors so that (A) we can track
project health for ASF board report purposes and (B) we can possibly
share a nice "Thank you" listing contributors in release
announcements.  Other purposes might crop up.  GitHub's contributors
report has serious shortcomings[1] so I'm not using that.

So far I have something like this:
  git log main --since="3 months ago" --pretty="Author: %an <%ae>%n%B" \
    | awk -F': ' '/^(Author|Co-authored-by): / {print $2}' \
    | sort | uniq -c

But this needs deduplication because most people show up under
multiple name/email combinations.  Given the complexity of
deduplication, I'd convert this to Python, put it in dev-tools/scripts,
and create a "contributors.txt" file somewhere that contains each
contributor's full name, primary email, and email aliases.
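
To make the idea concrete, here is a rough sketch of what the Python
version might look like.  Everything in it is an assumption for
illustration, not something that exists yet: the function names are
made up, and the contributors.txt layout is guessed as one person per
line in the form "Full Name <primary@email> alias1 alias2 ...".  Unlike
the awk one-liner, it matches the "Co-authored-by" trailer
case-insensitively.

```python
import re
import subprocess
from collections import Counter

# Same lines the awk filter picks out; trailer case varies in practice.
TRAILER_RE = re.compile(r"^(?:Author|Co-authored-by): (.+)$",
                        re.MULTILINE | re.IGNORECASE)
# Splits "Full Name <email>" into its two parts.
IDENT_RE = re.compile(r"^\s*(?P<name>.*?)\s*<(?P<email>[^<>]+)>\s*$")


def read_git_log(branch="main", since="3 months ago"):
    """Run the same git log command as the shell pipeline."""
    cmd = ["git", "log", branch, f"--since={since}",
           "--pretty=Author: %an <%ae>%n%B"]
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout


def extract_identities(log_text):
    """Return every 'Name <email>' from Author/Co-authored-by lines."""
    return [value.strip() for value in TRAILER_RE.findall(log_text)]


def parse_contributors(lines):
    """Map each known email (lowercased) to 'Full Name <primary email>'.

    Assumed line format: Full Name <primary@email> alias1 alias2 ...
    Blank lines and lines starting with '#' are skipped.
    """
    canonical = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        head, _, aliases = line.partition(">")
        m = IDENT_RE.match(head + ">")
        if not m:
            continue
        label = f"{m['name']} <{m['email']}>"
        for email in [m["email"], *aliases.split()]:
            canonical[email.lower()] = label
    return canonical


def count_contributors(identities, canonical):
    """Count commits per person, folding aliases into the primary entry."""
    counts = Counter()
    for ident in identities:
        m = IDENT_RE.match(ident)
        if not m:
            continue
        fallback = f"{m['name']} <{m['email']}>"
        counts[canonical.get(m["email"].lower(), fallback)] += 1
    return counts
```

Usage would be roughly count_contributors(extract_identities(
read_git_log()), parse_contributors(open("contributors.txt"))).
Anyone not listed in contributors.txt is counted under whatever
name/email Git recorded, so unknown aliases show up as separate
entries that can then be added to the file.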

I'm sure it's debatable whether to go this route vs. CHANGES.txt, but
the latter is harder to parse and ... I dunno; I don't like that it's
so custom compared to a generic Git metadata approach.  On the other
hand, with CHANGES.txt the dedupe might not be necessary (just fix
CHANGES.txt for dups), and it wouldn't include trivial edits (for
better/worse).  CHANGES.txt would also be more accurate for
version-specific contribution attribution (since CHANGES.txt is
organized this way), but it's harder to use between arbitrary
commits/dates.

[1] 
https://docs.github.com/en/repositories/viewing-activity-and-data-for-your-repository/viewing-a-projects-contributors#troubleshooting-contributors

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
