I've been thinking about this for years, ever since the unpleasantness of that first git commit that I mistakenly attributed to myself.
With this new --amend=FILE option, you'll be able to maintain a list of
<SHA,CODE+> pairs where SHA is a 40-byte SHA1 (alone on a line) referring
to a commit in the current project, and CODE refers to one or more
consecutive lines of Perl code. Pairs must be separated by one or more
blank lines.

Using that file, invoke gitlog-to-changelog with this new option,
and it'll do the rest.

Here's the one I expect to use for coreutils:

---------------------------------------------------------
# This file is expected to be used via gitlog-to-changelog's --amend=FILE
# option. It specifies what changes to make to each given SHA1's commit
# log and metadata, using Perl-eval'able expressions.

3a169f4c5d9159283548178668d2fae6fced3030
# fix title:
s/all tile types/all file types/

e181802521d4e19e367dbe8cfa877296bb5dafb2
# fix the title!
s,seq:,factor:,

3ece0355d52e41a1b079c0c46477a32250278c11
# correct the URL
s,<[^>]+>,<http://bugs.debian.org/412688>,

ed5c4e770a27862813c0182be8680abeb005d15b
# Wrong bug ID:
s,/363011,/350541,
# in this:
# Suggested by Josselin Mouette in <http://bugs.debian.org/363011>

1379ed974f1fa39b12e2ffab18b3f7a607082202
# Due to a bug in vc-dwim, I mis-attributed a patch by Paul to myself.
# Change the Author: to be Paul. Note the escaped "@":
s,Jim .*>,Paul Eggert <eggert\@cs.ucla.edu>,

209850fd7e1e89cf8937310878bd22d70e3588a5
s/isspace/isblank/
# in this:
# * tests/misc/uniq: New file. Test for the above, but only
# when isspace(0240).

760bc6f7e73014e934a744a9d46ea8dbf5ba25c8
s/Now, each/Now, the/;
s!(elicits.*)\.!first $1, and the second works properly.!
# change the log from this:
# Without this, `truncate -s '> -1' F` would truncate F to length 0,
# and `truncate -s " +1" F` would truncate F to 1 byte. Now, each
# elicits a diagnostic.
# to this:
# Without this, `truncate -s '> -1' F` would truncate F to length 0,
# and `truncate -s " +1" F` would truncate F to 1 byte. Now, the
# first elicits a diagnostic, and the second works properly.
---------------------------------------------------------

To compare before/after, I ran this:

  diff -u <(build-aux/gitlog-to-changelog) \
          <(build-aux/gitlog-to-changelog --amend=F)

Here are the induced diffs to the generated ChangeLog:

--- /proc/self/fd/11 2011-11-01 18:11:36.348907196 +0100
+++ /proc/self/fd/12 2011-11-01 18:11:36.349907223 +0100
@@ -9162,7 +9162,7 @@
 
 2009-09-21 Pádraig Brady <p...@draigbrady.com>
 
-	ls: handle disabling of colors consistently for all tile types
+	ls: handle disabling of colors consistently for all file types
 	* src/ls.c (print_color_indicator): Use consistent syntax
 	for all file and directory subtypes, and fall back to the color
 	of the base type if there is no enabled color for the subtype.
@@ -12887,7 +12887,7 @@
 
 2008-12-01 Jim Meyering <meyer...@redhat.com>
 
-	seq: plug a leak
+	factor: plug a leak
 	* src/factor.c (emit_ul_factor): Call mpz_clear.
 
 	avoid warnings about initialization of automatic aggregates
@@ -13077,7 +13077,7 @@
 	cp was doing the same, even without --link, for every directory in
 	the source hierarchy, while it can do its job with entries merely for
 	the command-line arguments. Prompted by a report from Patrick Shoenfeld.
-	Details <http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/15081>.
+	Details <http://bugs.debian.org/412688>.
 	* src/copy.c (copy_internal): Refrain from remembering
 	name,dev,inode for most files, when invoked via cp --link.
 	Record an infloop-avoidance triple for each directory specified
@@ -14531,8 +14531,8 @@
 
 	truncate: ignore whitespace in --size parameters
 	Without this, `truncate -s '> -1' F` would truncate F to length 0,
-	and `truncate -s " +1" F` would truncate F to 1 byte. Now, each
-	elicits a diagnostic.
+	and `truncate -s " +1" F` would truncate F to 1 byte. Now, the
+	first elicits a diagnostic, and the second works properly.
 	* src/truncate.c: Skip leading white space in the --size option
 	argument and any white space after one of the relative modifiers,
 	so that the presence of a +/- modifier can be detected reliably.
@@ -15916,7 +15916,7 @@
 	avoid problems with sign-extended "char" operand to is* functions
 	* src/cut.c (set_fields): Apply to_uchar to isblank operands.
 	* src/uniq.c (find_field): Likewise.
-	* src/seq.c (scan_arg): Likewise, for isspace.
+	* src/seq.c (scan_arg): Likewise, for isblank.
 	* tests/misc/uniq: New file. Test for the above, but only
 	when isspace(0240).
 	* tests/Makefile.am (TESTS): Add misc/uniq.
@@ -17551,7 +17551,7 @@
 	(errno_may_be_empty, ignorable_failure): New functions.
 	* src/remove.c (is_empty_dir): Move function to ...
 	* src/system.h (is_empty_dir): ...here, and make it inline.
-	Suggested by Josselin Mouette in <http://bugs.debian.org/363011>
+	Suggested by Josselin Mouette in <http://bugs.debian.org/350541>
 	via Bob Proulx.
 	* NEWS: Mention the improvement.
 
@@ -18230,7 +18230,7 @@
 	to work when p/1 has a lot of indirect symlinks.
 	(I'm surprised that it works on Linux. Perhaps a Linux bug?)
 
-2007-11-16 Jim Meyering <meyer...@redhat.com>
+2007-11-16 Paul Eggert <egg...@cs.ucla.edu>
 
 	Port to Solaris 'make' and use a Posixish shell on Solaris.
 	* bootstrap.conf (gnulib_modules): Add gnu-make, posix-shell.

Here's a proposed patch.
Though note that I will move some description and
the example from the ChangeLog into the code:

======================================================================
From c190d4ffca4643e40cc22a953ef55f2944bebdd8 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyer...@redhat.com>
Date: Tue, 1 Nov 2011 18:04:21 +0100
Subject: [PATCH] gitlog-to-changelog: provide a ChangeLog-repair mechanism

Git logs are often treated as immutable, because editing them
changes the SHA1 checksums of all descendants. Thus, errors in
git logs tend to stay there forever.
However, when we generate a ChangeLog file -- typically for
distribution -- from that git log, we can actually make corrections
in the generated file. The key lies in recording in
machine-readable/applicable form the desired corrections.
For example, here's one from coreutils:

  3a169f4c5d9159283548178668d2fae6fced3030
  # fix title:
  s/all tile types/all file types/

It specifies a commit SHA1 value (the first line), optional comments,
and a simple sed-style expression to perform the correction. Note
that if you want to perform two substitutions, you must use a
semicolon to separate them, even if you put them on separate lines.

* build-aux/gitlog-to-changelog (parse_amend_file): New function.
(usage): Describe it; alphabetize option descriptions.
(main): Honor the new option, carefully.
---
 ChangeLog                     |   19 +++++++
 build-aux/gitlog-to-changelog |  109 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 123 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1855e40..6575dd5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,24 @@
 2011-11-01 Jim Meyering <meyer...@redhat.com>
 
+	gitlog-to-changelog: provide a ChangeLog-repair mechanism
+	Git logs are often treated as immutable, because editing them
+	changes the SHA1 checksums of all descendants. Thus, errors in
+	git logs tend to stay there forever. However, when we generate
+	a ChangeLog file -- typically for distribution -- from that git log,
+	we can actually make corrections in the generated file. The key
+	lies in recording in machine-readable/applicable form the desired
+	corrections. For example, here's one from coreutils:
+	3a169f4c5d9159283548178668d2fae6fced3030
+	# fix title:
+	s/all tile types/all file types/
+	It specifies a commit SHA1 value (the first line), optional comments,
+	and a simple sed-style expression to perform the correction. Note
+	that if you want to perform two substitutions, you must use a
+	semicolon to separate them, even if you put them on separate lines.
+	* build-aux/gitlog-to-changelog (parse_amend_file): New function.
+	(usage): Describe it; alphabetize option descriptions.
+	(main): Honor the new option, carefully.
+
 	gitlog-to-changelog: avoid an infloop
 	* build-aux/gitlog-to-changelog: Don't infloop for a commit log
 	that ends up being empty.
diff --git a/build-aux/gitlog-to-changelog b/build-aux/gitlog-to-changelog
index 4612d38..139f0af 100755
--- a/build-aux/gitlog-to-changelog
+++ b/build-aux/gitlog-to-changelog
@@ -3,7 +3,7 @@ eval '(exit $?0)' && eval 'exec perl -wS "$0" ${1+"$@"}'
     if 0;
 # Convert git log output to ChangeLog format.
 
-my $VERSION = '2011-10-31 16:06'; # UTC
+my $VERSION = '2011-11-01 17:03'; # UTC
 # The definition above must lie within the first 8 lines in order
 # for the Emacs time-stamp write hook (at end) to update it.
 # If you change this file with Emacs, please let the write hook
@@ -60,13 +60,15 @@ $ME, they may be preceded by '--'.
 
 OPTIONS:
 
+   --amend=FILE FILE maps from an SHA1 to perl code (i.e., s/old/new/) that
+                  makes a change to SHA1's commit log text or metadata.
+   --append-dot append a dot to the first line of each commit message if
+                  there is no other punctuation or blank at the end.
    --since=DATE convert only the logs since DATE;
                   the default is to convert all log entries.
    --format=FMT set format string for commit subject and body;
                   see 'man git-log' for the list of format metacharacters;
                   the default is '%s%n%b%n'
-   --append-dot append a dot to the first line of each commit message if
-                  there is no other punctuation or blank at the end.
 
    --help display this help and exit
    --version output version information and exit
@@ -101,9 +103,60 @@ sub quoted_cmd(@)
   return join (' ', map {shell_quote $_} @_);
 }
 
+# Parse file F.
+# Comment lines (starting with "#") are ignored.
+# F must consist of <SHA,CODE+> pairs where SHA is a 40-byte SHA1
+# (alone on a line) referring to a commit in the current project, and
+# CODE refers to one or more consecutive lines of Perl code.
+# Pairs must be separated by one or more blank line.
+sub parse_amend_file($)
+{
+  my ($f) = @_;
+
+  open F, '<', $f
+    or die "$ME: $f: failed to open for reading: $!\n";
+
+  my $fail;
+  my $h = {};
+  my $in_code = 0;
+  my $sha;
+  while (defined (my $line = <F>))
+    {
+      $line =~ /^\#/
+        and next;
+      chomp $line;
+      $line eq ''
+        and $in_code = 0, next;
+
+      if (!$in_code)
+        {
+          $line =~ /^([0-9a-fA-F]{40})$/
+            or (warn "$ME: $f:$.: invalid line; expected an SHA1\n"),
+              $fail = 1, next;
+          $sha = lc $1;
+          $in_code = 1;
+          exists $h->{$sha}
+            and (warn "$ME: $f:$.: duplicate SHA1\n"),
+              $fail = 1, next;
+        }
+      else
+        {
+          $h->{$sha} ||= '';
+          $h->{$sha} .= "$line\n";
+        }
+    }
+  close F;
+
+  $fail
+    and exit 1;
+
+  return $h;
+}
+
 {
   my $since_date;
   my $format_string = '%s%n%b%n';
+  my $amend_file;
   my $append_dot = 0;
   GetOptions
     (
@@ -111,14 +164,20 @@ sub quoted_cmd(@)
      version => sub { print "$ME version $VERSION\n"; exit },
      'since=s' => \$since_date,
      'format=s' => \$format_string,
+     'amend=s' => \$amend_file,
      'append-dot' => \$append_dot,
     ) or usage 1;
 
+
   defined $since_date
     and unshift @ARGV, "--since=$since_date";
 
+  # This is a hash that maps an SHA1 to perl code (i.e., s/old/new/)
+  # that makes a correction in the log or attribution of that commit.
+  my $amend_code = defined $amend_file ? parse_amend_file $amend_file : {};
+
   my @cmd = (qw (git log --log-size),
-             '--pretty=format:%ct %an <%ae>%n%n'.$format_string, @ARGV);
+             '--pretty=format:%H:%ct %an <%ae>%n%n'.$format_string, @ARGV);
   open PIPE, '-|', @cmd
     or die ("$ME: failed to run `". quoted_cmd (@cmd) ."': $!\n"
             . "(Is your Git too old? Version 1.5.1 or later is required.)\n");
@@ -137,7 +196,34 @@ sub quoted_cmd(@)
       $n_read == $log_nbytes
         or die "$ME:$.: unexpected EOF\n";
 
-      my @line = split "\n", $log;
+      # Extract leading hash.
+      my ($sha, $rest) = split ':', $log, 2;
+      defined $sha
+        or die "$ME:$.: malformed log entry\n";
+      $sha =~ /^[0-9a-fA-F]{40}$/
+        or die "$ME:$.: invalid SHA1: $sha\n";
+
+      # If this commit's log requires any transformation, do it now.
+      my $code = $amend_code->{$sha};
+      if (defined $code)
+        {
+          eval 'use Safe';
+          my $s = new Safe;
+          # Put the unpreprocessed entry into "$_".
+          $_ = $rest;
+
+          # Let $code operate on it, safely.
+          my $r = $s->reval("$code")
+            or die "$ME:$.:$sha: failed to eval \"$code\":\n$@\n";
+
+          # Note that we've used this entry.
+          delete $amend_code->{$sha};
+
+          # Update $rest upon success.
+          $rest = $_;
+        }
+
+      my @line = split "\n", $rest;
       my $author_line = shift @line;
       defined $author_line
         or die "$ME:$.: unexpected EOF\n";
@@ -200,6 +286,19 @@ sub quoted_cmd(@)
   close PIPE
     or die "$ME: error closing pipe from " . quoted_cmd (@cmd) . "\n";
   # FIXME-someday: include $PROCESS_STATUS in the diagnostic
+
+  # Barf if we fail to use an entry in the --amend=F specified file.
+  my $fail = 0;
+  if ($amend_code)
+    {
+      foreach my $sha (keys %$amend_code)
+        {
+          warn "$ME:$amend_file: unused entry: $sha\n";
+          $fail = 1;
+        }
+    }
+
+  exit $fail;
 }
 
 # Local Variables:
--
1.7.7.1.476.g9890
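
For anyone who wants to experiment with the amend-file format without
running the whole script, here is a minimal standalone sketch. It is not
part of the patch. The names parse_amend_text and amend_entry are
hypothetical, the input comes from a string rather than a file, and it
uses a plain string eval instead of Safe's reval, which is an assumption
made for brevity and is only reasonable for an amend file you wrote
yourself. It shows why two substitutions need a semicolon between them:
all CODE lines of a pair are joined and evaluated as one piece of Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse amend-file text into a hash ref mapping SHA1 => Perl code.
# (Hypothetical helper; it follows the rules described for --amend=FILE:
# "#" lines are comments, a blank line ends the current pair.)
sub parse_amend_text
{
  my ($text) = @_;
  my %code_for;
  my $sha;                      # undef while we are between pairs
  foreach my $line (split /\n/, $text)
    {
      next if $line =~ /^\#/;   # comment lines are ignored
      if ($line eq '')          # a blank line ends the current pair
        {
          undef $sha;
          next;
        }
      if (!defined $sha)
        {
          $line =~ /^([0-9a-fA-F]{40})$/
            or die "invalid line; expected an SHA1: $line\n";
          $sha = lc $1;
          exists $code_for{$sha}
            and die "duplicate SHA1: $sha\n";
          $code_for{$sha} = '';
        }
      else
        {
          $code_for{$sha} .= "$line\n";
        }
    }
  return \%code_for;
}

# Apply CODE to ENTRY by placing ENTRY in $_ and evaluating CODE.
# Assumption: plain string eval, for brevity; the patch uses Safe->reval.
sub amend_entry
{
  my ($code, $entry) = @_;
  local $_ = $entry;
  eval $code;
  die "failed to eval \"$code\":\n$@\n" if $@;
  return $_;
}

my $amend_text = <<'EOF';
# sample amend file with one pair
760bc6f7e73014e934a744a9d46ea8dbf5ba25c8
s/Now, each/Now, the/;
s!(elicits.*)\.!first $1, and the second works properly.!
EOF

my $code_for = parse_amend_text ($amend_text);
my $entry = "would truncate F to 1 byte. Now, each\nelicits a diagnostic.\n";
my $fixed
  = amend_entry ($code_for->{'760bc6f7e73014e934a744a9d46ea8dbf5ba25c8'},
                 $entry);
print $fixed;
```

Dropping the semicolon after the first substitution makes the eval fail
with a syntax error, since the two lines are then parsed as a single
malformed statement.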