(Moved to gcc-patches, missed this when I replied to the initial email)

Regarding patch at: https://gcc.gnu.org/pipermail/gcc/2021-June/236357.html

On 10.06.21 14:45, Jonathan Wakely wrote:

As well as the "contrig" typo that Florian noticed, the subject says
"in in" which should be "is in". And it should be CC'd to gcc-patches.

I like this more than my attempt, however ...
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -59,8 +59,9 @@ def parse_git_revisions(repo_path, revisions, ref_name=None):

             date = datetime.utcfromtimestamp(c.committed_date)
             author = '%s  <%s>' % (c.author.name, c.author.email)
-            git_info = GitInfo(c.hexsha, date, author,
-                               c.message.split('\n'), modified_files)
+            message = c.message.split('\n')
+            git_info = GitInfo(c.hexsha, date, author, message[0],
+                               message[1:], modified_files)
Doesn't using message[1:] here mean that other checks which currently
look at all of self.info.lines will no longer check the subject line?
...
For example, we have:
         # Skip Update copyright years commits
         if self.info.lines and self.info.lines[0] == 'Update copyright years.':
             return
This will never match now, because you've extracted that into the
'subject' instead.

Well, it never matched before for git_email.py, it only did match for
git_repository.py. I think the difference between your work and mine was
that I started with git_email.py – and you started with git_repository.py.

I now pass again the whole message to git_commit.py – also for emails. I
think that's more consistent when checking for an empty line as second line.

And for the copyright case, I added a testcase :-)

Aside: We should also have a check that the second line is blank, i.e.
the commit message is a single line subject, followed by blank,
followed by the body. And if we enforced that, then message[2:] would
be better than message[1:].
Added as check – but I pass now all (also subject + '\n' + body) for the
email, which I think it easier to grasp. But as three is always an empty
line for emails (to separate header and body), I cannot add a testcase
for it, unfortunately.

Updated patch enclosed.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
contrib/gcc-changelog: Check that PR in subject is in changelog

This patch checks that a '[PRnnnn]' and '(PRnnnn)' also appears as PR in the
changelog part of the commit message.  And it does likewise for 'PR comp/nnnn'
except that then also the component name is checked.  (Note that the reverse
is permitted, i.e. PR(s) only appearing in the changelog.)
To avoid false positives, PR numbers in the subject line are ignored,
if 'revert' appears.
Additionally, reject commits with a nonempty second line.

contrib/ChangeLog:

	* gcc-changelog/git_commit.py (pr_regex): Add ?P<pr> for group('pr').
	(subject_pr_skip_regex, subject_pr_regex, subject_pr2_regex): New.
	(GitInfo.__init__, GitCommit.parse_changelog): Check subject PRs.
	* gcc-changelog/git_email.py (SUBJECT_PREFIX, subject_patch_regex): New.
	(GitEmail.__init__): Parse 'Subject:' and pass it to GitInfo.
	* gcc-changelog/test_email.py (test_pr_only_in_subject,
	test_wrong_pr_comp_in_subject, test_copyright_years): New.
	* gcc-changelog/test_patches.txt (0030-PR-c-92746, pr-check1.patch):
	Update to avoid triggering the new check.
	(0001-rs6000-Support-doubleword, pr-wrong-comp.patch,
	copyright-years.patch): New.

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index bd8c1ff7af2..6752c1111a8 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -156,7 +156,11 @@ author_line_regex = \
         re.compile(r'^(?P<datetime>\d{4}-\d{2}-\d{2})\ {2}(?P<name>.*  <.*>)')
 additional_author_regex = re.compile(r'^\t(?P<spaces>\ *)?(?P<name>.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
-pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
+subject_pr_skip_regex = re.compile(r'[Rr]evert')
+subject_pr_regex = \
+        re.compile(r'(^|\W)PR\s+(?P<component>[a-zA-Z+-]+)/(?P<pr>\d{4,7})')
+subject_pr2_regex = re.compile(r'[(\[]PR\s*(?P<pr>\d{4,7})[)\]]')
+pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?(?P<pr>[0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
 end_of_location_regex = re.compile(r'[\[<(:]')
@@ -298,6 +302,7 @@ class GitCommit:
         self.top_level_authors = []
         self.co_authors = []
         self.top_level_prs = []
+        self.subject_prs = set()
         self.cherry_pick_commit = None
         self.revert_commit = None
         self.commit_to_info_hook = commit_to_info_hook
@@ -307,6 +312,22 @@ class GitCommit:
         if self.info.lines and self.info.lines[0] == 'Update copyright years.':
             return
 
+        if len(self.info.lines) > 1 and self.info.lines[1] != '':
+            self.errors.append(Error('Expected empty second line in commit message',
+                                     info.lines[0]))
+
+        # Extract PR numbers form the subject line
+        # Match either [PRnnnn] / (PRnnnn) or PR component/nnnn
+        if self.info.lines and not subject_pr_skip_regex.search(info.lines[0]):
+            self.subject_prs = \
+                 set([m.group('pr')
+                      for m in subject_pr2_regex.finditer(info.lines[0])])
+            for m in subject_pr_regex.finditer(info.lines[0]):
+                if not m.group('component') in bug_components:
+                    self.errors.append(Error('invalid PR component in subject',
+                                             info.lines[0]))
+                self.subject_prs.add(m.group('pr'))
+
         # Identify first if the commit is a Revert commit
         for line in self.info.lines:
             m = revert_regex.match(line)
@@ -346,6 +367,10 @@ class GitCommit:
             if not self.errors:
                 self.check_mentioned_files()
                 self.check_for_correct_changelog()
+        if self.subject_prs:
+            self.errors.append(Error("PR %s in subject but not in changelog" %
+                                     ", ".join(self.subject_prs),
+                                     self.info.lines[0]))
 
     @property
     def success(self):
@@ -460,7 +485,8 @@ class GitCommit:
                     else:
                         author_tuple = (m.group('name'), None)
                 elif pr_regex.match(line):
-                    component = pr_regex.match(line).group('component')
+                    m = pr_regex.match(line)
+                    component = m.group('component')
                     if not component:
                         self.errors.append(Error('missing PR component', line))
                         continue
@@ -469,6 +495,8 @@ class GitCommit:
                         continue
                     else:
                         pr_line = line.lstrip()
+                    if m.group('pr') in self.subject_prs:
+                        self.subject_prs.remove(m.group('pr'))
                 elif dr_regex.match(line):
                     pr_line = line.lstrip()
 
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index fa62e3ad2f7..5fa8947e630 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -17,6 +17,7 @@
 # <http://www.gnu.org/licenses/>.  */
 
 import os
+import re
 import sys
 from itertools import takewhile
 
@@ -28,6 +29,8 @@ from unidiff import PatchSet, PatchedFile
 
 DATE_PREFIX = 'Date: '
 FROM_PREFIX = 'From: '
+SUBJECT_PREFIX = 'Subject: '
+subject_patch_regex = re.compile(r'^\[PATCH( \d+/\d+)?\] ')
 unidiff_supports_renaming = hasattr(PatchedFile(), 'is_rename')
 
 
@@ -37,7 +40,9 @@ class GitEmail(GitCommit):
         diff = PatchSet.from_filename(filename)
         date = None
         author = None
+        subject = ""
 
+        subject_last = False
         with open(self.filename, 'r') as f:
             lines = f.read().splitlines()
         lines = list(takewhile(lambda line: line != '---', lines))
@@ -46,8 +51,21 @@ class GitEmail(GitCommit):
                 date = parse(line[len(DATE_PREFIX):])
             elif line.startswith(FROM_PREFIX):
                 author = GitCommit.format_git_author(line[len(FROM_PREFIX):])
+            elif line.startswith(SUBJECT_PREFIX):
+                subject = line[len(SUBJECT_PREFIX):]
+                subject_last = True
+            elif subject_last and line.startswith(' '):
+                subject += line
+            elif line == '':
+                break
+            else:
+                subject_last = False
+
+        if subject:
+            subject = subject_patch_regex.sub('', subject)
         header = list(takewhile(lambda line: line != '', lines))
-        body = lines[len(header) + 1:]
+        # Note: commit message consists of email subject, empty line, email body
+        message = [subject] + lines[len(header):]
 
         modified_files = []
         for f in diff:
@@ -67,7 +85,7 @@ class GitEmail(GitCommit):
             else:
                 t = 'M'
             modified_files.append((target if t != 'D' else source, t))
-        git_info = GitInfo(None, date, author, body, modified_files)
+        git_info = GitInfo(None, date, author, message, modified_files)
         super().__init__(git_info,
                          commit_to_info_hook=lambda x: None)
 
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 6d6596370c4..319e065ca55 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -427,3 +427,16 @@ class TestGccChangelog(unittest.TestCase):
     def test_multi_same_file(self):
         email = self.from_patch_glob('0001-OpenMP-Fix-SIMT')
         assert email.errors[0].message == 'same file specified multiple times'
+
+    def test_pr_only_in_subject(self):
+        email = self.from_patch_glob('0001-rs6000-Support-doubleword')
+        assert (email.errors[0].message ==
+                'PR 100085 in subject but not in changelog')
+
+    def test_wrong_pr_comp_in_subject(self):
+        email = self.from_patch_glob('pr-wrong-comp.patch')
+        assert email.errors[0].message == 'invalid PR component in subject'
+
+    def test_copyright_years(self):
+        email = self.from_patch_glob('copyright-years.patch')
+        assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 39d40b88618..463ce905167 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -1461,6 +1461,7 @@ Subject: [PATCH 0030/2034] 	PR c++/92746 - ICE with noexcept of function
 Another place that needs to specially handle Concepts TS function-style
 concepts.
 
+	PR c++/92746
 	* except.c (check_noexcept_r): Handle concept-check.
 ---
  gcc/cp/ChangeLog                            | 3 +++
@@ -1977,7 +1978,7 @@ index aac31d02b6c..56c470f6ecf 100644
 From 5194b51ed9714808d88827531e91474895b6c706 Mon Sep 17 00:00:00 2001
 From: Jason Merrill <ja...@redhat.com>
 Date: Thu, 16 Jan 2020 16:55:39 -0500
-Subject: [PATCH 0121/2034] PR c++/93286 - ICE with __is_constructible and
+Subject: [PATCH 0121/2034] Revert PR c++/93286 - ICE with __is_constructible and
  variadic template.
 
 Here we had been recursing in tsubst_copy_and_build if type2 was a TREE_LIST
@@ -3406,3 +3407,60 @@ index 00000000000..21540512e23
 +
 -- 
 2.25.1
+=== 0001-rs6000-Support-doubleword ===
+From f700e4b0ee3ef53b48975cf89be26b9177e3a3f3 Mon Sep 17 00:00:00 2001
+From: Xionghu Luo <luo...@linux.ibm.com>
+Date: Tue, 8 Jun 2021 21:48:12 -0500
+Subject: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store
+ [PR100085]
+
+gcc/testsuite/ChangeLog:
+
+	* gcc.target/powerpc/pr100085.c: New test.
+---
+diff --git a/gcc/testsuite/gcc.target/powerpc/pr100085.c b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+new file mode 100644
+index 00000000000..7d8b147b127
+--- /dev/null
++++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1
+=== pr-wrong-comp.patch ===
+From 5194b51ed9714808d88827531e91474895b6c706 Mon Sep 17 00:00:00 2001
+From: Jason Merrill <ja...@redhat.com>
+Date: Thu, 16 Jan 2020 16:55:39 -0500
+Subject: [PATCH 0121/2034] PR some/93286 - ICE with __is_constructible and
+ variadic template.
+
+gcc/testsuite/ChangeLog:
+
+	PR c++/93286
+	* gcc.target/powerpc/pr100085.c: New test.
+---
+diff --git a/gcc/testsuite/gcc.target/powerpc/pr100085.c b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+new file mode 100644
+index 00000000000..7d8b147b127
+--- /dev/null
++++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1
+==== copyright-years.patch ===
+From 99dee82307f1e163e150c9c810452979994047ce Mon Sep 17 00:00:00 2001
+From: Jakub Jelinek <ja...@redhat.com>
+Date: Mon, 4 Jan 2021 10:26:59 +0100
+Subject: [PATCH] Update copyright years.
+
+---
+diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
+new file mode 100644
+index 6f67552d075..32478f070e8 100644
+--- a/lto-plugin/lto-plugin.c
++++ b/lto-plugin/lto-plugin.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1

Reply via email to