On 09/04/2014 10:15 AM, Paul Eggert wrote:
> I just checked the Emacs source code for this, and found some spurious
> semicolons, but also found several legitimate instances in strings and
> comments.  Would the new rule work with those?

The idea is to write the rule with a regex that catches as many problematic doubled semicolons as possible, while still excluding obvious legitimate instances.  Shell code (and therefore configure.ac, Makefile.am, $anything.sh) is too likely to contain case statements, so the proposed rule is already limited to .[ch] files (maybe .y as well).

So looking at emacs.git (commit 5c9a1ec6), I see:

$ git grep '; *;' -- '**/*.[chy]' | grep -v 'for (.*; *;.*)'
src/font.c: ;; GLYPHS[i] and GLYPHS[i-1] belongs to the same grapheme cluster
src/font.c: ;; Be sure to cover all characters.
src/lread.c: = build_pure_c_string ("^;;;.\\(in Emacs version\\|bytecomp version FSF\\)");
src/w32.c: /* Be defensive against series of ;;; characters. */
src/w32.c: char temp[MAX_UTF8_PATH], temp_a[MAX_PATH];;
src/w32fns.c: Lisp_Object current_dir = BVAR (current_buffer, directory);;
src/w32select.c: * ;; Generally use KOI8-R instead of the russian MS codepage for
src/w32select.c: * ;; the 8-bit clipboard.
src/w32select.c: * ;; Create a special clipboard copy function that uses codepage
src/w32select.c: * ;; 1253 (Greek) to copy Greek text to a specific non-Unicode
src/w32select.c: * ;; application.

Of those, the ';; blah' comments are legitimate and probably can't be changed (font.c, w32select.c), the lread.c string literal is legitimate, a couple of real problems are found (w32.c, w32fns.c), and the remaining comment in w32.c is legitimate but also easy to shorten to a single ';'.

Anchoring to the end of the line (changing to git grep '; *;$') finds the two real problems without flagging anything else and without missing any real problems.  Looks like I'll be anchoring :)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
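
For anyone who wants to try the two patterns without an emacs.git checkout, here is a minimal sketch using plain grep.  The five sample lines are hypothetical, modelled on the hits quoted above (plus one for loop with an empty condition); they are not taken from any real tree, and the variable names are only for illustration.

```shell
# Hypothetical sample lines modelled on the hits quoted above; not real source.
sample='  char temp[MAX_UTF8_PATH], temp_a[MAX_PATH];;
  Lisp_Object current_dir = BVAR (current_buffer, directory);;
   ;; Be sure to cover all characters.
  /* Be defensive against series of ;;; characters.  */
  for (i = 0; ; i++)'

# Unanchored: two semicolons separated by optional spaces anywhere on the
# line, with for loops that have an empty clause filtered out afterwards.
unanchored=$(printf '%s\n' "$sample" | grep '; *;' | grep -vc 'for (.*; *;.*)')

# Anchored: the second semicolon must be the last character on the line.
anchored=$(printf '%s\n' "$sample" | grep -c '; *;$')

echo "unanchored: $unanchored, anchored: $anchored"
```

With these five lines the unanchored search reports four hits (two of them the legitimate comments) and the anchored one reports only the two statements ending in ';;'.  Note that the anchored pattern as written would not match a doubled semicolon followed by trailing whitespace.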