On Thu, Jan 29, 2009 at 11:48:42AM +0100, Stefano Zacchiroli wrote: > The attached patch does exactly that. The name of the flag I've chose > is "--whole-pkg", YMMV. Note that in my implementation the flag > implies "-e", because the implementation is actually (extended) regex > based. > > More in detail, I just "dress" the given atom(s) with regex boundaries > as yours. It might be seen as hackish, but I find it way better than > adding specific parsing of inter-package relationship > fields. Considering that the use case we are discussing is really > common, I believe it is worth to implement this this way.
I do think this is a bit inelegant, but all other avenues for implementation that I can think of are much more complex. In other words, I think your approach is sound. > I've changed a bit the regular expression boundaries wrt yours, mines > are: > > #define RE_PKG_BEGIN "(^| )" > #define RE_PKG_END "([, \\(]|$)" > > with the rationale that whitespaces, according to policy, do not > necessarily appear between package names and optional version > requirements; hence also '(' can denote the end of package name. Would it make sense to instead rule out any character that would be a part of a package name? > Please consider applying the attached patch. Since you are a DD, you can commit it yourself when it's ready. (Please read debian/README, first.) > From 13cb48a28f20a34a454b97b0b41c02b12733f988 Mon Sep 17 00:00:00 2001 > From: Stefano Zacchiroli <z...@upsilon.cc> > Date: Thu, 29 Jan 2009 11:21:38 +0100 > Subject: [PATCH] add '--whole-pkg' to match (eregex) on whole package names > > Using this flag given extended regex will not match substring of > package names in fields expressing inter-package relationships (e.g., > Depends, Recommends, ...). > > Passing '--whole-pkg' implies '-e'. > > (Closes: #476861) actually, proposed fix for ... You haven't signed off the patch. Please read http://git.debian.org/?p=collab-maint/dctrl-tools.git;a=blob_plain;f=dco-1.1.txt;hb=dco and if you can certify that, please add a Signed-Off-By line at the end of the patch description (see debian/README). > + char * regex_pat = NULL; > + int regex_patlen = atom->patlen + 30; Where does the magic number 30 come from? The combined length of the two regexps? If so, I would prefer making that explicit, as in: int regex_patlen = atom->patlen + strlen(RE_PKG_BEGIN) + strlen(RE_PKG_END) + 1; (I believe strlen of constant strings gets optimized out, but it's not critical since we are not in the hot loops.) ... Ah, I see you've already done this. I would prefer combining the two code patches, though. > if (atom->mode == M_REGEX || atom->mode == M_EREGEX) { > + regex_pat = calloc(1, regex_patlen); Do you use the property that the returned memory is zeroed out? > +.IP "\-\-whole\-pkg" > +Do an extended regular expression match on whole package names, > +assuming the syntax of inter-package relationship fields such as > +Depends, Recommends, ... When this flag is given you should not worry > +about sub-package names such as "libpcre3" also matching > +"libpcre3-dev". This flag implies \-e. I'd prefer allowing us to later drop the -e connection, so should say "currently implies" or similar. -- Antti-Juhani Kaijanaho, Jyväskylä, Finland http://antti-juhani.kaijanaho.fi/newblog/ http://www.flickr.com/photos/antti-juhani/
signature.asc
Description: Digital signature