Hi Paul! On Wed, Jan 04, 2006 at 01:34:34PM +0800, Paul Wise wrote: > On Tue, 2006-01-03 at 10:53 -0500, Justin Pryzby wrote: > > > I would rather see my original, more aggressive check: > > > > $description=~m/homepage/is && $description!~/^ Homepage: [^ ]*$/ > > Hmm, this misses stuff where people use webpage on the last line, or a > phrase without web or page in it, which is one thing I wanted to > specifically detect, since I noticed it a few times. Good point.
> > Packages for which this is a false-positive (such as slash, gnudip, > > and bake) > > These can be eliminated by checking for a url in the description too. It reduces some true positives also: abcmidi, achims-guestbook, airsnort, alsa-base, anjuta, ant, apache, apg, ardour-doc, aspell-bg, atlc, audacity, etc. > > Will be orders of magnitudes more rare than packages which > > are missed by the existing patch which are not missed by check above > > (such as liblingoteach4, qgo, tcsh, maxdb-sqlcli, tik, mozart, > > tagcoll, sleuthkit, tdl, xlogmaster, libsqlod75, wmpuzzle, > > python2.3-dictclient, ...). > > How about the attached combination check - does my check and also does > yours, with the changes that it checks a couple of other words, and > checks for a url in the description too. > $description =~ m/(homepage|webpage|website)/is Good, please also add "|URL|upstream" (see asterisk-sounds-extra). > && $description =~ m/[a-z]+:\/\// This is to reduce false positives, right? I don't like it. If this were an ' || ' condition, it would be great. > && $last_line!~/^ Homepage: [^ ]*$/ ){ Shouldn't this be: !~/^ {2}Homepage: [^ ]*$/ ? ^^^ Or is the contents of this "description" variable the control field without the first column of blanks? I guess that is probably it. Thanks for the patch, again! -- Clear skies, Justin -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]