FritzS - gmx posted on Thu, 20 Dec 2012 20:13:18 +0100 as excerpted:

> Now I use this in the score file
> %--------------------------------------------
> % Chamaeleon
>
> [de.*, at.*]
> Score: =-7444
> User-Agent: MacSOUP/D-2\.8\.3 \(Mac OS X version 10\.6\.8\(x86\)\)
> X-Complaints-To: news@netfront\.net
>
> %--------------------------------------------
>
> but the second line are ignored from pan, if I write this false
> "newssss@netfront\.net" it works too.

I'm having a bit of a parsing problem with that. You're saying news@ does NOT work (ignored) but newssss@ DOES work (works)? "Too" is normally used to indicate "also", so I'd expect both to work, or both not to work, which would agree with my technical understanding of the scoring rules. But that's not what "ignored" on one and "works too" on the other seems to indicate, so I'm confused.

> Here the original NNTP header lines from a message I want to score
> X-Complaints-To: n...@netfront.net
> User-Agent: MacSOUP/D-2.8.3 (Mac OS X version 10.6.8 (x86))

Note that I'm reading this thru gmane, which obfuscates parts of strings that appear to be email addresses, for spam-control reasons. Therefore I don't see the address in your X-Complaints-To line as you typed it, but as gmane obfuscates it, which is... troublesome... when the literal string may be important.

If you put spaces around the @ and change it to (at), however, gmane leaves that version alone. Also, \. seems to get thru, so your scorefile version news@netfront\.net came thru without obfuscation. So please use one of those forms (and mention it if it's not clear; the (at) form usually is, based on past experience).

(I've wondered about requesting that gmane turn address obfuscation off for this list/group as well as the pan-dev list/group, but I guess Petr Kovar is list admin, so he'd need to be the one to email gmane, requesting it.)

> Did I adapt this correct for the score file?
>
> What effected the ^ and the $
> sample
> User-Agent: ^MacSOUP/D-2\.8\.3 \(Mac OS X version 10\.6\.8\(x86\)\)$

The ^ and $ are regex beginning and end of line anchors, respectively. So a condition without them would still match with a bunch of other content at either end of the header, while ^ at the beginning of the regex indicates the match MUST start at the beginning of the line, and $ at the end indicates the match MUST finish at the end of the line.
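
To make the anchor behaviour concrete, here's a minimal sketch using Python's re module as a stand-in. Pan's own regex engine isn't being called here, and treating the two as equivalent for ^ and $ is an assumption on my part. The header value is the User-Agent line quoted above; the shortened pattern and the matches() helper are made up purely for illustration.

```python
import re

# User-Agent value as quoted from the original message.
header = "MacSOUP/D-2.8.3 (Mac OS X version 10.6.8 (x86))"

# Hypothetical shortened pattern, in three variants.
unanchored = re.compile(r"MacSOUP/D-2\.8\.3")    # may match anywhere
prefix = re.compile(r"^MacSOUP/D-2\.8\.3")       # must start the line
anchored = re.compile(r"^MacSOUP/D-2\.8\.3$")    # must be the whole line


def matches(pattern, text):
    """Return True if the pattern is found in text (search, not fullmatch)."""
    return pattern.search(text) is not None


print(matches(unanchored, header))         # True: found inside the value
print(matches(prefix, header))             # True: the value starts with it
print(matches(anchored, header))           # False: more text follows 2.8.3
print(matches(anchored, "MacSOUP/D-2.8.3"))  # True: nothing before or after
```

So with both anchors in place, the pattern has to account for everything in the header value, including the parenthesized OS part, or it won't match at all.
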
Thus, enclosing the regex between ^ and $ indicates that the regex must match the ENTIRE line, nothing else before or after the match.

As with other regex "special" characters like \|.*?()[]{} , the "specialness" can be escaped with the \ (backslash) character, so for example, \\ matches a literal backslash and \$ can be used anywhere in the line to match a literal dollar sign.

(As usual with regex, . will match any character, but you seem to have already noted that, ? means the preceding match may or may not appear, and * means any number of the preceding, so xa*y will match xay and xaaaaay but not xaabay. It'll also match xy (no a), since zero is a number.)

Double-check your () vs. \(\) usage, as () forms a grouping. So x(ab)*y will match xabababy and xy (zero is a number) but not xay or xabby. Again, \(\) escapes the specialness for a literal match.

(See the misc.taxes example in the documentation for a literal $ match, as \$ . Here's the link again for convenience since I snipped that bit above.)

http://www.slrn.org/docs/score.txt

*BUT*, I THINK what you MAY be missing isn't anything to do with regex, but rather, the distinction between overview headers and non-overview headers. Headers contained in the overview can be scored before the message is downloaded, since they're in the overview. Headers not in the overview cannot be matched until after a message is actually downloaded, since they're not available until then (they're not in the overview). See the second paragraph (with its list of typical overview headers) of section 1.1 in score.txt as linked above.

In particular, if you're using an AND scoring condition (single colon after the score), which you are, and one of the ANDed conditions matches a header in the overview but another does not, you're likely to have problems, especially if that score is in one of your automatic action zones.

For quite some time, pan's scoring ONLY worked on the overview headers.
I believe Heinrich patched it to work on ALL headers once a message is downloaded (tho I've not used that functionality myself, so I can't vouch for it actually working; when I needed it, pan didn't have the ability at all, but that was probably 7-7 years ago now). But scoring is still much more efficient on overview headers received BEFORE a message is downloaded, and as I mentioned, ANDed scoring combining overview and non-overview headers is likely to be problematic.

While scoring after an article is downloaded isn't optimal, it's still useful, particularly for negative-scoring/ignoring. The message must still be downloaded, but scoring (especially combined with the automatic scoring-based actions feature) can still save you from having to actually see and deal with the message manually.

Now, neither user-agent nor x-complaints-to is traditionally in the overview file, but as score.txt mentions, the admin can add particular headers to the overviews as they find them useful. Thus, it may be that one of those headers is in your overview, and scoring against it alone works, but attempting to score against both will not work until the message is actually downloaded and the other header is available. I'm wondering if that's the problem you're actually seeing.

Of course, if neither one is in your overviews, scoring against either one alone (as well as against both, ANDed) would fail, until the message was actually downloaded.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users