I am looking at implementing a Learn Spam / Learn Ham feature on my server.  Basically, I’ll have a cron job read users’ Learn Spam folders and feed them to SpamAssassin’s learn function.  Pretty basic stuff, nothing magical going on here.  SpamAssassin’s Bayesian learner, however, also has to be taught what ham is, so I want to scan each user’s inbox for ham too.
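A minimal sketch of the learning step the cron job would run, assuming the messages are reachable as files in a directory and that the sa-learn command shipped with SpamAssassin is on the PATH. The helper names and any folder path passed in are hypothetical, not taken from my setup.

```python
import subprocess

def build_sa_learn_command(kind, folder_path):
    """Return the sa-learn argument list for one folder of messages.

    kind is "spam" or "ham"; folder_path is a hypothetical spool directory.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # sa-learn accepts --spam or --ham followed by a file or directory.
    return ["sa-learn", "--" + kind, folder_path]

def learn_folder(kind, folder_path):
    """Run sa-learn on the folder; raises CalledProcessError on failure."""
    subprocess.run(build_sa_learn_command(kind, folder_path), check=True)
```

Keeping the command construction separate from the call makes it easy to log or dry-run what the cron job would do before letting it touch the Bayes database.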

 

So here’s the trick:

 

I want to read the seen-state db of the user’s inbox to make sure the user has “seen” the message.  The code will assume that if the user has seen the message and has not moved it to the Learn Spam folder within a time period (say 3 hours), then the message is ham, and it will learn it as such.
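The rule above could be sketched roughly as follows. The function name, its parameters, and the idea of passing in a per-message "seen at" timestamp are assumptions for illustration; whether the seen-state db actually provides such a timestamp is exactly what I am asking about.

```python
import time

GRACE_SECONDS = 3 * 60 * 60  # the "say 3 hours" window

def is_probable_ham(seen_at, still_in_inbox, now=None, grace=GRACE_SECONDS):
    """Apply the heuristic: seen, left in the inbox, and the grace period is over.

    seen_at is an epoch timestamp, or None if the message was never seen.
    still_in_inbox is False once the user has moved it to Learn Spam.
    """
    if seen_at is None:
        return False  # unseen messages are never learned as ham
    if now is None:
        now = time.time()
    return bool(still_in_inbox) and (now - seen_at) >= grace
```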

 

I modified the seenstate_db parameter in imapd.conf to use flat files to take a look at the format of the file and got this:

 

7b4434cf433945c5        1 1127830200 1 1127829445 1

 

I don’t plan to keep it as a flat file; I will convert it back to skiplist and use the Perl CPAN module Algorithm::SkipList to read the skiplist instead.  Here’s what I make of the entry so far, but I would like confirmation of what each field means:

 

7b4434cf433945c5 – hash of the file, but I can’t figure out what kind of hash this is

1127830200  - last time the message was viewed

1127829445 - either the first time the message was viewed, or the time it was entered into the db, or something else ;)

 

As for the ‘1’s, I assume at least one of them reflects the fact that it’s the first file in the user’s inbox, but I don’t know what the others denote.  Hence the question.
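To make the sample entry easier to inspect, here is a small sketch that splits the flat-file line into its key and values and renders the two large numbers as UTC timestamps. It deliberately does not name the fields, since their meaning is unconfirmed; the function names are hypothetical.

```python
from datetime import datetime, timezone

def parse_seen_line(line):
    """Split one flat-file line into (key, values).

    Purely numeric tokens become ints; anything else is kept as a string,
    as a precaution in case other entries are not all integers.
    """
    key, *tokens = line.split()
    values = [int(t) if t.isdigit() else t for t in tokens]
    return key, values

def as_utc(epoch_seconds):
    """Interpret a value as Unix epoch seconds and return an aware datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

key, values = parse_seen_line("7b4434cf433945c5        1 1127830200 1 1127829445 1")
print(key, values)
print(as_utc(values[1]).isoformat(), as_utc(values[3]).isoformat())
```

Read as epoch seconds, the two numbers fall on the same day about 13 minutes apart, which at least fits the guess that both are timestamps.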

 

Can anyone shed light on this for me?

 

Also, if I were to use the Perl module to open the seen-state db quickly to read entries, could this corrupt the seen information?
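One cautious approach I am considering, sketched here for the flat-file format only: copy the db and read the copy, so the reader never holds the live file open for long and never writes to it. This does not answer the locking question, and a copy taken while the server is writing may itself be inconsistent. The function name and the path passed in are hypothetical.

```python
import os
import shutil
import tempfile

def read_snapshot_lines(db_path):
    """Copy db_path to a temporary file and return the copy's non-empty lines.

    The original file is only read during the copy and is never modified.
    """
    with tempfile.TemporaryDirectory() as tmp:
        copy_path = os.path.join(tmp, "seen.copy")
        shutil.copyfile(db_path, copy_path)
        with open(copy_path, "r", encoding="ascii", errors="replace") as fh:
            return [line.rstrip("\n") for line in fh if line.strip()]
```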

 

Thanks.

 

----
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
