There are several routes you can take. One is to use SolrCell (Tika) to parse the mbox files and add your own update processor that performs whatever mail classification analysis you want and then generates additional field values for the classification.

A simpler approach is to do the analysis yourself outside of Solr and then feed the mbox data for each message into SolrCell along with the specific literal field values derived from your classification analysis. SolrCell (Tika) would then parse the mail message and add your literal field values.
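
As a rough sketch of that second approach, the Python below classifies a message outside of Solr and then builds a request for the extracting handler (`/update/extract`), passing the result as a `literal.<field>` parameter. The keyword rules, the category names, the `id` and `category` field names, the base URL, and the `message/rfc822` content type are all assumptions for illustration; your schema and classifier will differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical keyword rules standing in for a real classifier.
RULES = {
    "billing": ("invoice", "payment"),
    "support": ("error", "help"),
    "sales": ("quote", "pricing"),
}


def classify(text):
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for category, keywords in RULES.items():
        if any(word in lowered for word in keywords):
            return category
    return "other"  # fourth, catch-all category


def build_extract_request(solr_base, doc_id, category, raw_message):
    """Build (but do not send) a POST to the extracting handler.

    literal.* parameters become field values on the indexed document;
    the field names "id" and "category" are assumed to exist in the schema.
    """
    params = urlencode({
        "literal.id": doc_id,
        "literal.category": category,
        "commit": "true",
    })
    url = f"{solr_base}/update/extract?{params}"
    return Request(
        url,
        data=raw_message,
        headers={"Content-Type": "message/rfc822"},  # assumed MIME type
        method="POST",
    )


raw = b"From: a@example.org\r\nSubject: Invoice overdue\r\n\r\nPlease pay.\r\n"
req = build_extract_request(
    "http://localhost:8983/solr", "msg-1", classify(raw.decode()), raw
)
# urllib.request.urlopen(req) would send it to a running Solr instance.
```

Tika still does the parsing here; the only thing you supply is the category value.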

Or, you may want to consider fully parsing the mail messages outside of Solr so that you have full control over what gets parsed and which schema fields are used, in addition to your content analysis field values.
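
A minimal sketch of that third approach, using Python's standard `email` package to pull out only the parts you care about and build a plain document for Solr's update handler. The field names (`id`, `from`, `subject`, `body`, `category`) are assumptions and would need to match your own schema; the category is whatever your own analysis produced.

```python
import json
from email import message_from_string, policy


def message_to_solr_doc(raw_message, category):
    """Parse one mail message and map it to a dict of (assumed) schema fields."""
    msg = message_from_string(raw_message, policy=policy.default)
    # Prefer the plain-text body; fall back to an empty string if none exists.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {
        "id": str(msg["Message-ID"] or ""),
        "from": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "body": body,
        "category": category,  # value from your own classification step
    }


raw = (
    "Message-ID: <1@example.org>\n"
    "From: a@example.org\n"
    "Subject: Invoice overdue\n"
    "\n"
    "Please pay.\n"
)
doc = message_to_solr_doc(raw, "billing")
payload = json.dumps([doc])  # a JSON array of documents, ready to post to Solr
```

Since nothing goes through Tika in this variant, only the fields you explicitly map end up in the index.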

-- Jack Krupansky

-----Original Message-----
From: Ramo Karahasan
Sent: Tuesday, May 01, 2012 12:17 PM
To: solr-user@lucene.apache.org
Subject: Email classification with solr

Hello,

just a short question:

Is it possible to use Solr/Lucene as an e-mail classifier? I mean, analyzing
an e-mail to add it automatically to a category (four are available)?

Thanks,

Ramo