Re: Extracting content from mailman managed mail list archive

2010-03-09 Thread Chris Hostetter
: I just checked popular search services and it seems that neither
: lucidimagination search nor search-lucene support this:

it really depends on what you want to do ... most people I know who index email want to include quoted portions in the message because it's part of the context of the me
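One way to keep both options open is to separate quoted lines from new text before indexing, so each can go into its own field and be weighted differently. The following is only a rough sketch: `split_quoted` is a hypothetical helper, and treating both `>` and `:` as quote prefixes is an assumption based on the quoting seen in this thread. Attribution lines such as "On ... wrote:" are not handled.

```python
import re

# Assumption: a quoted line starts with one or more ">" or ":" markers,
# each optionally followed by a space (covers "> text", ">> text", ": text").
QUOTE_PREFIX = re.compile(r"^\s*(?:[>:]\s?)+")


def split_quoted(body):
    """Return (new_text, quoted_text) for a plain-text message body."""
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        match = QUOTE_PREFIX.match(line)
        if match:
            # Drop the quote markers and keep only the quoted words.
            quoted_lines.append(line[match.end():])
        else:
            new_lines.append(line)
    return "\n".join(new_lines).strip(), "\n".join(quoted_lines).strip()
```

The two strings could then be indexed as separate fields (for example a body field and a quoted field), which leaves the choice of searching quoted context to query time.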

Re: Extracting content from mailman managed mail list archive

2010-03-08 Thread Lukáš Vlček
I just checked popular search services and it seems that neither lucidimagination search nor search-lucene support this:
http://www.lucidimagination.com/search/document/954e8589ebbc4b16/terminating_slashes_in_url_normalization
http://www.search-lucene.com/m?id=510143ac0608042241k49f4afe7wcd25df3fba

Extracting content from mailman managed mail list archive

2010-03-08 Thread Lukáš Vlček
Hi, is anybody willing to share their experience with extracting content from mailing list archives in order to have it indexed by Lucene or Solr? Imagine that we have access to the archive of some mailing list (e.g. http://www.mail-archive.com/mailman-users%40python.org/) and we would like to index
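As a possible starting point, and assuming the archive can be obtained as an mbox file (Mailman installations often expose one, but that is an assumption about the list in question), Python's standard `mailbox` module can turn each message into a flat record for an indexer. The function name `mbox_to_docs` and the field names are hypothetical; they are not a Solr or Lucene schema.

```python
import mailbox
from email.utils import parsedate_to_datetime


def mbox_to_docs(path):
    """Yield one dict per message in an mbox file, ready to hand to an indexer."""
    box = mailbox.mbox(path)
    try:
        for msg in box:
            # Use the first text/plain part; attachments and HTML parts are skipped.
            body = ""
            for part in msg.walk():
                if part.get_content_type() == "text/plain":
                    payload = part.get_payload(decode=True) or b""
                    charset = part.get_content_charset() or "utf-8"
                    body = payload.decode(charset, errors="replace")
                    break
            date = msg.get("Date")
            yield {
                "id": msg.get("Message-ID", ""),
                "subject": msg.get("Subject", ""),
                "from": msg.get("From", ""),
                # Malformed Date headers would raise here; real archives need a fallback.
                "date": parsedate_to_datetime(date).isoformat() if date else None,
                # In-Reply-To lets the indexer reconstruct threads later.
                "in_reply_to": msg.get("In-Reply-To", ""),
                "body": body.strip(),
            }
    finally:
        box.close()
```

Each yielded dict could then be posted to Solr or added to a Lucene index with whatever client is already in use. Scraping the HTML pages of a web archive would need a different approach.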