Hi there,
I was told before that I'd need to create a custom search component to do what I want, but I'm now thinking it might actually call for a custom analyzer.

Basically, I'm indexing e-mail as XML in Solr and searching the 'content' field, which is parsed as 'text'. I want to ignore certain elements of the e-mail (e.g. corporate banners), but also identify the actual content of those e-mails, including any corporate information.

To identify the banners I need something a little more developed than a stop word list. I need to evaluate the frequency of certain words around words like 'privileged' and 'corporate', within a window of about 100 words, to decide whether that text is a banner, and if so keep it from being indexed. At the same time I need to do the opposite: identify, in a similar manner, which e-mails include corporate information in their actual content.

I suppose if I do this I don't want the processed text to be what's returned in a search, because then presumably it won't be the full e-mail. So do I need some kind of copy field that keeps the full e-mail, fully indexed, to be returned instead?

Can what I'm suggesting be done, and can anyone direct me to a guide?

On another note, is there an easy way to destroy an index, or does that need custom code?

Thanks for any help!

--
View this message in context: http://www.nabble.com/Word-Locations---Search-Components-tp22031139p22031139.html
Sent from the Solr - User mailing list archive at Nabble.com.
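
P.S. In case it helps to see what I mean by the word window, here is a rough Python sketch of the check itself. It is not Solr code. The cue word list, the threshold of 5 hits, and all the function names are made up for illustration; only 'privileged' and 'corporate' and the window of about 100 words come from my description above.

```python
import re

# Words that open a window (the two examples from my description).
TRIGGERS = {"privileged", "corporate"}

# Hypothetical banner vocabulary; a real list would come from looking
# at actual banners in the mail corpus.
CUES = {"confidential", "privileged", "corporate", "intended",
        "recipient", "unauthorized", "disclosure", "notify"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def banner_spans(tokens, window=100, min_cues=5):
    """Return merged (start, end) token spans that look like banners.

    For each trigger word, count cue words in a window of roughly
    `window` tokens centred on it. If the count reaches `min_cues`,
    mark the stretch from the first to the last cue word as a banner.
    """
    half = window // 2
    spans = []
    for i, tok in enumerate(tokens):
        if tok not in TRIGGERS:
            continue
        lo = max(0, i - half)
        hi = min(len(tokens), i + half + 1)
        hits = [j for j in range(lo, hi) if tokens[j] in CUES]
        if len(hits) < min_cues:
            continue
        start, end = hits[0], hits[-1] + 1
        if spans and start <= spans[-1][1]:
            # Overlaps the previous span, so extend it.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


def strip_banners(text, window=100, min_cues=5):
    """Return the text with suspected banner spans removed."""
    tokens = tokenize(text)
    kept, pos = [], 0
    for start, end in banner_spans(tokens, window, min_cues):
        kept.extend(tokens[pos:start])
        pos = end
    kept.extend(tokens[pos:])
    return " ".join(kept)


def mentions_corporate_content(text, window=100, min_cues=5):
    """True if a trigger word appears outside every suspected banner."""
    tokens = tokenize(text)
    spans = banner_spans(tokens, window, min_cues)
    return any(
        tok in TRIGGERS and not any(s <= i < e for s, e in spans)
        for i, tok in enumerate(tokens)
    )
```

So `strip_banners(mail_text)` would give me the text I want indexed, and `mentions_corporate_content(mail_text)` would flag mails that use those words in the real body. It is crude: cutting from the first to the last cue word can drop or leave a few ordinary words at the edges of a banner. What I can't work out is where logic like this would belong in Solr.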