Re: filtering number and repeated contents

2012-06-07 Thread Jack Krupansky
Subject: Re: filtering number and repeated contents
Thanks Jack, I will try updateProcessor. By the way, does Solr store tokenized "content" in fields if the field has the property stored="true"?
On Tue, Jun 5, 2012 at 8:23 PM, Jack Krupansky wrote:
My (very limited) understandin

Re: filtering number and repeated contents

2012-06-07 Thread Mark, N
Thanks Jack, I will try updateProcessor. By the way, does Solr store tokenized "content" in fields if the field has the property stored="true"?
On Tue, Jun 5, 2012 at 8:23 PM, Jack Krupansky wrote:
> My (very limited) understanding of "boilerpipe" in Tika is that it strips out "short text", which

Re: filtering number and repeated contents

2012-06-05 Thread Jack Krupansky
My (very limited) understanding of "boilerpipe" in Tika is that it strips out "short text", which is great for all the menu and navigation text, but the typical disclaimer at the bottom of an email is not very short and frequently can be longer than the email message body itself. You may have to