Re: Word Locations & Search Components

2009-02-17 Thread Koji Sekiguchi
Hmm, Otis, very nice!

Koji

Otis Gospodnetic wrote:
Hi,
Wouldn't this be as easy as:
- split email into "paragraphs"
- for each paragraph compute signature (MD5 or something fuzzier, like in SOLR-799)
- for each signature look for other emails with this signature
- when you find an email with
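For readers following along, the first three steps of Otis's list can be sketched roughly as below. This is only an illustration, not code from the thread: it uses exact MD5 hashes rather than the fuzzier signature from SOLR-799, it assumes paragraphs are separated by blank lines, and the function names and the in-memory dict are hypothetical stand-ins for whatever index is actually used. The snippet above is cut off, so what to do once a matching email is found is not covered here.

```python
import hashlib
import re
from collections import defaultdict


def paragraph_signatures(body):
    """Split an e-mail body on blank lines and return one MD5 hex digest per paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    signatures = []
    for paragraph in paragraphs:
        # Collapse whitespace and lowercase so trivial reflowing does not change the hash.
        normalized = " ".join(paragraph.split()).lower()
        signatures.append(hashlib.md5(normalized.encode("utf-8")).hexdigest())
    return signatures


def build_signature_index(emails):
    """Map each paragraph signature to the set of e-mail ids that contain it."""
    index = defaultdict(set)
    for email_id, body in emails.items():
        for signature in paragraph_signatures(body):
            index[signature].add(email_id)
    return index


def emails_sharing_paragraphs(email_id, emails, index):
    """Return ids of other e-mails that share at least one paragraph with email_id."""
    related = set()
    for signature in paragraph_signatures(emails[email_id]):
        related |= index.get(signature, set())
    related.discard(email_id)
    return related
```

With a dict of `{email_id: body}`, one would call `build_signature_index(emails)` once and then `emails_sharing_paragraphs(some_id, emails, index)` per message. An exact hash only catches paragraphs that are identical after whitespace and case normalization, which is why something fuzzier was suggested as an alternative.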

Re: Word Locations & Search Components

2009-02-16 Thread Otis Gospodnetic
X
To: solr-user@lucene.apache.org
Sent: Monday, February 16, 2009 11:05:40 PM
Subject: Re: Word Locations & Search Components

Basically I'm working on the Enron dataset, and I've already de-duplicated the collection and applied a spam filter. All the e-mails after this have been parsed to

Re: Word Locations & Search Components

2009-02-16 Thread Erick Erickson
> >> content.
> >>
> >> I suppose if I'm doing this I don't want what's processed to be indexed as
> >> what's returned in a search, because then presumably it won't be the full
> >> e-mail, so do I need to store some kind of copy fie

Re: Word Locations & Search Components

2009-02-16 Thread Johnny X
m doing this I don't want what's processed to be indexed as
>> what's returned in a search, because then presumably it won't be the full
>> e-mail, so do I need to store some kind of copy field that keeps the full
>> e-mail and is fully indexed to

Re: Word Locations & Search Components

2009-02-16 Thread Alexander Ramos Jardim
ne direct me to a guide?
>
> On another note, is there an easy way to destroy an index...any custom
> code?
>
> Thanks for any help!
>
> --
> View this message in context:
> http://www.nabble.com/Word-Locations---Search-Components-tp22031139p22031139.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Alexander Ramos Jardim

Re: Word Locations & Search Components

2009-02-16 Thread Grant Ingersoll
> re an easy way to destroy an index...any custom code?

Send in a delete by query command with the *:* query.

> Thanks for any help!
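As a rough sketch of what that delete by query could look like from a script: Solr's XML update handler accepts a `<delete><query>*:*</query></delete>` message followed by a `<commit/>`. The URL below is an assumption (a default single-core install on localhost), and the helper names are made up for illustration, so adjust both to the actual setup.

```python
import urllib.request

# Assumption: a default single-core Solr install; adjust host, port, and path as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def delete_all_body():
    """XML update message that deletes every document matching *:*."""
    return "<delete><query>*:*</query></delete>"


def build_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request carrying an XML update message."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def clear_index(url=SOLR_UPDATE_URL):
    """Send the delete-by-query, then a commit so the deletion becomes visible."""
    for body in (delete_all_body(), "<commit/>"):
        with urllib.request.urlopen(build_update_request(body, url)) as response:
            response.read()
```

Calling `clear_index()` against a running Solr instance removes every document, so it is worth pointing it at a test index first.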

Word Locations & Search Components

2009-02-15 Thread Johnny X
full e-mail and is fully indexed to be returned instead?

Can what I'm suggesting be done and can anyone direct me to a guide?

On another note, is there an easy way to destroy an index...any custom code?

Thanks for any help!