RE: Solr 3.1 and ExtractingRequestHandler resulting in blank content

2010-07-28 Thread David Thibault
From: Lance Norskog [mailto:goks...@gmail.com]
Sent: Tuesday, July 27, 2010 8:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.1 and ExtractingRequestHandler resulting in blank content

There are two different datasets that Solr (Lucene really) saves from a document: raw storage and the in

Re: Solr 3.1 and ExtractingRequestHandler resulting in blank content

2010-07-27 Thread Lance Norskog
There are two different datasets that Solr (Lucene really) saves from a document: raw storage and the indexed terms. I don't think the ExtractingRequestHandler ever automatically stored the raw data; in fact Lucene works in Strings internally, not raw byte arrays (this is changing). It should be i
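
The distinction between stored values and indexed terms can be illustrated with a toy model. This is only a sketch in plain Python, not Solr's or Lucene's actual API: the class `TinyIndex`, its methods, and the `indexed`/`stored` keyword arguments are hypothetical names chosen to mirror the idea that a field can be searchable without its original text being retrievable.

```python
from collections import defaultdict


class TinyIndex:
    """Toy model of an index that keeps indexed terms and stored values apart.

    Hypothetical illustration only; this is not how Solr or Lucene are called.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.stored = {}                  # document id -> original field text

    def add(self, doc_id, text, indexed=True, stored=False):
        # Indexing breaks the text into terms so the document can be found.
        if indexed:
            for term in text.lower().split():
                self.postings[term].add(doc_id)
        # Storing keeps the original text so it can be returned with results.
        if stored:
            self.stored[doc_id] = text

    def search(self, term):
        # .get() avoids creating an empty posting list for unknown terms.
        return sorted(self.postings.get(term.lower(), set()))

    def fetch(self, doc_id):
        # Returns None when the field was indexed but never stored.
        return self.stored.get(doc_id)


index = TinyIndex()
index.add("doc1", "Extracted PDF body text", indexed=True, stored=False)
index.add("doc2", "Extracted Word body text", indexed=True, stored=True)

hits = index.search("body")
doc1_content = index.fetch("doc1")
doc2_content = index.fetch("doc2")
```

In this sketch both documents match a search for "body", but only `doc2` returns its text from `fetch`. A field that is indexed but not stored would look like blank content in search results even though the document is searchable.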