
Re: ExtractRequestHandler - not properly indexing office docs?

2009-06-22 Thread cloax

I've tried 'text' (taken from the example config) and then tried creating a new field called doc_content and using that. Neither has worked.

Grant Ingersoll-6 wrote:
>
> What's your default search field?
>
> On Jun 22, 2009, at 12:29 PM, cloax wrote:
>

Re: ExtractRequestHandler - not properly indexing office docs?

2009-06-22 Thread cloax

Yep, I've tried both of those and still no joy. Here are both my curl statement and the resulting Solr log output.

curl http://localhost:8983/solr/update/extract?ext.def.fl=text\&ext.literal.id=1\&ext.map.div=text\&ext.capture=div -F "myfi...@dj_character.doc"

Curl's output: 0317

Solr log: J
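
For anyone sending the same request from Python rather than the shell (the original post mentions a Django application), here is a minimal sketch that only builds the request URL. The endpoint and the ext.* parameter names are copied from the curl command above; the helper name build_extract_url and its arguments are hypothetical, and the sketch neither uploads the file nor contacts Solr. Encoding the query string in code avoids having to backslash-escape each & for the shell.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl command in this thread.
EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, field="text", capture="div"):
    """Return the extract URL with an encoded query string.

    Hypothetical helper: the parameter names mirror the ext.* options
    used in the curl command; nothing is sent to Solr here.
    """
    params = [
        ("ext.def.fl", field),            # default field for extracted content
        ("ext.literal.id", str(doc_id)),  # literal value for the id field
        ("ext.map." + capture, field),    # map the captured element to the field
        ("ext.capture", capture),         # element to capture
    ]
    return EXTRACT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url(1))
```

Calling build_extract_url(1, field="doc_content") would give the variant with the custom field mentioned elsewhere in the thread.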

Re: ExtractRequestHandler - not properly indexing office docs?

2009-06-20 Thread cloax

Thanks for the quick response. Here are the fields from the schema:

I use text as the content field for the default field for the ERH. Here's the config of the ERH:

last_modified true

Here's the output of a curl request w/ the file: 0650

ExtractRequestHandler - not properly indexing office docs?

2009-06-19 Thread cloax

Hi there, I've got a Solr instance running and am feeding it rich binary documents to index from a Django application. The setup works just fine with PDFs, etc., but no matter what type of MS Word document (doc and docx) I feed it, I can't get any results when searching for content-related que