Hi all,

I am encountering a problem where Solr 3.6.1 is not able to extract the text content from ODT (OpenDocument Text) files submitted to the ExtractingRequestHandler. I can reproduce this issue against the example schema running with Jetty.
Executing a simple index request (based on the example in the wiki):

    curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@testfile.odt"

returns no errors and does not generate any exceptions in the log/console. A query for doc1 returns an empty attr_content field:

    <arr name="attr_content">
      <str></str>
    </arr>

Oddly enough, executing an "extractOnly=true" request against the ExtractingRequestHandler with the same ODT file correctly returns the text of the file.

I am wondering:

* Is this a known issue? (I couldn't find any mention of this particular issue anywhere...)
* Are there any workarounds, or does anyone have any suggestions?

Thanks,
Brett.
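
P.S. For anyone who wants to reproduce this, the requests described above can be wrapped in a small shell script so the indexing and extractOnly results are easy to compare. This is only a minimal sketch: the variable and function names (SOLR_URL, extract_url, index_file, extract_only, query_doc1) are made up for illustration, the extractOnly request is assumed to need no parameters beyond extractOnly=true, and the query assumes the example schema's default /select handler.

```shell
#!/bin/sh
# Base URL of the example Solr instance (same host and port as the curl command above).
SOLR_URL="http://localhost:8983/solr"

# Query string of the indexing request, copied from the command above.
INDEX_PARAMS="literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true"

# Build the URL of an ExtractingRequestHandler request from a query string.
extract_url() {
    printf '%s/update/extract?%s' "$SOLR_URL" "$1"
}

INDEX_URL=$(extract_url "$INDEX_PARAMS")
EXTRACT_ONLY_URL=$(extract_url "extractOnly=true")
# Assumption: the example schema's default /select handler is used to look up doc1.
QUERY_URL="$SOLR_URL/select?q=id:doc1"

# Hypothetical helpers; each takes the path of the file to upload where relevant.
index_file()   { curl "$INDEX_URL" -F "myfile=@$1"; }         # index the file as doc1
extract_only() { curl "$EXTRACT_ONLY_URL" -F "myfile=@$1"; }  # extract without indexing
query_doc1()   { curl "$QUERY_URL"; }                         # fetch the indexed doc1
```

After sourcing the script, running `index_file testfile.odt` followed by `query_doc1` shows what was indexed, and `extract_only testfile.odt` shows what the extraction step returns for the same file.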