This is a pretty old thread, I know, but in the end it wasn't Solr. I'm fairly
certain that it was Tika. The autoparser wasn't pulling any text out of the
".doc" files; the extracted content came out blank. The documents were in the
Word 97-2003 format. When I opened them in Word 2010 and re-saved them as
Word 2010 documents, they indexed just fine.

So I wanted to post this here in case anybody else runs into the same problem
when writing their own custom SolrJ indexer. I think the current version of
Tika has some compatibility issues with Word 2003 docs.
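
For anyone who wants to catch this before the documents reach the index, here
is a minimal sketch of a check, not something from my actual indexer. It
assumes you already have the text Tika returned (for example from a
BodyContentHandler) in a String. The class and method names are made up for
illustration. It flags files whose extracted text is blank and whose first
bytes carry the OLE2 signature that legacy ".doc" files use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: the names here are illustrative, not part of Tika or SolrJ.
final class LegacyDocCheck {
    // OLE2 compound file signature. Word 97-2003 ".doc" files start with it
    // (so do legacy .xls and .ppt files, so this is a hint, not proof).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given bytes begin with the OLE2 signature.
    static boolean isOle2(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // True if the extracted text is null, empty, or whitespace only.
    static boolean isBlank(String text) {
        return text == null || text.trim().isEmpty();
    }

    // True if extraction produced nothing AND the file looks like a legacy
    // OLE2 document, i.e. the case described in this thread.
    static boolean isSuspectExtraction(Path file, String extractedText) throws IOException {
        if (!isBlank(extractedText)) {
            return false;
        }
        byte[] header = new byte[OLE2_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int n = in.readNBytes(header, 0, header.length);
            return n == header.length && isOle2(header);
        }
    }
}
```

In the indexing loop you could call LegacyDocCheck.isSuspectExtraction(path,
text) right after parsing and log the path instead of sending an empty
document to Solr. That gives you a list of files to re-save or convert.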



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4219341.html
Sent from the Solr - User mailing list archive at Nabble.com.
