Pretty old thread, I know. But in the end it wasn't Solr. I'm fairly certain that it was Tika. The auto parser wasn't pulling any of the text out of the ".doc" files; the extracted content came out blank. They were Word 97-2003 documents. When I opened them in Word 2010 and re-saved them as Word 2010 documents, they indexed just fine.
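For anyone who wants to catch this case in their own indexer instead of silently indexing empty documents, here is a minimal sketch of a guard. It only assumes you already have the text your parser returned as a String. The class and method names (BlankDocGuard, shouldFlagForResave) are made up for illustration and are not part of Tika or SolrJ. The one hard fact it relies on is that Word 97-2003 ".doc" files are OLE2 compound files, which start with the bytes D0 CF 11 E0 A1 B1 1A E1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: not part of Tika or SolrJ.
final class BlankDocGuard {
    // OLE2 compound file signature. Word 97-2003 ".doc" files start with it,
    // but so do legacy ".xls" and ".ppt" files, so this is a hint, not proof.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given leading bytes begin with the OLE2 signature.
    static boolean isOle2(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Convenience overload that reads the first 8 bytes of a file (Java 11+).
    static boolean isOle2(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return isOle2(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    // True if the parser handed back nothing usable.
    static boolean looksUnparsed(String extractedText) {
        return extractedText == null || extractedText.trim().isEmpty();
    }

    // Legacy OLE2 file plus blank extraction: a candidate for re-saving
    // in a newer Word format before indexing.
    static boolean shouldFlagForResave(byte[] header, String extractedText) {
        return isOle2(header) && looksUnparsed(extractedText);
    }
}
```

In an indexer loop you would call shouldFlagForResave with the file's first bytes and the extracted text, then log or skip the file rather than sending an empty document to Solr.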
So I wanted to put this here in case anybody else runs into this problem while creating their own custom SolrJ indexer. I think the current version of Tika has some compatibility issues with Word 2003 docs.

--
View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4219341.html
Sent from the Solr - User mailing list archive at Nabble.com.