Not really sure... the issue seems related to text extraction, so the first suspect is Tika; Solr is playing a secondary role here. If Tika is extracting the text correctly, there should be an error or a warning on the Solr side (an exception, a "content field too long" warning, or something like that).
What about option 3a above (FINEST level on org.apache + grep "tika")?

On 12 Jan 2014 17:38, "sweety" <sweetyshind...@yahoo.com> wrote:

> Sorry for the mistake.
> im using solr 4.2, it has tika-1.3.
> So now, java -jar tika-app-1.3.jar -v C:\Coding.pdf , parses pdf document
> without error or msg.
> Also, java -jar tika-app-1.3.jar -t C:\Coding.pdf, shows the entire
> document.
> Which means there is no problem in tika right??
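
Since you are on Windows and may not have grep at hand, here is a rough Python stand-in for the `grep "tika"` part of option 3a. It is only a sketch: the function name `tika_lines` and the script name in the usage comment are made up for illustration, and the log file path is whatever your servlet container writes to, which I cannot know from here.

```python
import re
import sys
from typing import Iterable, Iterator

# Case-insensitive match, roughly what `grep -i tika` would do.
PATTERN = re.compile(r"tika", re.IGNORECASE)


def tika_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield every log line that mentions Tika, without the trailing newline."""
    for line in lines:
        if PATTERN.search(line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical usage: python tika_grep.py <path to your Solr log file>
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for hit in tika_lines(log):
            print(hit)
```

Run it after raising the org.apache log level to FINEST and re-posting the PDF. If nothing Tika-related shows up at all, that in itself is a hint that the extract handler is not being reached.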