Hi Alex,

I think you may get better help on the Tika mailing list. Solr uses Tika to parse rich-text documents and extract text from them. I don't know whether Tika can tell which text comes from a header or a footer...
Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

----- Original Message -----
> From: Alex Cougarman <acoug...@bwc.org>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Cc:
> Sent: Thursday, August 30, 2012 9:25 AM
> Subject: Extract footer/header text out of Word docs
>
> Hi. Is it possible to specifically extract footer/header and body text out of a
> Word document using Solr? In other words, we'd like to index/store those
> items in different Solr fields.
>
> Also, is it possible to search on specific styles within a Word document? Can
> these attributes be indexed? Thanks.
>
> Sincerely,
> Alex