Roland,

A better place to ask might in fact be the tika-user mailing list. Sorry, I don't have an answer beyond this pointer.
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Roland Villemoes <r...@alpha-solutions.dk>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Tue, April 27, 2010 6:08:30 AM
> Subject: Indexing all versions of Microsoft Office Documents
>
> Hi All,
>
> Does anyone have a running solution for indexing Microsoft Office
> documents, e.g. .docx, .xlsx, etc.?
>
> I can see a lot of examples using Tika for rich content extraction, but
> still nothing when it comes to newer versions of Microsoft Office.
> What libraries should be used, if not Tika?
>
> Best regards,
> Roland Villemoes
> Tel: (+45) 22 69 59 62
> E-Mail: r...@alpha-solutions.dk
>
> Alpha Solutions A/S
> Borgergade 2, 3.sal, 1300 København K
> Tel: (+45) 70 20 65 38
> Web: http://www.alpha-solutions.dk