The version of Tika in the 1.4 release definitely parses the most current Office formats (.docx, .pptx, etc.) and they index as expected.
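As background on why .doc support does not automatically cover .docx: the legacy formats are OLE2 compound files, while the newer ones are ZIP packages of XML parts, so they need different parsers. The sketch below is not Tika code. It is a rough, stdlib-only illustration that tells the two containers apart by their leading bytes. The class and method names (`OfficeContainerSniffer`, `sniff`, `fakeDocx`) are hypothetical, and `fakeDocx` builds a tiny ZIP for demonstration, not a valid Word document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only (not part of Tika or Solr).
class OfficeContainerSniffer {
    // Signature of an OLE2 compound file (legacy .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header "PK\003\004" (.docx, .xlsx, .pptx are ZIP packages).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // True if data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Classifies the container by its first bytes only; it does not
    // check that the file is actually an Office document.
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    // Builds a minimal in-memory ZIP with a word/document.xml entry.
    // This is a stand-in for a .docx, not a valid one.
    static byte[] fakeDocx() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buf)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<w:document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(sniff(fakeDocx()));
        System.out.println(sniff(OLE2_MAGIC));
    }
}
```

A real detector (Tika included) looks further than the first few bytes, since any ZIP archive starts the same way; this only shows that the two generations of formats are different containers.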
-Jay

On Mon, Jan 4, 2010 at 6:02 PM, Peter Wolanin <[email protected]> wrote:

> You must have been searching old documentation - I think tika 0.3+ has
> support for the new MS formats. But don't take my word for it - why
> don't you build tika and try it?
>
> -Peter
>
> On Sun, Jan 3, 2010 at 7:00 PM, Roland Villemoes <[email protected]> wrote:
> > Hi All,
> >
> > Anyone who knows how to index the latest MS office documents like .docx and .xlsx ?
> >
> > From searching it seems like Tika only supports the earlier formats .doc and .xls
> >
> > med venlig hilsen/best regards
> >
> > Roland Villemoes
> > Tel: (+45) 22 69 59 62
> > E-Mail: mailto:[email protected]
>
> --
> Peter M. Wolanin, Ph.D.
> Momentum Specialist, Acquia, Inc.
> [email protected]
