If you are on Windows, try the Microsoft IFilter API. It supports
current Office versions.
http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en



On Tue, Apr 27, 2010 at 6:08 AM, Roland Villemoes <r...@alpha-solutions.dk> 
wrote:
> Hi All,
>
> Does anyone have a working solution for indexing Microsoft Office documents,
> e.g. .docx, .xlsx, etc.?
>
> I can see a lot of examples using Tika for rich content extraction, but still
> nothing when it comes to newer versions of Microsoft Office.
> What libraries should I use, if not Tika?
>
> Best regards
>
> Roland Villemoes
> Tel: (+45) 22 69 59 62
> E-Mail: mailto:r...@alpha-solutions.dk
>
> Alpha Solutions A/S
> Borgergade 2, 3.sal, 1300 København K
> Tel: (+45) 70 20 65 38
> Web: http://www.alpha-solutions.dk<http://www.alpha-solutions.dk/>
>
