If you are on Windows, try the Microsoft IFilter API; it supports
current Office versions.
http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en
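
If IFilter is not an option (for example, off Windows), it may help to know that the newer formats (.docx, .xlsx) are ZIP containers holding XML parts. The following is only a minimal sketch of pulling plain text out of a .docx with the Python standard library. It is not how IFilter or Tika work internally. The helper name `docx_text` is made up for illustration, and it reads only the main document part, so headers, footers and footnotes are skipped and text inside text boxes may be repeated.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_text(source):
    """Return the body text of a .docx, one line per paragraph.

    `source` is a file path or a binary file-like object.
    Hypothetical helper: only word/document.xml is read.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    # Each <w:p> is a paragraph; its visible text lives in <w:t> elements.
    for para in root.iter("{%s}p" % W_NS):
        runs = [node.text or "" for node in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The returned string could then be sent to Solr as an ordinary text field. For anything beyond simple documents, a real extraction library is likely the better choice.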
On Tue, Apr 27, 2010 at 6:08 AM, Roland Villemoes wrote:
> Hi All,
>
> Does anyone have a runn
> From: Roland Villemoes
> To: "solr-user@lucene.apache.org"
> Sent: Tue, April 27, 2010 6:08:30 AM
> Subject: Indexing all versions of Microsoft Office Documents
>
> Hi All,
>
> Does anyone have a running solution indexing Microsoft Office
> Documents e.g. .docx .xlsx etc.
Hi All,
Does anyone have a running solution indexing Microsoft Office Documents e.g.
.docx .xlsx etc. ?
I can see a lot of examples using Tika for rich content extraction, but still
nothing when it comes to newer versions of Microsoft Office.
Which libraries should I use, if not Tika?
Kind regards