On Thu, 1 Jul 2010 14:09:57 +0200 Jean-Francois Dockes <j...@dockes.org> wrote:
> Celejar writes:
> > Recoll currently uses antiword and catdoc for MS Word documents. Antiword
> > claims to only support Word 2003 (according to the website) or lower, and
> > catdoc only to Word 97 (according to the Debian package info). Unoconv claims
> > to be able to support any documents that OO.org supports, so it should cover
> > modern Word formats not covered by the current utilities.
>
> As far as I know, "modern word formats", which I take to be "Open Xml", are
> covered by a native filter (rclopxml), based on xsltproc.
>
> If there are specific issues and problem documents, it would be nice to
> please provide a sample. Testing on Open Xml documents has indeed been
> very minimal, so I would not be very surprised if there are issues.

Okay. I followed the troubleshooting advice from recoll's website, and it
seems that the reason my docx's weren't being indexed was that xsltproc was
not installed, and rclopxml needs it to run. So I guess that we need a
"suggests" on xsltproc?

Celejar
-- 
foffl.sourceforge.net - Feeds OFFLine, an offline RSS/Atom aggregator
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator
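
For anyone hitting the same symptom, the diagnosis above (docx files skipped because xsltproc is not installed) can be checked by looking for the helper on the PATH. The following is only a minimal sketch: the function name missing_helpers and the one-item helper list are illustrative assumptions, not part of recoll, and it does not look at recoll's own configuration or logs.

```python
import shutil


def missing_helpers(helpers, which=shutil.which):
    """Return the names in `helpers` that cannot be found on PATH.

    `which` is injectable so the lookup can be faked in tests; by default
    it is shutil.which, which returns None when a program is not found.
    """
    return [name for name in helpers if which(name) is None]


if __name__ == "__main__":
    # xsltproc is the helper named in this thread; extend the list as needed.
    missing = missing_helpers(["xsltproc"])
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all helpers found")
```

If xsltproc is reported as missing, installing it and re-running the indexer is the fix described in the message above.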