(Ted Harding) writes:

> On 22-Jun-98 Luiz Otavio L. Zorzella wrote:
> >
> > Hi, folks.
> >
> > Is there any way to read a Word 7 .doc file on my Linux box?
> > Maintaining the "indents" and "bolds" would be a plus, but mainly I
> > just need to read the text in it.
> >
> > I use StarOffice to read docs, but it only reads up to Word 6 files
> > :^<
>
> A rough-and-ready way to do just what you're asking is to use the "strings"
> command:
>
>   strings wordfile.doc > wordfile.txt
>
"strings" would do a good job for me, but... > and then edit wordfile.txt to clean it up. Raw "strings" will skip sequences > of > fewer than 4 ASCII characters but these are unlikely to occur in a Word > document. This method will suppress all formatting info except end-of-line, > so > you are likely to get long lines (= Word paragraphs). It will also fail to > recognise any non-US-ASCII character codes (above 127) so accented characters > and special symbols, etc, will be missed. But if you simply need to read the > text content of a Word document containing plain English text, then this > method > works fine. ... my text is in portuguese, and does have non-US chars. Is there a way to tell "strings" to accept some non-US chars? Thanks. -- Luiz Otavio L. Zorzella Product Engineer [EMAIL PROTECTED] http://www.conexware.com -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]