On Wed, Sep 16, 2009 at 3:03 PM, Carnell, James E <jecarn...@saintfrancis.com> wrote:
>
> I am needing to access the text in hundreds of Microsoft .doc files on an
> Ubuntu OS. I looked at win32, but only saw support for Windows. I am going
> through all of these files to create a fairly simple text delimited file for
> a spreadsheet.
>
> A) Batch convert to text files so I can access them
> B) import some module that allows me to decode this format
> C) Open Office allows batch conversion to .odc, but still don't know how to
> access
> D) Buy a 24 pack, some Twinkies, and go watch David Hasselhoff reruns
>
> Opening .txt documents works fine.
>
> Currently get:
>
> inFile = open("myTestFile.doc", "r")
> testRead = inFile.read()
>
> Traceback (most recent call last):
>   File "<pyshell#11>", line 1, in <module>
>     test = inFile.read()
>   File "/usr/lib/python3.0/io.py", line 1728, in read
>     decoder.decode(self.buffer.read(), final=True))
>   File "/usr/lib/python3.0/io.py", line 1299, in decode
>     output = self.decoder.decode(input, final=final)
>   File "/usr/lib/python3.0/codecs.py", line 300, in decode
>     (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1:
> invalid data
>
> Any help greatly appreciated. Thanks bunches.
>

Ubuntu comes with antiword, a program that does exactly this. I usually use
it through the commands module in Python.

> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>

--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington
--------------------------------------------------------
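
For what it's worth, here is a rough sketch of the antiword approach, not a tested recipe. The traceback shows Python 3.0, where the commands module no longer exists, so this sketch swaps in subprocess instead. It assumes antiword is installed and on the PATH, that it prints the document text to standard output in the locale's encoding, and that Python 3.7 or later is available (for capture_output). The names doc_to_text, docs_to_rows, and write_delimited are made up for illustration.

```python
import csv
import subprocess
from pathlib import Path


def doc_to_text(path, command=("antiword",)):
    """Run an external converter on one file and return what it prints.

    `command` defaults to antiword (assumed to be installed); the file
    path is appended as the last argument.
    """
    result = subprocess.run(
        [*command, str(path)],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output using the locale's encoding
        errors="replace",     # do not crash on the odd undecodable byte
        check=True,           # raise CalledProcessError if the converter fails
    )
    return result.stdout


def docs_to_rows(folder, command=("antiword",)):
    """Yield (file name, extracted text) for every .doc file in a folder."""
    for doc in sorted(Path(folder).glob("*.doc")):
        yield doc.name, doc_to_text(doc, command)


def write_delimited(folder, out_path, command=("antiword",)):
    """Write one tab-delimited row per document: file name, then its text."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        for name, text in docs_to_rows(folder, command):
            # Collapse whitespace so each document stays on a single row.
            writer.writerow([name, " ".join(text.split())])
```

Something like write_delimited("my_docs", "summary.tsv") would then produce a file a spreadsheet can import. Opening the .doc directly with open(..., "r") fails because the file is binary, not UTF-8 text, which is why the conversion is handed to an external program here.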