On Wed, Feb 8, 2012 at 6:09 PM, Marc Tompkins <marc.tompk...@gmail.com> wrote:
> On Wed, Feb 8, 2012 at 5:46 PM, Garry Willgoose <garry.willgo...@newcastle.edu.au> wrote:
>
>> I'm reading a file output by the system utility WMIC in Windows (so I can
>> track CPU usage by process ID) and the text file WMIC outputs seems to have
>> extra characters in it that I've not seen before.
>>
>> I use os.system('WMIC /OUTPUT:c:\cpu.txt PROCESS GET ProcessId') to
>> output the file and parse file c:\cpu.txt
>>
>> The first few lines of the file look like this in Notepad
>>
>> ProcessId
>> 0
>> 4
>> 568
>> 624
>> 648
>>
>>
>> I input the data with the lines
>>
>> infile = open('c:\cpu.txt','r')
>> infile.readline()
>> infile.readline()
>> infile.readline()
>>
>> the readline()s yield the following output
>>
>> '\xff\xfeP\x00r\x00o\x00c\x00e\x00s\x00s\x00I\x00d\x00 \x00 \x00\r\x00\n'
>> '\x000\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'
>> '\x004\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'
>>
>> Now for the first line the title 'ProcessId' is in this string but the
>> individual characters are separated by '\x00' and at least for the first
>> line of the file there is an extra '\xff\xfe'. For subsequent lines it's
>> just '\x00'. Now I can just replace the '\x**' with '' but that seems a bit
>> inelegant. I've tried various options on the open, 'rU' and 'rb', but to no
>> effect.
>>
>> Does anybody know what the rubbish characters are and what has caused
>> them? I'm using the latest Enthought python if that matters.
>>
> You're trying to read a Unicode text file byte-by-byte. It'll end in
> tears...
> The "\xff\xfe" at the beginning is the Byte Order Mark or BOM.
>
> Here's a quick primer on Unicode:
> http://www.joelonsoftware.com/articles/Unicode.html
>
> In particular, this phrase:
> "we decided to do everything internally in UCS-2 (two byte) Unicode, which
> is what Visual Basic, COM, and Windows NT/2000/XP use as their native
> string type."
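
For what it's worth, here is a minimal sketch of reading such a file as text instead of bytes. It assumes the WMIC output is UTF-16 with a BOM, which is what the leading '\xff\xfe' and the interleaved '\x00' bytes suggest. The helper name read_process_ids and the temporary file standing in for c:\cpu.txt are made up for illustration; they are not from the thread.

```python
import io
import os
import tempfile


def read_process_ids(path):
    """Return the integer process IDs found in a UTF-16 text file (hypothetical helper)."""
    # The 'utf-16' codec consumes the BOM and takes the byte order from it,
    # so neither '\xff\xfe' nor the '\x00' padding shows up in the decoded text.
    with io.open(path, 'r', encoding='utf-16') as infile:
        stripped = [line.strip() for line in infile]
    # Keep only purely numeric lines; this drops the 'ProcessId' header and blanks.
    return [int(text) for text in stripped if text.isdigit()]


# Stand-in for c:\cpu.txt so the sketch runs anywhere (assumed sample data).
sample = u"ProcessId  \r\n0          \r\n4          \r\n568        \r\n"
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    # Encoding with 'utf-16' prepends a BOM, mimicking the file described above.
    f.write(sample.encode("utf-16"))

pids = read_process_ids(sample_path)
os.remove(sample_path)
print(pids)
```

io.open accepts an encoding argument on both Python 2.6+ and Python 3, so nothing has to be stripped by hand. On the real file the call would presumably be read_process_ids(r'c:\cpu.txt').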
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor