Package: libspreadsheet-parseexcel-perl
Version: 0.2603-2
Severity: normal

[ Remember that submitters aren't CC'ed when you reply to bug reports. ]

Here's another one: <http://www.imars.com/~fbriere/nulls.xls>

It looks fine if you look at it in gnumeric/OOo, or print its contents in
an ISO-8859-1 terminal:

  perl -MSpreadsheet::ParseExcel -le 'print Spreadsheet::ParseExcel->new->Parse("products.xls")->{Worksheet}[0]{Cells}[0][0]->Value'
  Manteau en cuir 3/4 boutonné à l'avant, avec garnitures aux empiècements avant et arrière, et doublure isolante. Importé.
  Étoff

(Though you may notice the cutoff after 128 characters.)

Here's where it gets interesting:

  perl -MSpreadsheet::ParseExcel -le 'print unpack "H*", Spreadsheet::ParseExcel->new->Parse("products.xls")->{Worksheet}[0]{Cells}[0][0]->Value'
  014d0061006e007400650061007500200065006e0020006300750069007200200033002f003400200062006f00750074006f006e006e00e9002000e00020006c0027006100760061006e0074002c002000610076006500630020006700610072006e006900740075007200650073002000610075007800200065006d0070006900e800630065006d0065006e007400730020006100760061006e00740020006500740020006100720072006900e800720065002c00200065007400200064006f00750062006c007500720065002000690073006f006c0061006e00740065002e00200049006d0070006f0072007400e9002e000a00c90074006f0066006600

(Or better yet, pipe it through hexdump -C for prettier results.)

This is actually the verbatim contents of the file, which I suspect has
to do with the 255-char limitation of Excel95. Or maybe not. But it's
clear that SS:PE:

  a) only read the first 255 characters of that cell
  b) didn't know how to decode the contents

(It also replaced the last character with 0x0a, if you look carefully.)

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.10-deb
Locale: LANG=en_CA, LC_CTYPE=en_CA (charmap=ISO-8859-1)

Versions of packages libspreadsheet-parseexcel-perl depends on:
ii  libole-storage-lite-perl      0.14-2     simple class for OLE document inte
ii  perl                          5.8.4-8    Larry Wall's Practical Extraction

-- no debconf information
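
For reference, the hex dump in this report looks like a single flag byte
(0x01) followed by UTF-16LE text. Below is a minimal sketch of decoding
such a value by hand, under that assumption. decode_raw_cell is a
hypothetical helper, not part of Spreadsheet::ParseExcel, and reading
bit 0 of the first byte as "16-bit characters" is a guess based on the
dump rather than on anything the module documents.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not a Spreadsheet::ParseExcel function).
# Assumption: the first byte is an option-flags byte, and bit 0 set
# means the remaining bytes are 16-bit little-endian characters.
sub decode_raw_cell {
    my ($raw) = @_;
    my $flags = unpack 'C', $raw;      # first byte as an unsigned integer
    my $body  = substr $raw, 1;        # everything after the flag byte
    return ($flags & 0x01)
        ? decode('UTF-16LE',   $body)  # two bytes per character
        : decode('ISO-8859-1', $body); # one byte per character
}

# The first 15 bytes of the dump in this report.
my $sample = pack 'H*', '014d0061006e007400650061007500';

binmode STDOUT, ':encoding(UTF-8)';
print decode_raw_cell($sample), "\n";
```

Feeding it the full value returned by ->Value, instead of $sample, should
give the readable text for this particular cell, again assuming the layout
guessed above.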