Package: html2text
Version: 1.3.2a-5
Severity: normal

Nowhere in the man page does it suggest that -width -1 is legal, so why
am I using it? Well, it seemed to work for me in one application, and
saved my having to pick a magic number. After all, html2text didn't say
it was a usage error.

Then came another application and html2text crashed, having produced
glibc heap corruption output. After reducing the HTML to a minimal file,
I see that html2text is trying to malloc for a width of -1. On my amd64
system, that's been sign-extended to more memory than I have.

I wonder if html2text should refuse the argument, cap it at a sane value
or just report the malloc failure in a more revealing way.

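
The arithmetic behind that guess can be sketched in a few lines of C++.
This is only an illustration under stated assumptions, not html2text's
actual code: the Cell struct is a two-byte stand-in inferred from the
character and attribute fields, bytes_requested() assumes the allocation
multiplies sizeof(Cell) by the width, and parse_width() with its
max_width limit is a hypothetical way of refusing the argument. With a
two-byte Cell the wrapped request comes out as (size_t)-2, which would
be consistent with the "silly arg (-2) to malloc()" warning in the
valgrind log.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for html2text's Cell: two one-byte fields,
// inferred from the p->character / p->attribute writes at Area.C:177.
struct Cell {
    char character;
    char attribute;
};
static_assert(sizeof(Cell) == 2, "sketch assumes a two-byte Cell");

// What an unchecked int -> size_t conversion does: -1 becomes SIZE_MAX.
std::size_t width_unchecked(int width) {
    return static_cast<std::size_t>(width);
}

// Byte count, ASSUMING the allocation computes sizeof(Cell) * w.
// Unsigned multiplication wraps silently instead of failing.
std::size_t bytes_requested(std::size_t w) {
    return sizeof(Cell) * w;
}

// Hypothetical validator for the -width argument: accepts only a plain
// decimal integer in [1, max_width]; anything else is rejected.
bool parse_width(const char *arg, std::size_t max_width, std::size_t *out) {
    if (arg == nullptr || *arg == '\0') return false;
    errno = 0;
    char *end = nullptr;
    long v = std::strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0') return false;   // overflow or trailing junk
    if (v < 1 || static_cast<unsigned long>(v) > max_width) return false;
    *out = static_cast<std::size_t>(v);
    return true;
}

// Small self-check that can be called from a main() of your own.
void self_check() {
    std::size_t w = width_unchecked(-1);
    assert(w == SIZE_MAX);                      // 18446744073709551615 on amd64
    assert(bytes_requested(w) == SIZE_MAX - 1); // i.e. (size_t)-2

    std::size_t parsed = 0;
    assert(!parse_width("-1", 1000, &parsed));  // refused, not converted
    assert(parse_width("79", 1000, &parsed) && parsed == 79);
}
```

Refusing the value up front is only one of the three options mentioned
above; capping it or checking the malloc result would need changes
elsewhere.
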
$ cat /tmp/corpus
<html>
<head>
</head>
<body>
<hr>
</body>
</html>
$ valgrind --db-attach=yes ~/tmp/html2text-1.3.2a/html2text -width -1 file://tmp/corpus
==28439== Memcheck, a memory error detector.
==28439== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==28439== Using LibVEX rev 1854, a library for dynamic binary translation.
==28439== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==28439== Using valgrind-3.3.1-Debian, a dynamic binary instrumentation framework.
==28439== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==28439== For more details, rerun with: -v
==28439==
==28439== Warning: silly arg (-2) to malloc()
==28439== Invalid write of size 1
==28439==    at 0x41587F: Area::Area(unsigned long, unsigned long, char, char) (Area.C:177)
==28439==    by 0x41EFC8: HorizontalRule::format(unsigned long, int) const (format.C:543)
==28439==    by 0x418367: _ZL6formatPKSt4listI8auto_ptrI7ElementESaIS2_EEmmiRSo (format.C:1386)
==28439==    by 0x41A024: Body::format(unsigned long, unsigned long, int, std::ostream&) const (format.C:201)
==28439==    by 0x41A175: Document::format(unsigned long, unsigned long, int, std::ostream&) const (format.C:159)
==28439==    by 0x403D98: MyParser::process(Document const&) (html2text.C:110)
==28439==    by 0x40E3A4: HTMLParser::yyparse() (HTMLParser.y:275)
==28439==    by 0x403C22: main (html2text.C:378)
==28439==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==28439==
==28439== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
==28439== starting debugger with cmd: /usr/bin/gdb -nw /proc/28444/fd/1014 28444
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Attaching to program: /proc/28444/fd/1014, process 28444
Reading symbols from /usr/lib/valgrind/amd64-linux/vgpreload_core.so...done.
Loaded symbols for /usr/lib/valgrind/amd64-linux/vgpreload_core.so
Reading symbols from /usr/lib/valgrind/amd64-linux/vgpreload_memcheck.so...done.
Loaded symbols for /usr/lib/valgrind/amd64-linux/vgpreload_memcheck.so
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /usr/lib/debug/libm.so.6...done.
Loaded symbols for /usr/lib/debug/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /usr/lib/debug/libc.so.6...done.
Loaded symbols for /usr/lib/debug/libc.so.6
Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x000000000041587f in Area (this=0x5920638, w=18446744073709551615, h=1, c=61 '=', a=0 '\0') at Area.C:177
177         while (p != end) { p->character = c; p->attribute = a; p++; }
(gdb) list
162     {
163     }
164
165     Area::Area(
166       size_type w /*= 0*/ ,
167       size_type h /*= 0*/ ,
168       char c /*= ' '*/ ,
169       char a /*= Cell::NONE*/
170     ) :
171       width_(w),
172       height_(h),
173       cells_(malloc_array(Cell *, h))
174     {
175       for (size_type y = 0; y < h; y++) {
176         Cell *p = cells_[y] = malloc_array(Cell, w), *end = p + w;
177         while (p != end) { p->character = c; p->attribute = a; p++; }
178       }
179     }
180
181     Area::Area(const char *p) :
182       width_(strlen(p)),
183       height_(1),
184       cells_(malloc_array(Cell *, 1))
185     {
186       cells_[0] = malloc_array(Cell, width_);
187       Cell *q = cells_[0], *end = q + width_;
188       while (q != end) { q->character = *p++; q->attribute = Cell::NONE; q++; }
189     }
190
191     Area::Area(const string &s) :
(gdb) bt
#0  0x000000000041587f in Area (this=0x5920638, w=18446744073709551615, h=1, c=61 '=', a=0 '\0') at Area.C:177
#1  0x000000000041efc9 in HorizontalRule::format (this=0x591fe00,
    w=18446744073709551615) at format.C:543
#2  0x0000000000418368 in format (elements=0x591f6b8, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:1386
#3  0x000000000041a025 in Body::format (this=0x591f670, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:201
#4  0x000000000041a176 in Document::format (this=0x591f620, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:159
#5  0x0000000000403d99 in MyParser::process (this=0x7feffd3d0, document=@0x591f620) at html2text.C:110
#6  0x000000000040e3a5 in HTMLParser::yyparse (this=0x7feffd3d0) at HTMLParser.y:275
#7  0x0000000000403c23 in main (argc=4, argv=0x7feffd5b8) at html2text.C:378
(gdb)

-- System Information:
Debian Release: 5.0.8
  APT prefers oldstable
  APT policy: (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages html2text depends on:
ii  libc6       2.7-18lenny7  GNU C Library: Shared libraries
ii  libgcc1     1:4.3.2-1.1   GCC support library
ii  libstdc++6  4.3.2-1.1     The GNU Standard C++ Library v3

html2text recommends no packages.

Versions of packages html2text suggests:
ii  curl  7.18.2-8lenny4   Get a file from an HTTP, HTTPS or
ii  wget  1.11.4-2+lenny2  retrieves files from the web

-- no debconf information