Package: html2text
Version: 1.3.2a-5
Severity: normal

Nowhere in the man page does it suggest that -width -1 is legal, so why am I
using it?
Well, it seemed to work for me in one application, and saved me from having to
pick a magic number.
After all, html2text didn't report a usage error.
Then came another application, and html2text crashed after glibc printed heap
corruption output.
After reducing the HTML to a minimal file, I see that html2text is trying to
malloc for a width of -1.
On my amd64 system, the -1 has been converted to an unsigned long
(18446744073709551615 in the backtrace below), which is more memory than I
have. The malloc fails, and the Area constructor then writes to address 0x0.
I wonder if html2text should refuse the argument, cap it at a sane value, or
just report the malloc failure in a more revealing way.
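
To illustrate the first two options, here is a minimal sketch of an argument
check. It is not taken from the html2text source: parse_width and kMaxWidth are
hypothetical names, and the cap of 10000 is an arbitrary assumption on my part.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper, not part of html2text: parse a -width argument and
// reject anything that is not a positive integer within an assumed cap.
static const long kMaxWidth = 10000;  // assumed "sane" upper bound

// Returns true and stores the parsed value in *out on success;
// returns false and leaves *out untouched otherwise.
bool parse_width(const char *arg, long *out)
{
  if (arg == 0 || *arg == '\0') return false;     // missing or empty argument
  char *end = 0;
  errno = 0;
  long v = std::strtol(arg, &end, 10);
  if (errno != 0 || *end != '\0') return false;   // out of range or trailing junk
  if (v < 1 || v > kMaxWidth) return false;       // rejects -1 and 0
  *out = v;
  return true;
}
```

A caller could print a usage error when this returns false, rather than
passing the value on to the formatter.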

$ cat /tmp/corpus
<html>
<head>
</head>
<body>
<hr>
</body>
</html>
$ valgrind --db-attach=yes ~/tmp/html2text-1.3.2a/html2text -width -1 file://tmp/corpus
==28439== Memcheck, a memory error detector.
==28439== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==28439== Using LibVEX rev 1854, a library for dynamic binary translation.
==28439== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==28439== Using valgrind-3.3.1-Debian, a dynamic binary instrumentation framework.
==28439== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==28439== For more details, rerun with: -v
==28439== 
==28439== Warning: silly arg (-2) to malloc()
==28439== Invalid write of size 1
==28439==    at 0x41587F: Area::Area(unsigned long, unsigned long, char, char) (Area.C:177)
==28439==    by 0x41EFC8: HorizontalRule::format(unsigned long, int) const (format.C:543)
==28439==    by 0x418367: _ZL6formatPKSt4listI8auto_ptrI7ElementESaIS2_EEmmiRSo (format.C:1386)
==28439==    by 0x41A024: Body::format(unsigned long, unsigned long, int, std::ostream&) const (format.C:201)
==28439==    by 0x41A175: Document::format(unsigned long, unsigned long, int, std::ostream&) const (format.C:159)
==28439==    by 0x403D98: MyParser::process(Document const&) (html2text.C:110)
==28439==    by 0x40E3A4: HTMLParser::yyparse() (HTMLParser.y:275)
==28439==    by 0x403C22: main (html2text.C:378)
==28439==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==28439== 
==28439== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
==28439== starting debugger with cmd: /usr/bin/gdb -nw /proc/28444/fd/1014 28444
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Attaching to program: /proc/28444/fd/1014, process 28444
Reading symbols from /usr/lib/valgrind/amd64-linux/vgpreload_core.so...done.
Loaded symbols for /usr/lib/valgrind/amd64-linux/vgpreload_core.so
Reading symbols from /usr/lib/valgrind/amd64-linux/vgpreload_memcheck.so...done.
Loaded symbols for /usr/lib/valgrind/amd64-linux/vgpreload_memcheck.so
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /usr/lib/debug/libm.so.6...done.
Loaded symbols for /usr/lib/debug/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /usr/lib/debug/libc.so.6...done.
Loaded symbols for /usr/lib/debug/libc.so.6
Reading symbols from /lib/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.7.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x000000000041587f in Area (this=0x5920638, w=18446744073709551615, h=1, c=61 '=', a=0 '\0') at Area.C:177
177         while (p != end) { p->character = c; p->attribute = a; p++; }
(gdb) list
162     {
163     }
164     
165     Area::Area(
166       size_type w /*= 0*/ ,
167       size_type h /*= 0*/ ,
168       char      c /*= ' '*/ ,
169       char      a /*= Cell::NONE*/
170     ) :
171       width_(w),
172       height_(h),
173       cells_(malloc_array(Cell *, h))
174     {
175       for (size_type y = 0; y < h; y++) {
176         Cell *p = cells_[y] = malloc_array(Cell, w), *end = p + w;
177         while (p != end) { p->character = c; p->attribute = a; p++; }
178       }
179     }
180     
181     Area::Area(const char *p) :
182       width_(strlen(p)),
183       height_(1),
184       cells_(malloc_array(Cell *, 1))
185     {
186       cells_[0] = malloc_array(Cell, width_);
187       Cell *q = cells_[0], *end = q + width_;
188       while (q != end) { q->character = *p++; q->attribute = Cell::NONE; q++; }
189     }
190     
191     Area::Area(const string &s) :
(gdb) bt
#0  0x000000000041587f in Area (this=0x5920638, w=18446744073709551615, h=1, c=61 '=', a=0 '\0') at Area.C:177
#1  0x000000000041efc9 in HorizontalRule::format (this=0x591fe00, w=18446744073709551615) at format.C:543
#2  0x0000000000418368 in format (elements=0x591f6b8, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:1386
#3  0x000000000041a025 in Body::format (this=0x591f670, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:201
#4  0x000000000041a176 in Document::format (this=0x591f620, indent_left=0, w=18446744073709551615, halign=0, os=@0x63e400) at format.C:159
#5  0x0000000000403d99 in MyParser::process (this=0x7feffd3d0, document=@0x591f620) at html2text.C:110
#6  0x000000000040e3a5 in HTMLParser::yyparse (this=0x7feffd3d0) at HTMLParser.y:275
#7  0x0000000000403c23 in main (argc=4, argv=0x7feffd5b8) at html2text.C:378
#7  0x0000000000403c23 in main (argc=4, argv=0x7feffd5b8) at html2text.C:378
(gdb) 
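
The "silly arg (-2)" warning is consistent with the byte count wrapping around:
if Cell is two chars (an assumption based on the character and attribute
members in the listing above), then 18446744073709551615 * 2 wraps to
18446744073709551614, which is -2 when read as a signed value. The sketch below
shows that arithmetic and one possible checked allocation. The Cell struct here
is a stand-in, and alloc_cells and wrapped_bytes are hypothetical names; I have
not looked at how malloc_array is actually defined.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-in for html2text's Cell, assumed to hold two chars.
struct Cell { char character; char attribute; };

// Byte count that an unchecked "n * sizeof(Cell)" would hand to malloc.
// Unsigned arithmetic wraps, so a huge n yields a misleading size.
std::size_t wrapped_bytes(std::size_t n)
{
  return n * sizeof(Cell);
}

// Hypothetical checked allocation: throws std::bad_alloc instead of
// returning a null pointer that the caller would then write through.
Cell *alloc_cells(std::size_t n)
{
  const std::size_t max_n = static_cast<std::size_t>(-1) / sizeof(Cell);
  if (n > max_n) throw std::bad_alloc();          // multiplication would wrap
  void *p = std::malloc(n * sizeof(Cell));
  if (p == 0 && n != 0) throw std::bad_alloc();   // real allocation failure
  return static_cast<Cell *>(p);
}
```

With a check like this, the failure would surface at the allocation itself
rather than as an invalid write at Area.C:177.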

-- System Information:
Debian Release: 5.0.8
  APT prefers oldstable
  APT policy: (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages html2text depends on:
ii  libc6                       2.7-18lenny7 GNU C Library: Shared libraries
ii  libgcc1                     1:4.3.2-1.1  GCC support library
ii  libstdc++6                  4.3.2-1.1    The GNU Standard C++ Library v3

html2text recommends no packages.

Versions of packages html2text suggests:
ii  curl                     7.18.2-8lenny4  Get a file from an HTTP, HTTPS or 
ii  wget                     1.11.4-2+lenny2 retrieves files from the web

-- no debconf information


