Hi Chris,

First off, I don't want to dispute that your program is useful. I just
do not think that it is quite ready to be in a stable release, and I
listed some things that I believe to be showstoppers.
BTW: the main use case I remember is the Python 2.x CHM documentation. I
didn't wait for chm2pdf to finish with that because it took too long.

Just a quick comment about the links and the efficiency:
- For the removal of anchors: I would really prefer to have these
  preserved by default, with an option to remove them (the option
  should go in that direction, not the other way round).
  Alternatively, how about bookkeeping in the following way: keep a
  hash (dict or anydbm) mapping linked anchor URLs to the file
  positions where they are linked from,
    unverified links -> [(file, pos1a, pos1b, pos1c),
                         (file2, pos2a, pos2b), ...]
  and a Python set (or dict with foo->None entries) of verified
  anchors. As you go through the files, record each link and anchor
  (throwing out unverified links once you hit the matching anchor).
  After that, you go through the hash and delete the anchor at
  those positions. Because 1) links only get shorter by this and
  2) you can simply leave more whitespace in the
  <a href="  "WHITESPACEHERE>, you only need to overwrite very specific
  locations and don't even have to rewrite the whole file.
- For the actual replacement: just do all the replacements in one go,
  then you don't need any of the repeated passes.
  You can easily do this by defining a function that does the
  link replacement and the accounting above, and then passing it as
  the replacement argument to re.sub.
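
To make the re.sub idea a bit more concrete, here is a rough sketch. It is
not taken from chm2pdf: the names LINK_RE, make_replacer and the rename
mapping are made up for illustration, and the regular expression only
handles double-quoted href attributes. The recorded position is the offset
in the input text, so it would still need adjusting if you want to patch
the written file in place.

```python
import re

# Hypothetical pattern: a double-quoted href with an optional #fragment.
LINK_RE = re.compile(r'href="([^"#]*)(?:#([^"]*))?"')

def make_replacer(filename, rename, verified, unverified):
    """Return a callback for re.sub that rewrites links and keeps the books.

    rename     -- dict mapping old target file names to new ones (assumed)
    verified   -- set of (file, anchor) pairs already seen as real anchors
    unverified -- dict mapping (file, anchor) to a list of (file, offset)
    """
    def replace(match):
        target, fragment = match.group(1), match.group(2)
        target = rename.get(target, target)
        if fragment is None:
            return 'href="%s"' % target
        # An empty target means the link points into the current file.
        key = (target or filename, fragment)
        if key not in verified:
            # Offset refers to the input text, not the rewritten output.
            unverified.setdefault(key, []).append((filename, match.start()))
        # Keep the anchor by default; removal can happen in a later pass.
        return 'href="%s#%s"' % (target, fragment)
    return replace

html = '<a href="b.htm#top">x</a> <a href="c.html#missing">y</a>'
verified = {("b.html", "top")}
unverified = {}
replacer = make_replacer("a.html", {"b.htm": "b.html"}, verified, unverified)
out = LINK_RE.sub(replacer, html)
```

After this single pass, out has the renamed link target and unverified
holds only the link to c.html#missing, together with where it was found.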

For the security issues: the shell escaping when using system() is really
important because you might accidentally overwrite or remove important
stuff when using filenames with spaces or the like (imagine calling
chm2pdf from your home dir on a file "MyInfoAbout Mail andStuff.chm" and
then removing "Mail" because of the spaces).
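
One way to sidestep the escaping problem entirely, as far as I can see, is
to not go through a shell at all and hand the arguments to subprocess as a
list. The sketch below is only an illustration: run_tool is a made-up
helper, and the child process is a stand-in Python one-liner rather than
any of the tools chm2pdf actually calls. It just reports how many arguments
it received.

```python
import subprocess
import sys

def run_tool(argv):
    """Run a command given as a list; no shell ever parses the arguments."""
    return subprocess.check_output(argv)

# Stand-in for an external tool: prints the number of arguments it was given.
CHILD = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]

name = "MyInfoAbout Mail andStuff.chm"

# The file name is passed as one list element, so the spaces stay harmless.
seen = run_tool(CHILD + [name]).decode().strip()
```

Here the child sees exactly one argument, spaces and all. If a shell string
really cannot be avoided, quoting each argument (pipes.quote in Python 2,
shlex.quote in newer Python 3) would be the fallback.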
As for the TMP dir: it is best to use TMP as patched (instead of
/tmp/SOMETHING) unless the user explicitly specifies a different work
dir. Using HOME can be very costly (e.g. when HOME is networked), and it
is not cleaned up (as TMP is on reboot), so it would get cluttered should
your program die for some reason without cleaning up. All other programs
(well, almost; some have the same bug) work in safely created TMP subdirs
unless the user very specifically elects otherwise. Yours should, too.
This may not be of importance to you individually, but when a user
installs 1000 packages, that is something one should be able to rely on
being consistent across all of them.
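
For what I mean by a safely created TMP subdir, a minimal sketch along
these lines should do. make_work_dir and the "chm2pdf-" prefix are my own
invention, not what the patch does. tempfile.mkdtemp honours TMPDIR and
creates a directory that only the calling user can access.

```python
import os
import shutil
import tempfile

def make_work_dir(user_dir=None):
    """Return (path, created_by_us) for the working directory.

    Uses a user-specified directory if one is given, otherwise a fresh
    private subdirectory of the system temp dir.
    """
    if user_dir is not None:
        if not os.path.isdir(user_dir):
            os.makedirs(user_dir)
        return user_dir, False
    return tempfile.mkdtemp(prefix="chm2pdf-"), True

work_dir, ours = make_work_dir()
try:
    # Placeholder for the real work: write an intermediate file.
    with open(os.path.join(work_dir, "page.html"), "w") as handle:
        handle.write("<html></html>")
finally:
    # Only remove directories we created ourselves, never the user's.
    if ours:
        shutil.rmtree(work_dir, ignore_errors=True)
```

The try/finally takes care of the normal exits; if the program is killed
hard, the leftovers at least sit in TMP, where they get cleaned up on
reboot.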

Kind regards

T.
-- 
Thomas Viehmann, http://thomas.viehmann.net/


