Package: www.debian.org
Severity: normal
User: www.debian....@packages.debian.org
Usertags: script news

Dear Webmasters,

tonight I noticed that the RSS feed of the new DPN (2012/16) contained some
comments, so I started looking at the dwn-to-rdf.pl script.

I am not a Perl expert, but I came up with a small workaround that makes the
script ignore lines starting with a '#' character. I did not test the patch
thoroughly, but a run against the same index.wml, comments included, produced
a dwn.en.rdf that looked clean to me.

As suggested in #debian-www by taffit, I removed the comments from the DPN
source file and am now sending the patch here as a bug report.

I'm almost sure that this is not the best solution, but maybe it's a start.

Best regards,
Mark

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-3-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

-- 
  . ''`.  | GPG Public Key  : 0xCD542422 - Download it from http://is.gd/fOa7Vm
 : :'  :  | GPG Fingerprint : 0823 A40D F31B 67A8 5621 AD32 E293 A2EB CD54 2422
 `. `'`   | Powered by Debian GNU/Linux, http://www.debian.org
   `-     | Try not. Do, or do not. There is no try. - Master Yoda, TESB.

Index: dwn-to-rdf.pl
===================================================================
RCS file: /cvs/webwml/webwml/english/News/weekly/dwn-to-rdf.pl,v
retrieving revision 1.19
diff -u -u -r1.19 dwn-to-rdf.pl
--- dwn-to-rdf.pl 16 Apr 2011 23:50:00 -0000 1.19
+++ dwn-to-rdf.pl 21 Aug 2012 21:56:04 -0000
@@ -168,7 +168,9 @@
 while (<F>) {
     # prevent double utf-8 encode by XML::RSS
     $_ = decode_utf8($_) if ($charset eq 'utf-8') ;
-    if (/^<p><strong>(.*)<\/strong>(?:<br \/>)?\s*(.*)/) {
+    if (/^#.*$/) {
+    }
+    elsif (/^<p><strong>(.*)<\/strong>(?:<br \/>)?\s*(.*)/) {
         $headline = $1;
         $body = $2."\n";
         chop ($headline) if ($headline =~ /\.$/);
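
For comparison, the same "ignore lines starting with '#'" idea can also be written with an early `next` instead of an empty `if` branch. The following is only a minimal, standalone sketch: `strip_comment_lines` and the sample input are hypothetical and not part of dwn-to-rdf.pl. Note that `next` skips the whole loop body, so it is only equivalent to the patch if nothing after the if/elsif chain needs to run for comment lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not in dwn-to-rdf.pl): return the given lines
# with every line that starts with '#' dropped.
sub strip_comment_lines {
    my @lines = @_;
    my @kept;
    for my $line (@lines) {
        next if $line =~ /^#/;    # comment line: skip it entirely
        push @kept, $line;
    }
    return @kept;
}

# Made-up sample input resembling a DPN index.wml fragment.
my @sample = (
    "# editor note: double-check this link\n",
    "<p><strong>Some headline.</strong><br />\n",
    "Body text with a # in the middle stays.\n",
);

print for strip_comment_lines(@sample);
```

Only lines whose first character is '#' are dropped; a '#' later in a line is left alone, which matches the `/^#.*$/` test in the patch.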